Archive for the ‘VPAC’ Category

VPAC Users Mailing List

Monday, July 28th, 2008

We now have a VPAC users mailing list up and running for people who want a forum for informal discussions about the clusters and software here at VPAC. If you are a VPAC user please feel free to subscribe and join in the conversation!

Grid Systems Administrator/Programmer wanted

Monday, July 28th, 2008

We are looking for someone to join the systems team here to work on grid computing, including things like Shibboleth, SSL certificates, Globus and the like.

VPAC software updates

Wednesday, July 16th, 2008

After too long a break on the blog, here’s a quick update on software packages that have arrived at VPAC over the past few months, and some information on how to access them.

A sample of the new and updated packages:

  • Gaussian 03 with Linda (Tango)
  • Gaussian 03 SMP, single node only (Edda)
  • Amber 10 (Tango)
  • MATLAB Distributed Computing Server (Tango)
  • GAMESS-US (Tango)
  • MrBayes (Tango, already on Edda)
  • Dock v6 (Tango and Edda)
  • Schrodinger (Tango)
  • MOPAC 2007 (Tango)
  • AMD Core Maths Libraries - ACML (Tango)
  • Wine 1.0 and 1.1.1 (Tango)

We are introducing access controls on licensed commercial software to ensure that users are aware of their rights and obligations when using such packages. To get access to such a package, just log in to the VPAC website at www.vpac.org with your cluster user name and password and click the “Add Software” button.

New MPI on Tango - please recompile any MPI code you have built!

Thursday, January 10th, 2008

Due to scaling problems we found in the default version of MPI on Tango (MVAPICH2), jobs larger than 64 processors would not start, so we have had to replace it.

It has been replaced by OpenMPI v1.2.4, the successor to the older LAM-MPI, which has full Infiniband support and gives much better debugging information should a job fail.

We have today switched the default MPI over to this version, so please, if you use MPI programs on Tango, recompile them!

You can see which versions of OpenMPI are available with the command:

module avail openmpi

The current default is for the Portland Group compiler, but should you wish to pick one that uses GCC or the Intel compilers you will need to do:

module load openmpi/1.2.4-gcc

or

module load openmpi/1.2.4-intel

Remember that you will need to insert the same statement into your PBS script before launching your job!
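
As a rough sketch of the recompile step, the function below rebuilds a C program with mpicc, the compiler wrapper that OpenMPI provides. The function name rebuild_mpi and the file names are hypothetical, and it assumes an openmpi module has already been loaded in the current shell.

```shell
#!/bin/sh
# rebuild_mpi SOURCE OUTPUT
# Recompile an MPI program with whichever OpenMPI module is currently loaded.
# Returns 2 if mpicc is not on the PATH (i.e. no openmpi module is loaded).
rebuild_mpi() {
    src=$1
    out=$2
    if ! command -v mpicc >/dev/null 2>&1; then
        echo "mpicc not found; load an openmpi module first" >&2
        return 2
    fi
    # mpicc adds the MPI include and library flags for the loaded module.
    mpicc -o "$out" "$src"
}

# Example, after "module load openmpi/1.2.4-gcc" (file names are placeholders):
#   rebuild_mpi my-program.c my-program
```

Fortran and C++ codes would use the corresponding OpenMPI wrappers (such as mpif90 or mpicxx) in the same way.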

To launch an OpenMPI process from a PBS script just do:

mpiexec ./my-program [arguments]
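
Putting the module and mpiexec steps together, a job script might look like the following sketch. The job name, resource request and the file name tango-job.pbs are illustrative assumptions rather than VPAC defaults; only the module load and mpiexec lines come from the instructions above.

```shell
#!/bin/sh
# Write a minimal PBS job script to tango-job.pbs (hypothetical file name).
cat > tango-job.pbs <<'EOF'
#!/bin/bash
# Job name, node count and walltime are placeholder values; adjust to suit.
#PBS -N my-mpi-job
#PBS -l nodes=2:ppn=4
#PBS -l walltime=01:00:00

# Start in the directory the job was submitted from.
cd "$PBS_O_WORKDIR"

# Load the same OpenMPI module the program was compiled with.
module load openmpi/1.2.4-gcc

# Launch the program as in the example above.
mpiexec ./my-program
EOF

echo "Wrote tango-job.pbs; submit it with: qsub tango-job.pbs"
```

The module named in the job script should match the one used at compile time, otherwise the program may be launched with a different MPI build from the one it was linked against.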

Tango’s final form unveiled

Monday, October 29th, 2007

Now that the air conditioning, UPS and initial build of Tango are done, and the machine is up and humming along, we can reveal that at the end of this year Tango is scheduled to be upgraded to provide even more capacity for our users!

Tango will be upgraded by Xenon Systems to have the new AMD “Barcelona” quad core processors, running at 2.3GHz, which can deliver up to twice the floating point operations per cycle that the current dual core Opterons can. This will mean that Tango will have a total of 736 cores for our users!

We will also be scaling up the RAM and hard disk capacity in each node. RAM will be increased to 32GB (to stay at 4GB per core) and the hard disks will be doubled to a total of 4 x 360GB drives.

VPAC machine room air conditioning work

Monday, October 1st, 2007

We are in the closing stages of the machine room works here at VPAC that are upgrading our facilities to cope with the new large machine, Tango.

On Tuesday 2nd October Tango will not be running jobs, to allow the air conditioning engineers to re-plumb our existing air conditioning units. We will be down by one A/C unit through most of the day as they work through the existing 3 units in turn.

Jobs will be permitted back on the system as soon as the work is complete, and once the new A/C unit is operational we will be able to bring the final rack of Tango online!

UPDATE: This work is now complete and jobs are running on Tango again. The new A/C unit is waiting on some more electrical work before it is commissioned.

Tango now open for early adopters

Friday, August 31st, 2007

We have now opened our new AMD Opteron cluster, Tango, for early adopters to experiment with - please feel free to compile code and run short jobs on it.

We can only bring up 2 racks of the machine at present until the new air conditioning unit is installed in early September, so at the moment there are just 230 CPUs available. Another 130 will be added once that work has been done.

At the end of the year the cluster will be upgraded to be in excess of 500 processors!

Please note:

  • Compute nodes may be shut down at any time if problems are found.
  • The head node will need to be shut down briefly on Monday for some new hardware.
  • Jobs are limited to 1 day whilst in experimental mode.
  • This is a 64-bit system, so please recompile your code!
  • We have lifted the maximum number of CPUs a single user can use to 128.

Tango has:

  • Portland Group compilers - pgcc, pgCC, pgf77, pgf90.
  • MVAPICH2 for MPI (using Portland Group compilers & Infiniband Interconnect).
  • mpiexec set up to start MVAPICH2 jobs by default.

If you find *any* problems with using Tango please please please let us know ASAP - email help (at) vpac.org with as much as you can tell us about how something went wrong.

We are still in the process of installing software & libraries; if you find that something you need is missing, please let us know so we can prioritise it.

The Brecca Tango swap

Friday, August 17th, 2007

Over the past two days we’ve uncabled Brecca & moved it out of the way ready for it to be removed to its new home at Monash, and have moved two of the Tango racks into its place. Xenon Systems are currently finishing off the power cabling for Tango and we should be able to start commissioning the new cluster in the next week.

Here is a photo of those two Tango racks, sitting where Brecca once stood.

Tango stands in Brecca’s former location.

New UPS in service

Thursday, August 16th, 2007

Today Chloride Hydride commissioned the new UPS that was delivered on Tuesday and so the VPAC machine room is again running on UPS power.

Chloride Hydride 80-Net 120kVA UPS at VPAC.

The unit is rated at 120kVA, with a battery life of around 20 minutes at full load.

Edda status update - jobs running again!

Tuesday, August 14th, 2007

After a very long, and very busy, day we have Edda up and running again.

We are a few nodes down due to various issues, but currently there are 41 compute nodes available giving a total of 164 CPUs for jobs.

Don’t forget, some nodes are reserved for jobs of 8 hours or less (between 8am-8pm) and one node is reserved for test jobs of less than 15 minutes (between those same times).