Archive for the ‘Clusters’ Category

VPAC software updates

Wednesday, July 16th, 2008

After too long a break on the blog, here’s a quick update on software packages that have arrived at VPAC over the past few months, and some information on how to access them.

Some of the new and updated packages are:

  • Gaussian 03 with Linda (Tango)
  • Gaussian 03 SMP, single node only (Edda)
  • Amber 10 (Tango)
  • MATLAB Distributed Computing Server (Tango)
  • GAMESS-US (Tango)
  • MrBayes (Tango, already on Edda)
  • Dock v6 (Tango and Edda)
  • Schrodinger (Tango)
  • MOPAC 2007 (Tango)
  • AMD Core Math Library - ACML (Tango)
  • Wine 1.0 and 1.1.1 (Tango)

We are introducing controls on licensed commercial software to ensure that users are aware of their rights and obligations when using such packages. To get access to such a package, log in to the VPAC website at www.vpac.org with your cluster user name and password and click the “Add Software” button.

Tango’s final form unveiled

Monday, October 29th, 2007

Now that the air conditioning, UPS and initial build of Tango are done, and the machine is up and humming along, we can reveal that at the end of this year Tango is scheduled to be upgraded to provide even more capacity for our users!

Tango will be upgraded by Xenon Systems to the new AMD “Barcelona” quad-core processors, running at 2.3GHz, which can deliver up to twice as many floating point operations per cycle as the current dual-core Opterons. This will give Tango a total of 736 cores for our users!

We will also be scaling up the RAM and hard disk capacity in each node. RAM will be increased to 32GB (to stay at 4GB per core) and the hard disks will be doubled to a total of 4 x 360GB drives.
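As a quick sanity check on those figures, here is a small sketch that derives the per-node numbers from what is quoted above. The cores-per-node and node-count values are inferred from the arithmetic, assuming every node is configured identically; they are not figures stated in this post.

```python
# Figures quoted in the post.
TOTAL_CORES = 736
RAM_PER_NODE_GB = 32
RAM_PER_CORE_GB = 4
DRIVES_PER_NODE = 4
DRIVE_SIZE_GB = 360

# Derived values (assumption: all nodes are identical).
cores_per_node = RAM_PER_NODE_GB // RAM_PER_CORE_GB   # 32GB at 4GB per core
node_count = TOTAL_CORES // cores_per_node            # implied number of nodes
disk_per_node_gb = DRIVES_PER_NODE * DRIVE_SIZE_GB    # raw disk per node
total_ram_gb = node_count * RAM_PER_NODE_GB           # implied cluster-wide RAM

print(f"cores per node:   {cores_per_node}")
print(f"implied nodes:    {node_count}")
print(f"disk per node:    {disk_per_node_gb} GB")
print(f"implied total RAM: {total_ram_gb} GB")
```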

VPAC machine room air conditioning work

Monday, October 1st, 2007

We are in the closing stages of the machine room works here at VPAC that are upgrading our facilities to cope with the new large machine, Tango.

On Tuesday 2nd October Tango will not be running jobs, to allow the air conditioning engineers to re-plumb our existing air conditioning units. We will be down by one A/C unit through most of the day as they work through the existing 3 units.

Jobs will be permitted back on the system as soon as the work is complete, and once the new A/C unit is operational we will be able to bring the final rack of Tango online!

UPDATE: This work is now complete and jobs are running on Tango again. The new A/C unit is waiting on some more electrical work before it is commissioned.

Tango now open for early adopters

Friday, August 31st, 2007

We have now opened our new AMD Opteron cluster, Tango, for early adopters to experiment with - please feel free to compile code and run short jobs on it.

We can only bring up 2 racks of the machine until the new air conditioning unit is installed in early September, so at the moment there are just 230 CPUs available. Another 130 will be added once that work has been done.

At the end of the year the cluster will be upgraded to be in excess of 500 processors!

Please note:

  • Compute nodes may be shut down at any time if problems are found.
  • The head node will need to be shut down briefly on Monday for some new hardware.
  • Jobs are limited to 1 day whilst in experimental mode.
  • This is a 64-bit system, so please recompile your code!
  • We have lifted the maximum number of CPUs a single user can use to 128.

Tango has:

  • Portland Group compilers - pgcc, pgCC, pgf77, pgf90.
  • MVAPICH2 for MPI (using Portland Group compilers & InfiniBand interconnect).
  • mpiexec set up to start MVAPICH2 jobs by default.

If you find *any* problems with using Tango please please please let us know ASAP - email help (at) vpac.org with as much as you can tell us about how something went wrong.

We are still in the process of installing software & libraries. If something you need is missing, please let us know so we can prioritise it.

The Brecca Tango swap

Friday, August 17th, 2007

Over the past two days we’ve uncabled Brecca & moved it out of the way ready for it to be removed to its new home at Monash, and have moved two of the Tango racks into its place. Xenon Systems are currently finishing off the power cabling for Tango and we should be able to start commissioning the new cluster in the next week.

Here is a photo of those two Tango racks, sitting where Brecca once stood.

Tango stands in Brecca’s former location.

Edda status update - jobs running again!

Tuesday, August 14th, 2007

After a very long, and very busy, day we have Edda up and running again.

We are a few nodes down due to various issues, but currently there are 41 compute nodes available giving a total of 164 CPUs for jobs.

Don’t forget, some nodes are reserved for jobs of 8 hours or less (between 8am and 8pm) and one node is reserved for test jobs of less than 15 minutes (between those same times).
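To make the reservation rules concrete, here is a rough sketch of how you could check which reserved nodes a job would qualify for. It is only an illustration of the policy as described above, not the scheduler's actual configuration: the function name, the pool labels "short" and "test", and the treatment of exactly 8pm as outside the window are all assumptions.

```python
from datetime import time, timedelta

# Window and limits as described in the post.
RESERVATION_START = time(8, 0)            # 8am
RESERVATION_END = time(20, 0)             # 8pm (assumed exclusive)
SHORT_JOB_LIMIT = timedelta(hours=8)      # "8 hours or less"
TEST_JOB_LIMIT = timedelta(minutes=15)    # "less than 15 minutes"


def eligible_reservations(walltime, now):
    """Return the (hypothetical) reserved pools a job qualifies for at `now`."""
    if not (RESERVATION_START <= now < RESERVATION_END):
        return []  # outside 8am-8pm the reservations are not in force
    pools = []
    if walltime <= SHORT_JOB_LIMIT:
        pools.append("short")
    if walltime < TEST_JOB_LIMIT:
        pools.append("test")
    return pools


# Example: a 10 minute test job submitted at 9am.
print(eligible_reservations(timedelta(minutes=10), time(9, 0)))
```

A job asking for more than 8 hours gets an empty list during the day, which simply means it has to wait for one of the unreserved nodes.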

Edda status update - login node up

Tuesday, August 14th, 2007

We have successfully reinstalled the login node for Edda, so you can now access edda.vpac.org and queue jobs.

We are reinstalling compute nodes now. Once we are happy with the state of them we will start jobs running on them!

Thank you for your patience on this.

Edda status update

Tuesday, August 14th, 2007

We are now able to talk to the hardware management device from the cluster management system.

This means we can finish off updating the nodes’ firmware and migrating them to the new authentication system and storage.

Edda status update

Monday, August 13th, 2007

Sadly Edda is still down due to an incompatibility between the firmware on the management device and the cluster management software. We are downloading the previous version of the firmware for the management device and will downgrade to that version tomorrow morning. We hope that this will let us power up the Edda nodes again.

Apologies for the inconvenience!

Wexstan back online

Monday, August 13th, 2007

Wexstan is now back online - we are working on some internal issues with Edda at the moment.