Care to share? Grid computing on a general-purpose cluster

Mon 19 Jul 2010

Edinburgh Compute and Data Facility (ECDF) is a large local computing resource for hundreds of researchers at Edinburgh University, engaged in pursuits from across the academic spectrum, ranging from analysing brain scans and understanding mental illness to exploring the dynamics of complex chemical systems. This diverse user base brings a broad range of requirements that need to work happily together, and GridPP are one of the more challenging users.

Across its facilities, ECDF uses a fair-share system that allows all users to take advantage of the significant resources available. These currently include around 1500 CPU cores, but the site has recently installed the first of two upgrades, each set to double the facility’s computing power. This is all supported by a large amount of high-performance Fibre Channel disk storage using the resilient General Parallel File System, which is itself backed up by several hundred terabytes of bulk storage.

For most grid sites in the world, there is an expectation that the whole site is dedicated to grid use and that the same team manages both the grid-specific middleware and the cluster. For ECDF this is not the case, and the differences from the common setup lead to many technical and logistical challenges. These have been overcome by building relations between Physics and e-Science researchers and the systems team who manage the facility. The team at Edinburgh University have devised solutions that provide a consistent grid service, and the lessons learnt, together with developments in grid middleware, mean that other sites can also share resources that would otherwise be limited to local users.

The advantages? Well, when university researchers were relaxing over Christmas, the computing cluster was not idle: more than half of it churned away with simulation for the ATLAS LHC experiment and other grid work. Similarly, when there’s a lack of grid jobs, no resources are wasted, as the local users eat up the slack. Furthermore, there’s a greater economy of scale in sharing expertise, and potential access to new technologies yet to be fully exploited by the Grid. So with more of these large shared facilities around, and increasing data from the LHC and elsewhere to process – let’s get sharing!

For more information contact:
wbhimji@staffmail.ed.ac.uk
awashbro@staffmail.ed.ac.uk
Or see here:
http://www2.ph.ed.ac.uk/particle-physics-experiment/
http://www.wiki.ed.ac.uk/display/ecdfwiki/Eddie+and+the+ECDF


© Copyright GridPP
If you wish to reproduce this piece please credit GridPP and contact Neasan O'Neill to say you are using it