gqsub: Lining up new grid users

Tue 6 Oct 2009

Stuart Purdie of the University of Glasgow (and GridPP) was runner-up in the poster session at this year's EGEE conference. His work updates the popular qsub command for a grid-enabled world, making it easier for users to migrate from local batch systems to global grid resources.

Grid computing is seen as the domain of big science: giant experiments and massive number crunching. What if you're not working on these huge scales? What if your work is too big for a local cluster but "too small for the grid"? Welcome to "mesoscale" computing, the middle ground between applications that use 10 minutes on your desktop and those that use 100 hours on the grid.

Researchers who fall into this "mesoscale" area are usually well looked after by local resources. However, for various reasons, perhaps high demand from other users or a need to investigate something out of the ordinary, a user may want to move on to the grid. This move can be tricky when the user is faced with another interface to learn or a completely different way of getting their work done.

So up steps the team at the University of Glasgow with gqsub, a grid interface based on the qsub command. qsub is found on many *nix systems and is used by a lot of scientists to submit jobs to their local batch system. Creating a grid-enabled version makes it easier to encourage new users to try out the grid.

It all started when the electrical engineers at the University of Glasgow began trying to use the grid alongside their local cluster; the two interfaces were just too dissimilar for them. So they asked Stuart Purdie of the GridPP team at Glasgow about getting direct access to the Glasgow grid cluster while bypassing the grid. Their preferred method of accessing these resources? qsub.

Giving direct access was not feasible, so Stuart started looking around the lesser-known parts of gLite. He found features that can be used to give the illusion of a shared filesystem, just like in a batch system. This led him to look at qsub in greater detail. Despite its age (it was designed over two decades ago), qsub included the ability to talk to remote servers and the right components for describing a grid-type system. He now had two halves of a solution; he just had to weld them together.

Nothing runs that easily, however, and there were a few hurdles to overcome. The major problem was that while qsub understood distributed resources, it didn't understand them being decentralised. This, coupled with the need to integrate digital certificates, led to a lot of reading specs, thinking and coding.

The result was a set of Python scripts that act as a translation layer between the user and the gLite software. Stuart was a little taken aback by the interest in his work: "At the conference, the first email I got about it was on the Monday morning, before the conference had even really begun. The ultimate intent is to provide something that can be installed on an existing cluster, so that the users can use both local and Grid systems with as small as possible mismatch between them".
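To give a rough feel for what such a translation layer involves, here is a minimal sketch in Python. It is not taken from gqsub: the function names are hypothetical, and it assumes only that the job script carries standard `#PBS -o` and `#PBS -e` directives and that the target is a handful of common gLite JDL attributes. The real tool also has to deal with certificates, file staging and job tracking, none of which is shown here.

```python
import re


def parse_pbs_directives(script_text):
    """Collect '#PBS -X value' directives from a qsub-style job script.

    Hypothetical helper for illustration; only single-letter options
    followed by one value are recognised.
    """
    directives = {}
    for line in script_text.splitlines():
        match = re.match(r"#PBS\s+-(\w)\s+(\S+)", line)
        if match:
            directives[match.group(1)] = match.group(2)
    return directives


def to_jdl(script_name, directives):
    """Build a small JDL description for the job (illustrative only)."""
    # Fall back to assumed default file names when no directive is given.
    stdout = directives.get("o", "job.out")
    stderr = directives.get("e", "job.err")
    lines = [
        'Executable = "/bin/sh";',
        f'Arguments = "{script_name}";',
        f'StdOutput = "{stdout}";',
        f'StdError = "{stderr}";',
        # The sandboxes list the files shipped to and from the worker node.
        f'InputSandbox = {{"{script_name}"}};',
        f'OutputSandbox = {{"{stdout}", "{stderr}"}};',
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    example = "#!/bin/sh\n#PBS -o result.out\n#PBS -e result.err\necho hello\n"
    print(to_jdl("job.sh", parse_pbs_directives(example)))
```

The point of the sketch is the shape of the problem: the user keeps writing a familiar batch script, and the layer in between rewrites its options into the vocabulary the grid middleware expects.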

"gqsub was only in the ideas stage a few months ago. The current prototype, implemented over the summer, provides enough functionality for those users who find the transition to the Grid daunting. It's a great example of how to recognise a problem and implement a technical solution to the benefit of everyone in EGEE in a short period of time. Congratulations Stuart!" said Tony Doyle, GridPP Technical Director.

The software has a proven track record: the electrical engineers at Glasgow have been using it for a couple of months now. It is also freely available for use. There is still work to be done, including improved results collection for users who may not be online all the time and greater control over a job once it has been submitted to the grid.

You can find more information on gqsub and download it here:
http://www.scotgrid.ac.uk/gqsub/

Stuart's poster on gqsub can be found here:
http://indico.cern.ch/materialDisplay.py?contribId=162&sessionId=137&materialId=poster&confId=55893


© Copyright GridPP
If you wish to reproduce this piece please credit GridPP and contact Neasan O'Neill to say you are using it