FAQs
Various FAQ and other links can be found on the GGUS FAQ page. This page covers a few topics which are hard to find in other places.
- How does an RB decide where to send jobs?
- How can I target jobs at particular sites?
- How can I decide which RB to use?
- How can a job locate scratch space?
- What can a job expect to be installed on a WN?
- How can I renew a proxy used by an automated process?
How does an RB decide where to send jobs?
A popular misconception is that the Resource Broker has a built-in algorithm to decide where to send jobs. In fact the job distribution algorithm is determined by the user via the Requirements and Rank expressions (and also any data requirements) in the JDL. All CEs are filtered against the Requirements, and the Rank is calculated for all the CEs which match. The job will go to the CE with the maximum Rank, with the following caveats:
- If several CEs have the same maximum Rank, the job is sent randomly to one of them. Thus specifying "Rank = 1" will send jobs randomly to all matching CEs.
- The JDL has a "fuzzy rank" option; see the manual for details. In this case the randomisation is extended over more of the highest-ranking CEs.
- Some extra Requirements are added automatically: the CE must allow access to the selected VO, and the batch system behind the CE must be accepting and running jobs.
- VOs can also add their own restrictions to the sites considered for matching via the Freedom of Choice for Resources tool.
The edg-job-list-match command has a --rank option which will show the calculated Rank value for each matching CE. Bear in mind however that the Rank is dynamic and may change on a short timescale.
If the JDL file does not contain a Rank expression, a default expression is used. This is taken from the configuration file on the UI, typically stored in /opt/edg/etc/edg_wl_ui_cmd_var.conf. The standard default expression is "Rank = - other.GlueCEStateEstimatedResponseTime;", which is intended to run the job at the CE which will start executing it soonest. Unfortunately it is very difficult in practice to calculate a good value for this, and in the past the algorithms used have had some pathologies which have sometimes resulted in a very bad distribution of jobs. LCG is in the process of introducing a better algorithm.
Note that even with a good calculation the EstimatedResponseTime will be 0 for all CEs which are able to run jobs immediately, in which case jobs will simply be distributed randomly across them. Also note that there is a finite time (up to a few minutes) for the information system to update.
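The selection procedure described above can be sketched in a few lines of Python. This is only an illustration of the filter, rank and random tie-break steps, not the broker's actual code; the CE records, their field names and the choose_ce function are made up for the example.

```python
import random

def choose_ce(ces, requirements, rank, rng=random):
    """Filter CEs by the requirements, rank the matches, break ties randomly."""
    matching = [ce for ce in ces if requirements(ce)]
    if not matching:
        return None  # no CE satisfies the Requirements
    ranked = [(rank(ce), ce) for ce in matching]
    best = max(value for value, _ in ranked)
    top = [ce for value, ce in ranked if value == best]
    return rng.choice(top)  # random choice among equal maximum Rank

# Hypothetical CE records; the values are invented for illustration.
ces = [
    {"id": "ce-a", "vo_allowed": True, "EstimatedResponseTime": 0},
    {"id": "ce-b", "vo_allowed": True, "EstimatedResponseTime": 0},
    {"id": "ce-c", "vo_allowed": True, "EstimatedResponseTime": 600},
    {"id": "ce-d", "vo_allowed": False, "EstimatedResponseTime": 0},
]

def requirements(ce):
    # Stand-in for the automatically added "CE must allow the VO" requirement.
    return ce["vo_allowed"]

def default_rank(ce):
    # Mirrors the default "Rank = - other.GlueCEStateEstimatedResponseTime".
    return -ce["EstimatedResponseTime"]

def constant_rank(ce):
    # Mirrors "Rank = 1": every matching CE ties for the maximum.
    return 1

chosen = choose_ce(ces, requirements, default_rank)
print(chosen["id"])  # one of the CEs with zero estimated response time
```

With default_rank the job goes to ce-a or ce-b at random, since both report a response time of 0; with constant_rank any of the three matching CEs may be chosen.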
How can I target jobs at particular sites?
The edg-job-submit command has an option -r which can be used to send a job directly to a particular CE (specified by a CEID, which is the long string printed by edg-job-list-match). However, note that this bypasses the broker completely, which among other things means that the BrokerInfo file will be missing.
A CEID can also be specified as a Requirement in the JDL, and this can also be used with a wildcard specification, e.g. to restrict jobs to sites in the UK use:
Requirements = RegExp(".ac.uk", other.GlueCEUniqueId);
For a more long-term solution the best way is to have sites advertise a suitable RunTimeEnvironment string, and use that as a Requirement. For example, some GridPP sites are publishing an RTE of "GRIDPP", which can be matched with:
Requirements = Member("GRIDPP",other.GlueHostApplicationSoftwareRunTimeEnvironment);
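Note that in the regular expression above the dots match any character and the pattern is not tied to the end of the host name. The Python sketch below shows the difference a stricter pattern would make. It assumes that the JDL RegExp function matches anywhere in the string, as Python's re.search does; the CEIDs are hypothetical, and the extra escaping that a backslash would need inside a JDL string is not covered here.

```python
import re

# Hypothetical CEIDs in the usual host:port/jobmanager-queue form.
ceids = [
    "ce01.example.ac.uk:2119/jobmanager-lcgpbs-dteam",
    "ce02.example.org:2119/jobmanager-lcgpbs-dteam",
    "ce03.example.ac.uk.example.org:2119/jobmanager-pbs-dteam",
]

# Pattern as written in the JDL example: each dot matches any character.
loose = re.compile(r".ac.uk")
# Stricter variant: literal dots, and ".ac.uk" must end the host name.
strict = re.compile(r"\.ac\.uk:")

loose_matches = [c for c in ceids if loose.search(c)]
strict_matches = [c for c in ceids if strict.search(c)]

print(loose_matches)
print(strict_matches)
```

The loose pattern accepts both the first and third CEIDs, while the strict one accepts only the first.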
How can I decide which RB to use?
The RB used for job submission is defined in a VO-dependent configuration file, which is normally called /opt/edg/etc/dteam/edg_wl_ui.conf (replace "dteam" with the name of your VO). This file can be overridden with the --config-vo option to the job submission commands if you want to supply your own version (or define an alternative file in the EDG_WL_UI_CONFIG_VO variable).
The relevant lines in the file look like:
NSAddresses = "lcgrb01.gridpp.rl.ac.uk:7772";
LBAddresses = "lcgrb01.gridpp.rl.ac.uk:9000";
There are two lines because the RB consists of two components, a Network Server to receive the JDL and a Logging & Bookkeeping server to manage the job logging, but these are normally on the same machine. The machine named above is a production RB at RAL, which can be used as a general-purpose broker by anyone in GridPP (and indeed elsewhere).
It is possible to specify multiple RBs in the config file. This follows the usual classad syntax as used in the JDL, e.g.
NSAddresses = { "lcgrb01.gridpp.rl.ac.uk:7772", "lcgrb02.gridpp.rl.ac.uk:7772" };
LBAddresses = { { "lcgrb01.gridpp.rl.ac.uk:9000" }, { "lcgrb02.gridpp.rl.ac.uk:9000" } };
The reason for the doubled curly brackets in the LB specification is that this is in fact a list of lists: each NS in the first line can be matched against a list of LBs in the second line. However, usually you should just match them in pairs.
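Because the nested brackets and quotes are easy to get wrong by hand, the two lines can be generated. The following is a minimal sketch, assuming each NS is paired with a single LB on the same host and that the ports are the 7772 and 9000 used above; broker_config is a hypothetical helper, not part of the UI tools.

```python
def broker_config(hosts, ns_port=7772, lb_port=9000):
    """Build NSAddresses/LBAddresses lines, pairing each NS with the LB on its host."""
    if not hosts:
        raise ValueError("at least one broker host is required")
    # One quoted "host:port" entry per Network Server.
    ns = ", ".join(f'"{h}:{ns_port}"' for h in hosts)
    # One single-element inner list per NS, giving the list-of-lists form.
    lb = ", ".join(f'{{ "{h}:{lb_port}" }}' for h in hosts)
    return f"NSAddresses = {{ {ns} }};\nLBAddresses = {{ {lb} }};"

print(broker_config(["lcgrb01.gridpp.rl.ac.uk", "lcgrb02.gridpp.rl.ac.uk"]))
```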
Having a list allows some degree of failover: if one fails, another will be tried. However, there are some caveats:
- The failover algorithm is not totally effective; sometimes the submission can fail part of the way into the transaction with the NS and then be aborted without a retry.
- There is a second non-VO-specific configuration file, usually /opt/edg/etc/edg_wl_ui_cmd_var.conf, which can be overridden with the --config option to the relevant commands. This sometimes also contains a broker name in the LoggingDestination parameter, which can cause problems if the specified machine is down. This can safely be removed from the file as the default is normally suitable.
- Once a job is submitted you are committed to the NS and LB chosen for that job; there is currently no failover of the RB itself. Hence if a broker fails, jobs it manages will at least be unmanageable until it recovers, and may be lost completely if the failure is sufficiently bad.
- If you are using a MyProxy server for proxy renewal with long-lived jobs, you must choose brokers which are trusted by the MyProxy server. Usually these are in the same domain, for example the RAL MyProxy lcgrbp01.gridpp.rl.ac.uk trusts the RAL brokers named above. Unfortunately, the JDL currently allows only one MyProxy server to be specified, which limits the brokers you can use.
The final question is how to know which brokers to use. Unfortunately there is no completely definite answer to this. The RAL brokers above should be usable by anyone in GridPP and are a useful default, but may not always be available so a backup is useful. Some VOs may have specific RBs allocated to their users, in which case you should consult their documentation or ask for help within the VO. Otherwise, the CIC portal can give you a list of resources: follow the instructions to search for production Grid services for your VO, and then look in the "RB" column. This will generally give you a long list, and in principle you should be able to use any of these brokers. However, in general it's better to use brokers which are in some sense local, e.g. UK Atlas members should look for brokers in the UK or at CERN for routine use. The next best choice would be brokers at major centres, e.g. CNAF, DESY or NIKHEF.
How can a job locate scratch space?
The POSIX standard defines an environment variable TMPDIR which should point to a suitable scratch area - there is no definition for how large this should be but it's likely to be at least several GB, typically on a disk local to the Worker Node. However, there is no guarantee that this variable will be set at every site. If it is not set the job should write into whatever working directory is defined at the start of execution. /tmp should not be used except for small files. No area, including the home directory of the account under which the job runs, should be assumed to be persistent between jobs; for persistent files use a Storage Element.
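A job written in Python could apply this rule roughly as follows. This is a sketch rather than a standard recipe: scratch_dir is a hypothetical helper, and the check that TMPDIR actually exists and is writable is an extra precaution not required by the text above.

```python
import os

def scratch_dir(environ=os.environ, cwd=None):
    """Return TMPDIR if it is set and usable, otherwise the working directory."""
    tmpdir = environ.get("TMPDIR")
    # Extra precaution: only trust TMPDIR if it is an existing, writable directory.
    if tmpdir and os.path.isdir(tmpdir) and os.access(tmpdir, os.W_OK):
        return tmpdir
    # Fall back to the working directory defined at the start of execution.
    return cwd if cwd is not None else os.getcwd()

print(scratch_dir())
```

The function should be called once at the start of the job, before any change of directory, so that the fallback really is the initial working directory.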
What can a job expect to be installed on a WN?
In general terms a job will find a "standard" environment for whatever OS runs on the WN; currently this is generally Scientific Linux or compatible. However, there is no particular guarantee as to what exactly will be installed; some sites install essentially the full Linux distribution by default, others only a minimal set. For mainstream use VOs should validate sites before use, publishing a tag which can be selected in the JDL so that jobs only go to suitable sites.
If sites are found to be missing some components which are required there are basically three possible approaches. The simplest is to ignore the site and send jobs elsewhere. Secondly, GGUS tickets can be opened to request that the site installs the missing package(s). Thirdly, the VO can incorporate the missing component(s) into its own VO-specific software.
How can I renew a proxy used by an automated process?
The security rules say that you should not create a proxy with a lifetime greater than 24 hours, but this may not be sufficient if you have an automated process which e.g. needs to run over a weekend. The solution to this is to store a long-lived proxy in a MyProxy server and regularly retrieve a short-lived proxy from it, e.g. with a cron job. This is a very simple process, but the usage is somewhat different to the standard use of MyProxy with proxy renewal in the WMS. The page on using MyProxy gives a summary of the basic principles.
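The logic a cron-driven script might use to decide when to fetch a new short-lived proxy can be sketched as below. The actual retrieval from the MyProxy server is deliberately left out: fetch_proxy is a hypothetical callable supplied by the caller, and the two-hour safety margin is an arbitrary assumption.

```python
from datetime import datetime, timedelta

def renew_if_needed(expires_at, now, fetch_proxy, margin=timedelta(hours=2)):
    """Return the expiry time of a usable proxy, renewing only when near expiry.

    fetch_proxy is a hypothetical callable that retrieves a fresh short-lived
    proxy from the MyProxy server and returns its new expiry time.
    """
    if expires_at - now > margin:
        return expires_at  # current proxy still has enough lifetime left
    return fetch_proxy()   # close to expiry (or expired): get a new one

# Example run with a stand-in for the real retrieval step.
now = datetime(2008, 12, 24, 12, 0)

def fake_fetch():
    # Pretend the server handed back a proxy valid for 12 hours.
    return now + timedelta(hours=12)

print(renew_if_needed(now + timedelta(hours=1), now, fake_fetch))
```

Running such a check from cron more often than the margin ensures the automated process never sees an expired proxy, provided the long-lived proxy stored on the server is itself still valid.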
Last modified Wed 24 December 2008