Grid Services
Jobs can be submitted to grid resources using the ARC tools, which are available in CVMFS. Our colleagues in Durham have written a good introductory tutorial; a summary of the steps required to submit and manage jobs, adapted for Glasgow users, is given below.
ARC tools
The tools required for grid job submission and management are available from CVMFS:
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
alias setupATLAS='source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh'
setupATLAS
lsetup emi
If you plan to submit jobs to the ScotGrid VO, at present you must also amend the X509_VOMSES environment variable as follows:
export X509_VOMSES=/etc/vomses
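As a quick sanity check after the setup above, the following sketch reports any ARC client commands that are not on your PATH. The function name check_arc_tools is hypothetical (it is not part of ARC); the tool names passed to it are the ones used later on this page.

```shell
# Report any of the named commands that cannot be found on the PATH.
# check_arc_tools is a hypothetical helper, not something ARC provides.
check_arc_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    # Returns 0 if every tool was found, 1 otherwise.
    return "$missing"
}

# Usage (after running the setup commands above):
#   check_arc_tools arcproxy arcsub arcstat arcget && echo "ARC tools found"
```

If any tool is reported missing, re-run the setup commands in the same shell session before continuing.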
Certificates and proxies
To use grid resources, you will need a certificate, from which you can generate a proxy certificate. The proxy certificate has a relatively short lifetime and is what is actually used to submit jobs. A proxy is associated with a particular Virtual Organisation (VO), for example vo.scotgrid.ac.uk, which is selected when the proxy is created. You can generate a proxy using the arcproxy command:
$ arcproxy -S <VO_ALIAS> -N
For example, to generate a proxy for the vo.scotgrid.ac.uk VO:
$ arcproxy -S vo.scotgrid.ac.uk -N
Your identity: /C=UK/O=eScience/OU=Glasgow/L=Compserv/CN=bugs bunny
Contacting VOMS server (named vo.scotgrid.ac.uk): voms.gridpp.ac.uk on port: 15509
Proxy generation succeeded
Your proxy is valid until: 2018-01-19 23:36:59
Job description (xRSL)
Before submitting a job, you need to create a file which describes the features of the job for ARC (its executable, the names of input and output files, what to do with logs, etc.). This file is written in the Extended Resource Specification Language (xRSL). A simple job description which runs a script called test.sh could look like this:
&
(executable = "test.sh")
(arguments = "")
(jobName = "TestJob")
(stdout = "stdout")
(stderr = "stderr")
(gmlog = "gmlog")
(walltime = "60")
A full description of xRSL can be found in the ARC reference manual:
http://www.nordugrid.org/documents/xrsl.pdf
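To try the job description shown above, you need both the xRSL file and the script it refers to. The sketch below creates both in the current directory. The contents of test.sh are an assumption made for illustration (any executable script will do), and the file name test.xrsl is simply the name used in the submission examples on this page.

```shell
# Create a trivial payload script. The contents are hypothetical:
# it just prints the node name and the start time.
cat > test.sh <<'EOF'
#!/bin/sh
echo "Running on $(uname -n)"
echo "Started at $(date)"
EOF
chmod +x test.sh

# Write the simple job description shown above to test.xrsl.
cat > test.xrsl <<'EOF'
&
(executable = "test.sh")
(arguments = "")
(jobName = "TestJob")
(stdout = "stdout")
(stderr = "stderr")
(gmlog = "gmlog")
(walltime = "60")
EOF
```

Running ./test.sh locally before submitting is a cheap way to catch script errors that would otherwise only show up in the job's stderr file.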
Submitting a job
Jobs are submitted to a Compute Element (CE). The ScotGrid site at Glasgow has four CEs:
ce01.gla.scotgrid.ac.uk
ce02.gla.scotgrid.ac.uk
ce03.gla.scotgrid.ac.uk
ce04.gla.scotgrid.ac.uk
It does not matter which CE you choose to submit to. (If you've looked at the tutorial linked above, you'll see that Durham gave their CEs the sensible names ce1, ce2, etc. We thought that would be too easy.)
Jobs are submitted using the arcsub command:
$ arcsub -c <CE_HOSTNAME> <XRSL_FILENAME>
For example, to submit test.xrsl to ce03 at Glasgow:
$ arcsub -c ce03.gla.scotgrid.ac.uk test.xrsl
Job submitted with jobid: gsiftp://ce03.gla.scotgrid.ac.uk:2811/jobs/NCKLDmEQkwrnZ4eC5pmRAbBiTBFKDmABFKDmpMFKDmABFKDmQffBxy
When a job is submitted successfully, you will be presented with its job ID, which can be used to refer to the job later. Information about submitted jobs is also recorded in a job list file; by default, this file is ~/.arc/jobs.dat (~/.arc/jobs.xml with some earlier versions of ARC), but you can choose a different location by supplying the -j argument to arcsub:
$ arcsub -j <JOBLIST_FILENAME> -c <CE_HOSTNAME> <XRSL_FILENAME>
For example:
$ arcsub -j test.dat -c ce03.gla.scotgrid.ac.uk test.xrsl
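Since any of the four CEs will do, and the job ID is needed by later commands, a small wrapper can be convenient. The sketch below is only an illustration: pick_ce and extract_jobid are hypothetical helper names, and extract_jobid assumes that arcsub prints the "Job submitted with jobid:" line in exactly the form shown in the example output above.

```shell
# Pick one of the four Glasgow CEs at random (hypothetical helper).
# Note that srand() seeds from the clock, so two calls within the same
# second return the same CE.
pick_ce() {
    n=$(awk 'BEGIN { srand(); print int(rand() * 4) + 1 }')
    echo "ce0${n}.gla.scotgrid.ac.uk"
}

# Read arcsub output on stdin and print only the job ID.
# Assumes a line of the form: Job submitted with jobid: <JOB_ID>
extract_jobid() {
    sed -n 's/^Job submitted with jobid: *//p'
}

# Usage (requires a valid proxy and the ARC tools):
#   jobid=$(arcsub -c "$(pick_ce)" test.xrsl | extract_jobid)
#   echo "$jobid"
```

If the job ID is lost, it is still recorded in the job list file described above.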
Querying the status of a job
You can obtain information about the status of jobs using the arcstat command:
$ arcstat <JOB_ID>
For example, to obtain information about the job submitted in the previous step:
$ arcstat gsiftp://ce03.gla.scotgrid.ac.uk:2811/jobs/NCKLDmEQkwrnZ4eC5pmRAbBiTBFKDmABFKDmpMFKDmABFKDmQffBxy
Job: gsiftp://ce03.gla.scotgrid.ac.uk:2811/jobs/NCKLDmEQkwrnZ4eC5pmRAbBiTBFKDmABFKDmpMFKDmABFKDmQffBxy
Name: TestJob
State: Queuing
Status of 1 jobs was queried, 1 jobs returned information
You may have to wait a few minutes after submitting a job before status information becomes available.
You can also query the status of all jobs in a job list file:
$ arcstat -j <JOBLIST_FILENAME>
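Rather than re-running arcstat by hand, you can poll until the job reaches a final state. The following is a minimal sketch, not a definitive implementation: extract_state and wait_for_job are hypothetical helper names, the "State:" line is assumed to look like the example output above, and the list of final state names (Finished, Failed, Killed, Deleted) is an assumption that you should check against the output of your ARC version.

```shell
# Read arcstat output on stdin and print the value of the "State:" line.
extract_state() {
    sed -n 's/^[[:space:]]*State:[[:space:]]*//p'
}

# Poll a job until it reaches a final state, then print that state.
#   $1 = job ID, $2 = seconds between polls (default 60)
# ASSUMPTION: the state names matched below are the final ones.
wait_for_job() {
    jobid=$1
    interval=${2:-60}
    while :; do
        state=$(arcstat "$jobid" 2>/dev/null | extract_state)
        case "$state" in
            Finished*|Failed*|Killed*|Deleted*)
                echo "$state"
                return 0
                ;;
        esac
        sleep "$interval"
    done
}

# Usage:
#   wait_for_job "<JOB_ID>" 120
```

A polling interval of a minute or more is a reasonable choice here, since status information can take a few minutes to appear in any case.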
Retrieving job output
Output and log files for a job can be retrieved using the arcget command. As when querying the status of a job, you can use either a job ID or a job list file with this command:
$ arcget <JOB_ID>
$ arcget -j <JOBLIST_FILENAME>
For example, to get the output of the job submitted above:
$ arcget gsiftp://ce03.gla.scotgrid.ac.uk:2811/jobs/NCKLDmEQkwrnZ4eC5pmRAbBiTBFKDmABFKDmpMFKDmABFKDmQffBxy
Results stored at: p6vLDmj3kwrnZ4eC3pmXXsQmABFKDmABFKDm9pFKDmABFKDmtVM1wm
Jobs processed: 1, successfully retrieved: 1, successfully cleaned: 1
You will only be able to retrieve job output once the job has finished.
Copying input and output files ("staging")
You can tell ARC to copy input and output files to and from the compute element by including additional attributes in your xRSL file:
&
(executable = "test.sh")
(arguments = "")
(jobName = "TestJob")
(inputFiles = ("input.dat" ""))
(outputFiles = ("output.txt" "")
("results.tgz" "")
)
(stdout = "stdout")
(stderr = "stderr")
(gmlog = "gmlog")
(walltime = "60")
Files named in the executable, stdout and stderr attributes are transferred automatically, but other files should be listed in the inputFiles or outputFiles attribute as necessary. The inputFiles and outputFiles attributes each take one or more values like this:
("<FILENAME>" "<URL>")
Where <URL> is left blank, ARC transfers the file to or from the submission machine (this is the case for input.dat, output.txt and results.tgz in the example xRSL above). Alternatively, a URL may be provided to copy the file to or from a remote resource:
("index.html" "http://www.example.org/index.html")
("rabbits.zip" "ftp://ftp.example.org/rabbits.zip")
("values.dat" "gsiftp://gridftp.example.org/data/values.dat")
Various protocols are supported, including Rucio and SRM, and details can be found in the ARC reference manual:
http://www.nordugrid.org/documents/xrsl.pdf
Because files follow a slightly convoluted path from the submission machine through the CE to the compute node, it is easiest to avoid using paths when specifying outputFiles. Instead, if files are created in subdirectories, it may be simpler to copy them back to $HOME at the end of the script (within a job, $HOME is a working directory belonging to the job, and is not related to your home directory). You may also wish to add multiple output files or directories to an archive, to further simplify retrieving results.
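The archiving approach can be sketched as a small function called at the end of the job script. This is an illustration under stated assumptions: package_results is a hypothetical helper name, the directory name results is made up for the example, and the archive name results.tgz is chosen to match the outputFiles entry in the example xRSL above.

```shell
# Bundle a subdirectory into one archive placed directly in the job's
# working directory, so that outputFiles can name it without a path.
#   $1 = directory to archive (e.g. "results", a hypothetical name)
#   $2 = archive file name (default: results.tgz)
#   $3 = destination directory (default: $HOME, the job's working directory)
package_results() {
    results_dir=$1
    archive=${2:-results.tgz}
    dest=${3:-$HOME}
    tar -czf "$dest/$archive" "$results_dir"
}

# Usage at the end of test.sh, after the payload has written into ./results:
#   package_results results
```

After arcget has retrieved the job, the archive can be unpacked on the submission machine with tar -xzf results.tgz.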