Difference: RunningGangaWithPanda (12 vs. 13)

Revision 13, 2010-06-11 - NickEdwards

 -- ThomasDoherty - 2009-10-26

Using Ganga to submit jobs to the Panda backend on lxplus

 

6. Execute your Ganga job script while Ganga is running (an example of what 'pandaBackend_test.py' might look like is shown below; have this file in your run directory) and type:

    execfile('pandaBackend_test.py')
  or simply run ganga from the command line with the name of the Ganga job options file appended:
 
For the LCG() backend you might also need
j.outputdata.outputdata=['AnalysisSkeleton.aan.root']
which should exactly match the output file name from your jobs.
 
Also for the LCG() backend, a site can be specified:
j.backend.requirements.other=['other.GlueSiteUniqueID=="UKI-SCOTGRID-GLASGOW"']
 
To submit to the UK cloud, add:
j.backend.requirements.cloud='UK'
 
</pre>
 
NOTE: Line 3 is an example of overriding a database release to match the one needed to read ESD/DPD. In the case of the spring cosmic reprocessing, the DB release is 6.6.1.1. If the database releases don't match, the jobs fail on the Grid (remove this line if it is not necessary). Line 4 corresponds to your Athena jobOptions. Line 5 is set to False because we have already compiled the packages locally; if you want your job to compile your checked-out code before submitting, simply change this to True. Line 6 tells Ganga to tar your user area and send it with the job. Line 10 specifies the backend to which you are sending your job. There are three options: LCG, Panda and NorduGrid. In the example above, Panda was chosen because the data existed only in BNLPANDA, a site in the US cloud. Line 12 corresponds to the number of subjobs you want to split your job into. Finally, in Line 13 you submit your job.
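The job script itself is not reproduced in this diff. A sketch consistent with the numbered lines described above is given here; the attribute names come from the GangaAtlas Athena/Panda plugins, and all dataset, file and release names are placeholders, not the original values:

```python
j = Job()
j.application = Athena()
j.application.atlas_dbrelease = 'DBRelease-6.6.1.1'        # line 3: override the DB release (placeholder value)
j.application.option_file = ['MyAnalysis_jobOptions.py']   # line 4: your Athena jobOptions (placeholder name)
j.application.athena_compile = False                       # line 5: packages already compiled locally
j.application.prepare()                                    # line 6: tar the user area and send it with the job
j.inputdata = DQ2Dataset()
j.inputdata.dataset = 'some.input.dataset'                 # placeholder input dataset
j.outputdata = DQ2OutputDataset()
j.backend = Panda()                                        # line 10: LCG(), Panda() or NorduGrid()
j.splitter = DQ2JobSplitter()
j.splitter.numsubjobs = 10                                 # line 12: number of subjobs
j.submit()                                                 # line 13: submit the job
```

This is a configuration fragment for an interactive Ganga session, not a standalone Python program.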
 
The Ganga output looks something like this (note the output is a dataset: Output dataset user09.chriscollins.ganga.2.20091210):
run% ganga pandaBackend_test.py
  * Welcome to Ganga * Version: Ganga-5-4-3
 Ganga.GPIDev.Lib.Job : INFO job 2 status changed to "submitted"
Helpful commands inside Ganga:
jobs - lists your jobs
jobs(1) - lists the content of job 1
help() - goes into help mode (quit to leave help)
j=jobs(1) and j.kill() - will kill job 1
 
Your output will be in the dq2-registered dataset. For me this was user09.chriscollins.ganga.2.20091210. Again, this is available from jobs(x).
 

Running an Executable on the Panda Backend

 j.submit()
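Only the tail of the script survives in this excerpt. A minimal sketch of an Executable-type job sent to the Panda backend might look like the following; the script name is a placeholder, and the Executable application and Panda backend classes are standard Ganga objects:

```python
j = Job()
j.application = Executable()
j.application.exe = File('myscript.sh')  # placeholder: your own executable script
j.backend = Panda()                      # send the job to the Panda backend
j.submit()
```

As above, this is a fragment for an interactive Ganga session rather than a standalone program.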

_________________________________________________________________________


Reusing the environment

 
If you're running the same code over multiple datasets, you can save time by telling jobs after the first to reuse the same environment. This means that Panda doesn't have to run a new `build' job every time, which can make things much faster. This works as long as you don't recompile anything in your environment between jobs; you can still change the jobOptions.


First you need to find the name of the previous library. Say we want to reuse the library built in Ganga job 323:


Either, in Ganga:

In [19]:jobs(323).backend.libds

Out[19]: user.nedwards.20100610.323.lib


or use dq2:

$ dq2-ls %nedwards%323*

user.nedwards.20100610.323

user.nedwards.20100610.323.lib

user.nedwards.20100610.323_sub07797849


The library is user.nedwards.20100610.323.lib. In subsequent job options you can set:

 
j.backend.libds = 'user.nedwards.20100610.323.lib'
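If you do this lookup for many jobs, a small helper in plain Python (using the dataset names from the dq2-ls listing above) can pick the library dataset out of a listing — the function name is my own, not part of Ganga or dq2:

```python
def find_libds(datasets):
    """Return the build-library dataset (the name ending in '.lib'), or None."""
    for name in datasets:
        if name.endswith('.lib'):
            return name
    return None

# Names as returned by: dq2-ls %nedwards%323*
listing = [
    'user.nedwards.20100610.323',
    'user.nedwards.20100610.323.lib',
    'user.nedwards.20100610.323_sub07797849',
]
print(find_libds(listing))  # prints user.nedwards.20100610.323.lib
```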
 