-- ThomasDoherty - 2009-10-26
Using Ganga to submit jobs to the Panda backend
Data preparation reprocessing - using Ganga
1. In a clean shell, setup Ganga.

   source /afs/cern.ch/sw/ganga/install/etc/setup-atlas.sh

2. Setup the athena release.
   NOTE: To set up for any release one must be familiar with using CMT (bootstrap procedures and requirement files) - see here.

   source cmthome/setup.sh -tag=14.5.2.6,32,AtlasProduction

3. Setup any checked out packages you use in your code.
   For example, check out (and compile) the UserAnalysis package as in the "HelloWorld" example here.

   cd $TEST/PhysicsAnalysis/AnalysisCommon/UserAnalysis/cmt
   source setup.sh
4. Go to the run directory and start ganga.

   cd ../run
   ganga

5. Execute your Ganga job script while Ganga is running (an example of what 'pandaBackend_test.py' could look like is given below - have this file in your run directory).

   execfile('pandaBackend_test.py')
6. You can monitor your job's progress by typing jobs inside Ganga or, if you submitted to the Panda backend, via the Panda monitor at http://panda.cern.ch:25880/server/pandamon/query
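   For instance, a quick status check from the Ganga prompt might look like this (the job id 0 is purely illustrative):

   # inside the Ganga prompt (GPI)
   jobs                 # table of all registered jobs and their statuses
   j = jobs(0)          # fetch a job by id - 0 is just an example
   print j.status       # e.g. 'submitted', 'running', 'completed'
   print j.subjobs      # per-subjob statuses after splitting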
7. Once your job has finished you can copy the output data using the dq2 tools.

   source /afs/cern.ch/atlas/offline/external/GRID/ddm/DQ2Clients/setup.sh
   dq2-get "your_dataset_name"
Where "your_dataset_name" is given to you by Ganga once the job completes. And 'pandaBackend_test.py' could look like this (the line numbers are shown only so the notes below can refer to them - they should not appear in the actual file):
   1 j = Job()
   2 j.application = Athena()
   3 j.application.atlas_dbrelease = 'ddo.000001.Atlas.Ideal.DBRelease.v06060101:DBRelease-6.6.1.1.tar.gz'
   4 j.application.option_file = 'Data_jobOptions_cosmic.py'
   ...
   12 j.splitter.numsubjobs = 20
   13 j.submit()
NOTE: Line 3 is overriding the database release to match the one needed to read ESD/DPD. In the case of the spring cosmic reprocessing, the DB release is 6.6.1.1. If the database releases don't match, the jobs fail on the Grid. Line 4 corresponds to your Athena jobOptions. You can use the top job options copied from your UserAnalysis package's share directory.
   cp ../share/AnalysisSkeleton_topOptions.py .
BUT to prepare your code for running on the Grid there are some changes needed in this Athena JO - please add these lines:

______________________________________________________________________________
include("RecExCommission/RecExCommissionFlags_jobOptions.py" ) ATLASCosmicFlags.useLocalCOOL = True | ||||||||
...
from DBReplicaSvcConf import DBReplicaSvc
ServiceMgr += DBReplicaSvc(UseCOOLSQLite=False)
______________________________________________________________________________
Also remember to remove (or comment out) the input data line and change the geometry tag and the conditions DB tag to match those used in the reprocessing cycle (see the details for each reprocessing campaign here). For example:

   globalflags.ConditionsTag.set_Value_and_Lock('COMCOND-REPC-002-13')
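A minimal sketch of what those edits might look like in the top JO (the geometry tag value and the input line shown here are illustrative assumptions - take the correct tags for your cycle from the reprocessing page):

   # comment out any local input file list - the Grid job gets its input
   # from the dataset defined in the Ganga job, not from the JO
   #ServiceMgr.EventSelector.InputCollections = [ 'my_local_file.root' ]

   # lock the tags used by the reprocessing cycle (values are examples only)
   globalflags.DetDescrVersion.set_Value_and_Lock('ATLAS-GEO-03-00-00')
   globalflags.ConditionsTag.set_Value_and_Lock('COMCOND-REPC-002-13')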
The Athena JO's used for this specific example (Data_jobOptions_cosmic.py and Settings_DepletionDepth.py) can be found here.
Back to the Ganga JO script:

Line 5 is set to False because we have already compiled the packages locally. Line 6 tells Ganga to tar your user area and send it with the job. Line 10 specifies the backend to which you are sending your job. There are three options: LCG, Panda and NorduGrid. I chose Panda because my data existed only in BNLPANDA, a site in the US cloud. Line 12 corresponds to the number of subjobs you want to split your job into. Finally, in Line 13 you submit your job.
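Since lines 5-11 are elided in the listing above, here is a hypothetical reconstruction of the full script, pieced together from these notes; the dataset name and the exact splitter are assumptions, not the original file:

   # pandaBackend_test.py - illustrative sketch, not the original script
   j = Job()
   j.application = Athena()
   j.application.atlas_dbrelease = 'ddo.000001.Atlas.Ideal.DBRelease.v06060101:DBRelease-6.6.1.1.tar.gz'
   j.application.option_file = 'Data_jobOptions_cosmic.py'
   j.application.athena_compile = False   # line 5: packages already compiled locally
   j.application.prepare()                # line 6: tar the user area and send it with the job
   j.inputdata = DQ2Dataset()
   j.inputdata.dataset = 'your.input.dataset.name/'   # placeholder
   j.outputdata = DQ2OutputDataset()
   j.backend = Panda()                    # line 10: submit to the Panda backend
   j.splitter = DQ2JobSplitter()          # assumed splitter for DQ2 input data
   j.splitter.numsubjobs = 20
   j.submit()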