-- ThomasDoherty - 2009-10-26

Using Ganga to submit jobs to the Panda backend

References:

Full Ganga Atlas Tutorial

Data preparation reprocessing - using Ganga

1. In a clean shell, set up Ganga.

  source /afs/cern.ch/sw/ganga/install/etc/setup-atlas.sh

2. Set up the Athena release.

NOTE: To set up any release you must be familiar with CMT (bootstrap procedures and requirements files); see here for more information. In this example, once your requirements file is set up and a directory for release 14.5.2.6 exists in your test area (for reprocessing, see the reference page above for the required release), run:

  source ~/cmthome/setup.sh -tag=14.5.2.6,32,AtlasProduction

3. Set up any checked-out packages you use in your code.

For this example, check out (and compile) the UserAnalysis package as in the "HelloWorld" example here or here.

  cd $TESTAREA/14.5.2.6/PhysicsAnalysis/AnalysisCommon/UserAnalysis/cmt
  source setup.sh

4. Go to the run directory and start Ganga.

  cd ../run
  ganga

5. With Ganga running, execute your Ganga job script. The script 'pandaBackend_test.py' must be in your run directory (an example is given below). Type:

  execfile('pandaBackend_test.py')

6. You can monitor your job's progress by typing jobs inside Ganga or, if you submitted to the Panda backend, at http://panda.cern.ch:25880/server/pandamon/query.

7. Once your job has finished you can copy the output data using the dq2 tools.

  source /afs/cern.ch/atlas/offline/external/GRID/ddm/DQ2Clients/setup.sh
  dq2-get "your_dataset_name"
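Steps 6 and 7 can be tied together with a small helper. The sketch below is plain Python and calls neither Ganga nor dq2. The function names `summarise_statuses`, `all_completed` and `dq2_get_command` are hypothetical, introduced here for illustration, and the status strings (`completed`, `running`, `failed`) are assumed to be the ones your Ganga version reports.

```python
def summarise_statuses(statuses):
    """Count how many subjobs are in each status."""
    counts = {}
    for status in statuses:
        counts[status] = counts.get(status, 0) + 1
    return counts


def all_completed(statuses):
    """True only if there is at least one subjob and every one is 'completed'."""
    statuses = list(statuses)
    return len(statuses) > 0 and all(s == "completed" for s in statuses)


def dq2_get_command(dataset_name):
    """Return the dq2-get call from step 7 as an argument list."""
    if not dataset_name:
        raise ValueError("dataset name must not be empty")
    return ["dq2-get", dataset_name]


# Example with made-up statuses for four subjobs (not real job output).
example = ["completed", "completed", "running", "failed"]
summary = summarise_statuses(example)
ready = all_completed(example)
```

Inside a Ganga session the status list could probably be built with something like `[sj.status for sj in j.subjobs]`; treat that as an assumption to check against your Ganga version. The argument list returned by `dq2_get_command` is only useful in a shell where the DQ2 setup script from step 7 has already been sourced.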

Here "your_dataset_name" is the dataset name that Ganga gives you once the job completes. The script 'pandaBackend_test.py' could look like this (the line numbers are only for reference in the notes below and are not part of the file):

1    j = Job()
2    j.application = Athena()
3    j.application.atlas_dbrelease = 'ddo.000001.Atlas.Ideal.DBRelease.v06060101:DBRelease-6.6.1.1.tar.gz'
4    j.application.option_file = 'AnalysisSkeleton_topOptions.py'
5    j.application.athena_compile = False
6    j.application.prepare()
7    j.inputdata = DQ2Dataset()
8    j.inputdata.dataset = "data08_cos.00092051.physics_IDCosmic.recon.ESD.o4_r653/"
9    j.outputdata = DQ2OutputDataset()
10   j.backend = Panda()
11   j.splitter = DQ2JobSplitter()
12   j.splitter.numsubjobs = 20
13   j.submit()

NOTE: Line 3 is an example of overriding the database release to match the one needed to read ESD/DPD. For the spring cosmic reprocessing, the DB release is 6.6.1.1. If the database releases do not match, the jobs fail on the Grid (remove this line if it is not needed). Line 4 gives your Athena jobOptions. You can use the top job options copied from the share directory of your UserAnalysis package:

cp ../share/AnalysisSkeleton_topOptions.py .

To prepare your code for running on the Grid, however, this Athena JO needs some changes. Please add these lines:

______________________________________________________________________________

include("RecExCommission/RecExCommissionFlags_jobOptions.py")
ATLASCosmicFlags.useLocalCOOL = True
# Set up DBReplicaSvc to choose the closest Oracle replica (configurables style)
from AthenaCommon.AppMgr import ServiceMgr
from PoolSvc.PoolSvcConf import PoolSvc
ServiceMgr += PoolSvc(SortReplicas=True)
from DBReplicaSvc.DBReplicaSvcConf import DBReplicaSvc
ServiceMgr += DBReplicaSvc(UseCOOLSQLite=False)

_____________________________________________________________________________

Also remember to remove (or comment out) the input data line. If you are running a reprocessing job, change the geometry tag and the conditions DB tag to match those used in the reprocessing cycle (see the details for each reprocessing campaign on this page here). For example:

globalflags.ConditionsTag.set_Value_and_Lock('COMCOND-REPC-002-13')
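Commenting out the input data line can be done by hand, but a small text helper makes the edit repeatable. The sketch below is plain Python that works on the job options as a string. The helper name `comment_out` is hypothetical, and the default marker `InputCollections` is an assumption about how the input data line looks in your copy of the job options; check the file and adjust the marker if needed.

```python
def comment_out(text, marker="InputCollections"):
    """Prefix '#' to every uncommented line that contains `marker`."""
    out = []
    for line in text.splitlines():
        if marker in line and not line.lstrip().startswith("#"):
            line = "#" + line
        out.append(line)
    return "\n".join(out)


# Made-up two-line job options fragment, used only to exercise the helper.
sample = (
    'ServiceMgr.EventSelector.InputCollections = ["AOD.pool.root"]\n'
    "theApp.EvtMax = 10"
)
patched = comment_out(sample)
```

The helper only touches lines that contain the marker, so an input list that spans several lines would still need its continuation lines commented out by hand. Running it twice is harmless because lines that are already commented are skipped.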

Back to the Ganga JO script:

Line 5 is set to False because the packages have already been compiled locally.
Line 6 tells Ganga to tar your user area and send it with the job.
Line 10 specifies the backend to which the job is sent. There are three options: LCG, Panda and NorduGrid. In the example above Panda was chosen because the data existed only at BNLPANDA, a site in the US cloud.
Line 12 sets the number of subjobs into which the job is split.
Finally, Line 13 submits the job.
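Since a mismatched database release makes the jobs fail on the Grid, it can help to sanity-check the Line 3 string before submitting. In the example the dataset name ends in v06060101 and the tarball is DBRelease-6.6.1.1, which reads as the same version written two ways. The sketch below assumes that this two-digits-per-field pattern holds in general, which this page does not state. The function name `dbrelease_versions` is hypothetical.

```python
import re


def dbrelease_versions(spec):
    """Split 'dataset:tarball' and return (dataset version, tarball version)."""
    dataset, tarball = spec.split(":", 1)
    # Assumed pattern: dataset name ends in '.v' plus four two-digit fields.
    ds_match = re.search(r"\.v(\d{2})(\d{2})(\d{2})(\d{2})$", dataset)
    # Assumed pattern: tarball is 'DBRelease-<dotted version>.tar.gz'.
    tb_match = re.search(r"DBRelease-(\d+(?:\.\d+)*)\.tar\.gz$", tarball)
    if ds_match is None or tb_match is None:
        raise ValueError("unrecognised DB release string: %r" % spec)
    ds_version = ".".join(str(int(part)) for part in ds_match.groups())
    return ds_version, tb_match.group(1)


# The string used in Line 3 of the example script.
spec = "ddo.000001.Atlas.Ideal.DBRelease.v06060101:DBRelease-6.6.1.1.tar.gz"
ds_version, tar_version = dbrelease_versions(spec)
consistent = ds_version == tar_version
```

This only checks that the two halves of the string agree with each other. Whether 6.6.1.1 is the right release for your data still has to come from the reprocessing reference page.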

Topic revision: r5 - 2009-10-29 - ThomasDoherty
