This TWiki documents the Run 2 H->bb analysis. It also collects useful notes on the code.
The new data format for Run 2 is the xAOD, an AOD that can be read directly in ROOT. The following commands set up a release and run checkSG.py on an xAOD file:
setupATLAS
asetup here,19.1.X.Y-VAL,rel_3,AtlasDerivation
checkSG.py xAOD.pool.root
Shower deconstruction is a jet substructure method that reclusters fat jets into small-R subjets to discriminate between signal and background. It takes these reclustered subjets and calculates how similar they are to a signal model. To do so it considers every possible permutation when assigning each subjet to, in our case, the two b-jets, the underlying event, ISR and FSR (this list may be incomplete), and sums over all permutations. It then does the same for the background model. The ratio of these two values is called χ and is the variable used to cut or train on. (References still to be added.)
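Written schematically, the ratio described above takes the following form. This is only a sketch of the structure, not the exact definition used in the analysis code: the symbols $\{p\}_N$ (the set of N subjet four-momenta) and $P(\cdot \mid \cdot)$ (the weight of one assignment under a given model) are notation introduced here for illustration.

```latex
\chi\left(\{p\}_N\right) =
  \frac{\sum_{\text{permutations}} P\left(\{p\}_N \mid \text{signal}\right)}
       {\sum_{\text{permutations}} P\left(\{p\}_N \mid \text{background}\right)}
```

Large values of the ratio indicate a jet that looks more signal-like than background-like.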
https://twiki.cern.ch/twiki/bin/view/AtlasProtected/EventLoop
The VH TWiki can be found here:
https://twiki.cern.ch/twiki/bin/view/AtlasProtected/CxAODFramework
To check out the packages, first create a directory to keep the framework in. I will refer to this as $MyAnalysisDir. Go to $MyAnalysisDir and do the following:
svn co svn+ssh://svn.cern.ch/reps/atlasphys/Physics/Higgs/HSG5/software/VHAnalysis/LHCRun2/CxAODFramework/CxAODMaker/trunk CxAODMaker
svn co svn+ssh://svn.cern.ch/reps/atlasphys/Physics/Higgs/HSG5/software/VHAnalysis/LHCRun2/CxAODFramework/CxAODReader/trunk CxAODReader
svn co svn+ssh://svn.cern.ch/reps/atlasphys/Physics/Higgs/HSG5/software/VHAnalysis/LHCRun2/CxAODFramework/CxAODTools/trunk CxAODTools
svn co svn+ssh://svn.cern.ch/reps/atlasphys/Physics/Higgs/HSG5/software/VHAnalysis/LHCRun2/CxAODFramework/FrameworkExe/trunk FrameworkExe
svn co svn+ssh://svn.cern.ch/reps/atlasphys/Physics/Higgs/HSG5/software/VHAnalysis/LHCRun2/CxAODFramework/FrameworkSub/trunk FrameworkSub
svn co svn+ssh://svn.cern.ch/reps/atlasphys/Physics/Higgs/HSG5/software/VHAnalysis/LHCRun2/CxAODFramework/TupleMaker/trunk TupleMaker
svn co svn+ssh://svn.cern.ch/reps/atlasphys/Physics/Higgs/HSG5/software/VHAnalysis/LHCRun2/CxAODFramework/TupleReader/trunk TupleReader
Then do:
setupATLAS
rcSetup Base,2.0.23
rc find_packages
rc compile
hsg5framework myDir
The last command (hsg5framework myDir) automatically creates a directory called myDir. Inside it you will find a directory called data-outputLabel/, which contains a file (a "CxAOD", which has an xAOD structure) with all the trees saved by the analysis framework. There is also a directory called output-outputLabel/, which contains a file with some metadata stored inside. There is also a command, hsg5frameworkTuple, that produces both a CxAOD and an ntuple. The folder structure should be the same, but with extra data-tuple/ and output-tuple/ directories.
Next time you log in you only have to go to your analysis directory and run
setupATLAS
rcSetup
and you should then be able to run the framework.
To run on the grid, first set up the environment:
setupATLAS
localSetupDQ2Client --skipConfirm   # only needed for dq2-get, not for job submission
voms-proxy-init -voms atlas
localSetupPandaClient --noAthenaCheck
rcSetup
The command to submit to the grid is
python FrameworkExe/scripts/run_grid.py MyDatasetList.txt
The file MyDatasetList.txt should contain a list of datasets, one per line, e.g.
mc14_8TeV.117050.PowhegPythia_P2011C_ttbar.merge.DAOD_HIGG5D2.e1727_s1933_s1911_r5591_r5625_p1784
data12_8TeV.00204240.physics_Muons.merge.DAOD_HIGG5D2.r5724_p1751_p1784
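Because the list must have exactly one dataset per line, it can be worth checking the file before submitting. The following is a minimal sketch of such a check and is not part of the framework: the function name read_dataset_list is hypothetical, and the rule that a dataset name contains no whitespace is an assumption made here for illustration.

```python
import tempfile
from pathlib import Path


def read_dataset_list(path):
    """Return the dataset names in a list file, one per line, skipping blank lines.

    Raises ValueError if a line contains whitespace inside it, which usually
    means that two dataset names ended up on the same line.
    """
    names = []
    for number, line in enumerate(Path(path).read_text().splitlines(), start=1):
        name = line.strip()
        if not name:
            continue  # ignore empty lines
        if len(name.split()) > 1:
            raise ValueError(f"line {number}: more than one dataset on a line")
        names.append(name)
    return names


if __name__ == "__main__":
    # Write a small example list to a temporary directory and read it back.
    with tempfile.TemporaryDirectory() as tmp:
        list_file = Path(tmp) / "MyDatasetList.txt"
        list_file.write_text(
            "mc14_8TeV.117050.PowhegPythia_P2011C_ttbar.merge.DAOD_HIGG5D2."
            "e1727_s1933_s1911_r5591_r5625_p1784\n"
            "data12_8TeV.00204240.physics_Muons.merge.DAOD_HIGG5D2."
            "r5724_p1751_p1784\n"
        )
        for dataset in read_dataset_list(list_file):
            print(dataset)
```

A list with two names on one line is rejected with an error that gives the line number.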
The file itself must be located in $MyAnalysisDir/FrameworkSub/In/. You will also want to edit $MyAnalysisDir/FrameworkExe/data/framework-run.cfg. It contains the vtag variable, which can be used to configure the output names.
The python script runs the hsg5frameworkTuple command for each of the datasets in the .txt file. It creates an output name, but this is currently ignored by the C++ code, which builds the output name itself using the vtag variable.
To make plots there is an executable called hsg5frameworkReadCxAOD, which currently has some hard-coded information that you will need to edit. Open FrameworkExe/util/hsg5frameworkReadCxAOD.cxx and edit the variable dataset_dir. This is where the input files should be located. There is also a vector called sample_names, which contains the names of the folders for each separate background. The code will make one file filled with histograms for each of the backgrounds in this vector.
(Note: FrameworkExe/data/framework-read.cfg is the config file. Currently it only sets the number of events and the analysis type (0, 1 or 2 lepton).)
An example might make things clearer. Imagine a simple analysis that just has muon data, top and Zbb. The folder structure should be as follows:
/my/output/folder/
/my/output/folder/muon_data
/my/output/folder/muon_data/*data-outputLabel*
/my/output/folder/muon_data/*output-outputLabel*
/my/output/folder/top ....
/my/output/folder/Zbb .....
(This is convenient because we can go to our muon_data directory and run dq2-get to download all our muon data there. Because of the wildcard (*) character, the code will automatically pick up the output files from the different grid sites and run over all of them.)
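The effect of the wildcard in the layout above can be illustrated with a short sketch. This is not the framework's C++ code, only an illustration of the same matching idea: the function name find_sample_dirs and the per-site directory names used in the demonstration are hypothetical.

```python
import glob
import os
import tempfile


def find_sample_dirs(dataset_dir, sample_names, pattern="*data-outputLabel*"):
    """Map each sample name to the sorted list of paths matching the wildcard.

    A sample whose folder is missing or empty maps to an empty list.
    """
    found = {}
    for sample in sample_names:
        matches = glob.glob(os.path.join(dataset_dir, sample, pattern))
        found[sample] = sorted(matches)
    return found


if __name__ == "__main__":
    # Build a toy version of the folder structure, with made-up names
    # standing in for the outputs downloaded from two grid sites.
    with tempfile.TemporaryDirectory() as top:
        for sub in (
            "muon_data/siteA.data-outputLabel.1",
            "muon_data/siteB.data-outputLabel.2",
            "muon_data/siteA.output-outputLabel.1",
            "top/siteA.data-outputLabel.1",
        ):
            os.makedirs(os.path.join(top, sub))
        samples = find_sample_dirs(top, ["muon_data", "top", "Zbb"])
        for sample, paths in samples.items():
            print(sample, len(paths))
```

Every directory whose name contains data-outputLabel is picked up regardless of its prefix or suffix, so outputs from several grid sites can sit side by side in one sample folder.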
You then need to add the names muon_data, top and Zbb to the sample_names vector in FrameworkExe/util/hsg5frameworkReadCxAOD.cxx and recompile by running rc compile from $MyAnalysisDir. You can now run the executable:
hsg5frameworkReadCxAOD MyOutDir
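You can also set the cross section for each sample in this file. You do this using SampleHandler. An example for setting the top cross section:
// Look up the sample by name; the if below guards against it not being found
SH::Sample* top = sampleHandler.get ("topMC");
// Attach the value to the sample as metadata under the key "sigmaEff"
if (top) top->setMetaDouble ("sigmaEff", 252.89*0.543);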
This adds metadata to the output file that can be accessed later on when making the stacked plots.
Running the executable will make the output files hist-Zbb.root, hist-muon_data.root and hist-top.root.
To make stacked plots from these files, first check out the plotting tool:
svn co svn+ssh://svn.cern.ch/reps/atlasphys/Physics/Higgs/HSG5/software/VHAnalysis/LHCRun2/InputsProcessingTools/PlottingTool/ PlottingTool
We then need to edit PlotCxAODReader.cxx and makePlots2Lepton.cxx. In the former there is a map called sampleNames. It maps the names used in the sample_names vector from above to the names used in the makePlots file, where samples are added with addBackgroundSample("name from Reader", "name that appears on plot legend", colour). The same file is also where the cross section information saved as metadata above is accessed and used for normalisation.
Editing the latter file is only needed to select which plots to make, etc.
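For orientation, a cross section is usually turned into a normalisation by scaling each MC histogram by cross section times integrated luminosity divided by the number of generated events. The sketch below shows only that arithmetic and is not taken from PlotCxAODReader.cxx: the function name mc_scale_factor, the luminosity and the event count are hypothetical, and the cross section is assumed to be in pb.

```python
def mc_scale_factor(sigma_eff_pb, luminosity_ipb, n_generated):
    """Return the weight sigma * L / N that scales MC to the expected yield.

    sigma_eff_pb   : effective cross section in pb (assumed unit)
    luminosity_ipb : integrated luminosity in pb^-1
    n_generated    : number of generated MC events (sum of weights)
    """
    if n_generated <= 0:
        raise ValueError("n_generated must be positive")
    return sigma_eff_pb * luminosity_ipb / n_generated


if __name__ == "__main__":
    sigma_eff = 252.89 * 0.543  # value stored as "sigmaEff" in the example above
    luminosity = 1000.0         # hypothetical integrated luminosity in pb^-1
    n_generated = 1.0e6         # hypothetical number of generated events
    print(mc_scale_factor(sigma_eff, luminosity, n_generated))
```

Each bin of the sample's histogram would be multiplied by this factor before the histogram is added to the stack.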
We can then run
root -b -q 'FrameworkExe/macros/runCxAODPlots.cxx("MyOutDir")'
where MyOutDir is the directory that contains the hist-*.root files.