Run 2 H->bb twiki
This TWiki documents the Run 2 H->bb analysis and also collects useful notes on the code.
xAODs
The new data format for Run 2 is the xAOD, an AOD that can be read directly in ROOT. The following sets up a release and inspects an xAOD file:
setupATLAS
asetup here,19.1.X.Y-VAL,rel_3,AtlasDerivation
checkSG.py xAOD.pool.root
Shower Deconstruction
Shower deconstruction is a jet substructure method that discriminates between signal and background using fat jets whose constituents have been reclustered into small-R subjets. It takes these subjets and calculates how similar they are to a signal model. To do this it considers every possible permutation in which each subjet is assigned to, in our case, one of the two b-jets, the underlying event, ISR or FSR (this list may be incomplete), and sums over all permutations. It then does the same for a background model. The ratio of these two sums is called χ and is the variable used to cut or train on.
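In symbols, the ratio described above can be sketched as follows. The notation is an assumption made here for illustration and is not taken from the analysis code: $\{p\}_N$ denotes the N subjet four-momenta, $h$ runs over the possible subjet assignments, and $S$ and $B$ label the signal and background models.

```latex
\chi(\{p\}_N) = \frac{\sum_{h} P(\{p\}_N, h \mid S)}{\sum_{h} P(\{p\}_N, h \mid B)}
```

Larger values of χ correspond to more signal-like jets.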
(References to be added.)
CERN Run 2 Analysis Framework
The VH group analysis framework is based on EventLoop, which is documented here:
https://twiki.cern.ch/twiki/bin/view/AtlasProtected/EventLoop
The VH TWiki can be found here:
https://twiki.cern.ch/twiki/bin/view/AtlasProtected/CxAODFramework
To check out the packages, create a directory to hold the framework. It is referred to below as $MyAnalysisDir. From $MyAnalysisDir, run the following:
svn co svn+ssh://svn.cern.ch/reps/atlasphys/Physics/Higgs/HSG5/software/VHAnalysis/LHCRun2/CxAODFramework/CxAODMaker/trunk CxAODMaker
svn co svn+ssh://svn.cern.ch/reps/atlasphys/Physics/Higgs/HSG5/software/VHAnalysis/LHCRun2/CxAODFramework/CxAODReader/trunk CxAODReader
svn co svn+ssh://svn.cern.ch/reps/atlasphys/Physics/Higgs/HSG5/software/VHAnalysis/LHCRun2/CxAODFramework/CxAODTools/trunk CxAODTools
svn co svn+ssh://svn.cern.ch/reps/atlasphys/Physics/Higgs/HSG5/software/VHAnalysis/LHCRun2/CxAODFramework/FrameworkExe/trunk FrameworkExe
svn co svn+ssh://svn.cern.ch/reps/atlasphys/Physics/Higgs/HSG5/software/VHAnalysis/LHCRun2/CxAODFramework/FrameworkSub/trunk FrameworkSub
svn co svn+ssh://svn.cern.ch/reps/atlasphys/Physics/Higgs/HSG5/software/VHAnalysis/LHCRun2/CxAODFramework/TupleMaker/trunk TupleMaker
svn co svn+ssh://svn.cern.ch/reps/atlasphys/Physics/Higgs/HSG5/software/VHAnalysis/LHCRun2/CxAODFramework/TupleReader/trunk TupleReader
Local Test Run
To run the framework, check out all the packages listed above and then run:
setupATLAS
rcSetup Base,2.0.23
rc find_packages
rc compile
hsg5framework myDir
This automatically creates a directory called myDir. Inside it is a directory called data-outputLabel/, which contains a file (a "CxAOD", which has an xAOD structure) with all the trees saved by the analysis framework. A second directory, output-outputLabel/, contains a file holding some metadata.
There is also a command, hsg5frameworkTuple, that produces both a CxAOD and an ntuple. The directory structure is the same, with additional data-tuple/ and output-tuple/ directories.
Next time you log in you only have to go to your analysis directory and run
setupATLAS
rcSetup
and you should then be able to run the framework.
Running On The Grid
Grid jobs are submitted with the EventLoop grid plugin, which is also described on the EventLoop TWiki linked above. To submit to the grid, start from a clean terminal and run:
setupATLAS
localSetupDQ2Client --skipConfirm # just for dq2-get -> no need for job submission
voms-proxy-init -voms atlas
localSetupPandaClient --noAthenaCheck
rcSetup
The command to submit to the grid is
python FrameworkExe/scripts/run_grid.py MyDatasetList.txt
The file MyDatasetList.txt should list one dataset per line, e.g.
mc14_8TeV.117050.PowhegPythia_P2011C_ttbar.merge.DAOD_HIGG5D2.e1727_s1933_s1911_r5591_r5625_p1784
data12_8TeV.00204240.physics_Muons.merge.DAOD_HIGG5D2.r5724_p1751_p1784
The file itself must be located in $MyAnalysisDir/FrameworkSub/In/. You will also want to edit $MyAnalysisDir/FrameworkExe/data/framework-run.cfg, which contains the vtag variable that can be used to configure the output names.
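The dataset list format described above can be checked before submitting. The following is a minimal Python sketch and is not part of the framework: the function names and field labels are hypothetical, and the assumption that every name has six dot-separated fields is based only on the two example names above.

```python
from typing import Dict, List

# Labels for the dot-separated fields; these names are illustrative only.
FIELDS = ("project", "number", "name", "step", "data_type", "tags")


def read_dataset_list(text: str) -> List[str]:
    """Return the dataset names in `text`, one per non-empty line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def split_dataset_name(dataset: str) -> Dict[str, str]:
    """Split a dataset name into labelled fields, or raise ValueError."""
    parts = dataset.split(".")
    if len(parts) != len(FIELDS):
        raise ValueError("unexpected dataset name: %r" % dataset)
    return dict(zip(FIELDS, parts))


def is_data(dataset: str) -> bool:
    """True if the project field starts with 'data' (assumed convention)."""
    return split_dataset_name(dataset)["project"].startswith("data")


example = """\
mc14_8TeV.117050.PowhegPythia_P2011C_ttbar.merge.DAOD_HIGG5D2.e1727_s1933_s1911_r5591_r5625_p1784
data12_8TeV.00204240.physics_Muons.merge.DAOD_HIGG5D2.r5724_p1751_p1784
"""

for name in read_dataset_list(example):
    kind = "data" if is_data(name) else "MC"
    print(kind, split_dataset_name(name)["number"])
```

A malformed line (for example a name with a missing field) raises ValueError, which is easier to diagnose locally than a failed grid submission.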
The Python script runs the hsg5frameworkTuple command for each dataset in the .txt file. It creates an output name, but this is currently ignored by the C++ code, which builds the output name itself using the vtag variable.
Making Plots From Your Shiny New Grid Files
Plots are made with an executable called hsg5frameworkReadCxAOD, which currently contains some hard-coded information that you will need to edit. Open FrameworkExe/util/hsg5frameworkReadCxAOD.cxx and set the variable dataset_dir to the directory where the input files are located. There is also a vector called sample_names, which contains the names of the folders for each separate sample (background or data). The code writes one file of histograms for each entry in this vector.
(Note: FrameworkExe/data/framework-read.cfg is the config file. Currently it only sets the number of events and the analysis type (0, 1 or 2 lepton).)
An example might make things clearer. Imagine a simple analysis that has only muon data, top and Zbb. The folder structure should be as follows:
/my/output/folder/
/my/output/folder/muon_data
/my/output/folder/muon_data/*data-outputLabel*
/my/output/folder/muon_data/*output-outputLabel*
/my/output/folder/top
....
/my/output/folder/Zbb
.....
(This is convenient because we can go to the muon_data directory and run dq2-get to download all our muon data there. Because of the wildcard (*) character, the code automatically picks up the output files from the different grid sites and runs over all of them.)
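The wildcard matching described above can be checked by hand before running the reader. The following Python sketch is not the framework's own lookup code: the function name collect_outputs is hypothetical, and the default pattern is copied from the example layout above.

```python
import glob
import os
from typing import Dict, List


def collect_outputs(dataset_dir: str,
                    sample_names: List[str],
                    pattern: str = "*data-outputLabel*") -> Dict[str, List[str]]:
    """Map each sample name to the sorted paths matching `pattern` in its folder."""
    found = {}
    for sample in sample_names:
        # e.g. /my/output/folder/muon_data/*data-outputLabel*
        search = os.path.join(dataset_dir, sample, pattern)
        found[sample] = sorted(glob.glob(search))
    return found


if __name__ == "__main__":
    outputs = collect_outputs("/my/output/folder", ["muon_data", "top", "Zbb"])
    for sample, paths in outputs.items():
        print("%s: %d matching entries" % (sample, len(paths)))
```

A sample that reports zero matching entries usually points to a misspelt folder name or an incomplete download.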
You then need to add the names muon_data, top and Zbb to the sample_names vector in FrameworkExe/util/hsg5frameworkReadCxAOD.cxx and recompile from $MyAnalysisDir with
rc compile
You can now run the executable:
hsg5frameworkReadCxAOD MyOutDir
You can also set the cross section for each sample in this file, using SampleHandler. For example, to set the top cross section:
SH::Sample* top=sampleHandler.get ("topMC");
if (top) top->setMetaDouble ("sigmaEff", 252.89*0.543);
This adds metadata to the output file that can be accessed later on when making the stacked plots.
Running the executable produces the output files hist-Zbb.root, hist-muon_data.root and hist-top.root.
To make stacked plots from these files, first check out the plotting tool:
svn co svn+ssh://svn.cern.ch/reps/atlasphys/Physics/Higgs/HSG5/software/VHAnalysis/LHCRun2/InputsProcessingTools/PlottingTool/ PlottingTool
We then need to edit PlotCxAODReader.cxx and makePlots2Lepton.cxx. The former contains a map called sampleNames, which maps the names used in the sample_names vector above to the names used in the makePlots file with
addBackgroundSample("name from Reader", "name that appears on plot legend", colour)
This file is also where the cross section information saved as metadata above is accessed and used for normalisation.
The latter file only needs editing to select which plots to make, etc.
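As a rough illustration of how a stored cross section is typically used for normalisation, the Python sketch below computes the usual MC scale factor (cross section times luminosity, divided by the number of generated events). It is an assumption about the general technique, not a copy of what the plotting tool does. The function name, the luminosity and the event count are hypothetical, and the units are assumed to be pb and pb^-1 because the source does not state them.

```python
def mc_scale_factor(sigma_eff, luminosity, n_generated):
    """Scale factor that normalises an MC histogram to the expected yield.

    sigma_eff   : effective cross section (assumed to be in pb)
    luminosity  : integrated luminosity (in pb^-1, to match sigma_eff)
    n_generated : number (or sum of weights) of generated events processed
    """
    if n_generated <= 0:
        raise ValueError("n_generated must be positive")
    return sigma_eff * luminosity / n_generated


# sigmaEff is the value from the SampleHandler example above; the luminosity
# and event count are made-up numbers for illustration only.
top_scale = mc_scale_factor(252.89 * 0.543, luminosity=20000.0, n_generated=1.0e6)
print("top scale factor: %.4f" % top_scale)
```

Each MC histogram would be multiplied by its scale factor before stacking. Data histograms are left unscaled.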
The plots can then be produced with
root -b -q 'FrameworkExe/macros/runCxAODPlots.cxx("MyOutDir")'
where MyOutDir is the directory that contains the hist-*.root files.
--
Paul Mullen - 2014-10-02