Introduction

Documenting the setup of the Higgsbb analysis code used by CERN.

(Note: If something goes wrong, run make clean on anything you have run make on, then use a clean shell and retry.)
Running Mode

The code can be run in two different modes: locally, or using PROOF (Parallel ROOT Facility). PROOF allows analysis of a large number of ROOT files in parallel using multiple machines or processor cores.

Running Locally

Code Checkout

The latest version of the code can be checked out by doing

mkdir myVH_research
cd myVH_research
svn co svn+ssh://svn.cern.ch/reps/atlasusr/mbellomo/ElectroweakBosons/trunk

However, it is probably best to use the latest tagged version of the code, as this should (but won't necessarily) be stable. This can be checked out by doing

mkdir myVH_research
cd myVH_research
svn co svn+ssh://svn.cern.ch/reps/atlasusr/mbellomo/ElectroweakBosons/tags ElectroweakBosons-xx-xx-xx

where xx-xx-xx is the tag number. Note: you may want to set up Kerberos.

Compiling the Code

(Note: for more complete instructions see the README file in the trunk subdirectory)

(1) To set up the environment:
cd myVH_research/trunk
source scripts/setup_lxplus.sh

This will set up the required environment variables, the GNU C++ compiler and ROOTCORE.

(2) To download and compile any extra software required, do

./scripts/get_allcode.sh
cd SFrame
source setup.sh
cd ..

(Note: If the extra code fails to compile due to the MET package, you need to remove the code and run the 2012 versions of all the scripts.)

(3) Next, compile the software by doing

make

(You can also clean up by doing make clean and make distclean.)
(4) For the purposes of analysing H->bb decays, the code is stored in the AnalysisWZorHbb/ directory. We now need to compile this code before we can do the analysis.

cd AnalysisWZorHbb
make

Configuring and Running

Now we can run our analysis by doing, for example,

sframe_main config/WZorHbb_config_mc_nu_2011.xml

The .xml file is where the input files and settings are all defined (e.g. what corrections to apply to the data) and can be modified or written to suit the needs of the analysis. For example, to do a H->bb analysis involving zero leptons that makes flat ntuples you need the following lines.

<UserConfig>
  &setup;
  <Tool Name="BaselineZeroLepton" Class="WZorHbb">
    &tool_base;
    <Item Name="runNominal" Value="True"/>
    <Item Name="ntupleName" Value="Ntuple"/>
  </Tool>
</UserConfig>

The "BaselineZeroLepton" option is what sets the analysis type. The string is parsed to determine whether we are doing a 0, 1 or 2 lepton analysis and also whether we want to do an electron, muon or nominal analysis. (See around line 250 in WZorHbb.cxx for the code that performs the selection.)

(Note: every time you open a shell to run the code you need to do steps (1), (3) and (4) from the compilation instructions again, i.e. set up the environment and make the code. It is also a good idea to run make clean before you recompile anything.)

(Note: It is advisable to use BaselineOneLepton for both the 1 lepton and 2 lepton analyses, then differentiate between the two in the config file that is passed to RunHistMaker.C)
In order to do a full analysis with the whole CERN framework, this code must produce three files: one for Monte Carlo, one for electrons and one for muons. To do this, three separate "cycles" must be defined in the .xml file. A cycle is defined as follows

<Cycle Name="AnalysisManager" RunMode="LOCAL" ProofServer="lite://"
       ProofWorkDir="/afs/cern.ch/user/p/pmullen/tag-tmp/" ProofNodes="-1"
       UseTreeCache="True" TreeCacheSize="30000000" TreeCacheLearnEntries="10"
       OutputDirectory="/afs/cern.ch/user/p/pmullen/tag-tmp/" PostFix="" TargetLumi="1.0">
  <InputData Lumi="1.0" NEventsMax="-1" Type="mc12_8TeV" Version="OneLepton.Nominal" Cacheable="True" SkipValid="True">
    &pauls_mc;
    <InputTree Name="physics" />
    <OutputTree Name="Ntuple" />
  </InputData>
  <UserConfig>
    &setup;
    <Item Name="CorrectFatJet" Value="False"/>
    <Tool Name="BaselineOneLepton" Class="WZorHbb">
      &tool_base;
      <Item Name="runNominal" Value="True"/><!--true-->
      <Item Name="ntupleName" Value="Ntuple"/>
    </Tool>
  </UserConfig>
</Cycle>

where &pauls_mc; refers to a file with paths to the Monte Carlo input files. For contrast, another cycle configuration is shown below, this time for an electron analysis in data. (Note: The first cycle must be closed with the </Cycle> tag before opening another cycle.)

<Cycle Name="AnalysisManager" RunMode="LOCAL" ProofServer="lite://"
       ProofWorkDir="/afs/cern.ch/user/p/pmullen/tag-tmp/" ProofNodes="-1"
       UseTreeCache="True" TreeCacheSize="30000000" TreeCacheLearnEntries="10"
       OutputDirectory="/afs/cern.ch/user/p/pmullen/tag-tmp/" PostFix="" TargetLumi="1.0">
  <InputData Lumi="1.0" NEventsMax="-1" Type="data12_8TeV" Version="OneLepton.Electron" Cacheable="True" SkipValid="True">
    &pauls_data;
    <InputTree Name="physics" />
    <OutputTree Name="Ntuple" />
  </InputData>
  <UserConfig>
    &setup;
    <Item Name="CorrectFatJet" Value="False"/>
    <Tool Name="OneLepton" Class="WZorHbb">
      &tool_base;
      <Item Name="runNominal" Value="True"/>
      <Item Name="ntupleName" Value="Ntuple"/>
    </Tool>
  </UserConfig>
</Cycle>

The config file and the file containing the paths to the Monte Carlo files are attached for reference.

Producing Plots

The output from running sframe is stored in .root files as common ntuples. To produce plots from these output files we need to use HistMaker by doing
cd macros
make

to make the required code. To run the code we use the command

./RunHistMaker <path to config file> <runMJ> <path to files output by sframe> <output directory>

where <runMJ> is true or false and selects whether the multijet (MJ) histograms are produced, e.g.

./RunHistMaker configs/zerolepton.config false $HOME/out/AnalysisManager.mc11_7TeV.ZeroLepton.Nominal.root ./

This will output a .root file containing histograms.

If RunHistMaker crashes saying it did not find the cross-section (xsec) for a given number (corresponding to a dataset), you may need to edit the function InitCrossSections2011() in the file macros/src/process.cpp. You will need to add the following code

allXSecs.push_back(xsection(dataset, cross-section, k factor, 1, "sample name"));

e.g.

allXSecs.push_back(xsection(109351, 4.0638E-05, 1, 1,"ZHnunu"));

(The meaning of the fourth argument, 1, still needs to be found out.)

ATLAS TWiki pages can be used to find the cross-sections for 7 TeV data. InitCrossSection2012() pulls the information from a file called configs/2012_cross_sections.txt, which can be edited to add the information for your particular dataset.
Adding Extra Plots

Stacking Plots

Stacked plots can be produced using the RunPlotMaker.C program. (This uses the plotter.cpp class.) The current implementation of the program expects input files to follow a naming convention that contains "OneLepton" in the .root file name. This needs to be changed depending on what you had in your .xml file. It also has hard-coded links to a home directory that need to be changed. Choosing what systematics etc. you want to run is done with command line flags. My current version of the edited RunPlotMaker.C can be found here. It has been edited so that you can give it the path to a folder containing your files as a command line argument. It can be run as follows

./RunPlotMaker -Data12 -llbb -NominalLimitFile -Indir $HOME/out/

It expects to find 6 files in the $HOME/out/ folder.

AnalysisManager.data12_8TeV.OneLepton.Electron.Hists.MJ.root
AnalysisManager.data12_8TeV.OneLepton.Electron.Hists.root
AnalysisManager.data12_8TeV.OneLepton.Muon.Hists.MJ.root
AnalysisManager.data12_8TeV.OneLepton.Muon.Hists.root
AnalysisManager.mc12_8TeV.OneLepton.Nominal.Hists.MJ.root
AnalysisManager.mc12_8TeV.OneLepton.Nominal.Hists.root

Note the "OneLepton" in all the file names. This is because I have yet to change the hard-coded file name format in the script. The "MJ" (multijet) files were produced by the RunHistMaker program by changing the RunMJ parameter to True. If you do not want to have to rename your files before using this program, then simply edit your sframe .xml config file as follows

<InputData Lumi="1.0" NEventsMax="-1" Type="mc12_8TeV" Version="OneLepton.Nominal" Cacheable="True" SkipValid="True">
  <input data files here>
  <InputTree Name="physics" />
  <OutputTree Name="Ntuple" />
</InputData>

For the data files change the Type to "data12_8TeV" and the Version to "OneLepton.Muon" or "OneLepton.Electron". This does not change the analysis type, only the output file name.

To change the analysis type, change the Tool Name, e.g. from "BaselineTwoLepton" to "BaselineTwoLeptonElectron" or "BaselineTwoLeptonMuon".
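As a rough sketch, the six file names listed above can be generated and checked for in a shell before running RunPlotMaker. The two function names are hypothetical and not part of the analysis package; the naming pattern is simply copied from the list above, so it has to be adapted if the hard-coded names in the script change.

```shell
# Hypothetical helpers, not part of the analysis package.

# Print the six file names RunPlotMaker is described as expecting.
expected_plot_inputs() {
    for sample in data12_8TeV.OneLepton.Electron \
                  data12_8TeV.OneLepton.Muon \
                  mc12_8TeV.OneLepton.Nominal; do
        for suffix in Hists.MJ.root Hists.root; do
            echo "AnalysisManager.$sample.$suffix"
        done
    done
}

# Print every expected file that is not present in the directory $1.
missing_plot_inputs() {
    for name in $(expected_plot_inputs); do
        [ -e "$1/$name" ] || echo "$name"
    done
}
```

For example, missing_plot_inputs $HOME/out prints nothing when all six files are in place.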
Adding Extra Stacked Plots

Running Using PROOF on Demand (PoD)

PROOF is a framework for running many ROOT analyses in parallel. It can be run locally using PROOF-Lite, which will utilise multiple cores on one machine, or it can be run on many machines using the PoD framework.

Installing and Building

To install PROOF on Demand, first we must get and unpack the code.

cd $HOME
wget http://pod.gsi.de/releases/pod/3.10/PoD-3.10-Source.tar.gz
tar -xzvf PoD-3.10-Source.tar.gz

To build the code do

cd $HOME/PoD-3.10-Source
mkdir build
cd build
cmake -C ../BuildSetup.cmake ..
make
make install
cd
cd myVH_research/ElectroweakBosons/trunk
source scripts/setup_pod_lxplus.sh
cd
cd PoD/3.10/
source PoD_env.sh

To change the working directory PoD uses, along with many other settings, just edit $HOME/.PoD/PoD.cfg. (Note: For those using an AFS directory, you need to change the [server] work_dir parameter to something like /tmp/$USER/PoD.)

Now create a file called $HOME/.PoD/user_worker_env.sh and put the line source /afs/cern.ch/sw/lcg/contrib/gcc/4.3.5/x86_64-slc5-gcc43-opt/setup.sh in it.

Then create a file called $HOME/.PoD/user_xpd.cf and put the line xpd.rootd allow in it.
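The two files described above can also be written from the shell. This is a minimal sketch: write_pod_user_files is a hypothetical helper name, the directory defaults to $HOME/.PoD, and any existing copies of the two files are overwritten.

```shell
# Hypothetical helper: write the two PoD user files described above.
# $1 is the PoD configuration directory (defaults to $HOME/.PoD).
# Existing files with the same names are overwritten.
write_pod_user_files() {
    pod_dir="${1:-$HOME/.PoD}"
    mkdir -p "$pod_dir" || return 1
    # Environment script sourced on the worker nodes.
    echo 'source /afs/cern.ch/sw/lcg/contrib/gcc/4.3.5/x86_64-slc5-gcc43-opt/setup.sh' \
        > "$pod_dir/user_worker_env.sh"
    # Extra xproofd setting.
    echo 'xpd.rootd allow' > "$pod_dir/user_xpd.cf"
}
```

Calling write_pod_user_files with no argument writes to $HOME/.PoD.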
DEPRECATED? ----------

Now create a file called $HOME/.PoD/user_worker_env.sh and put the following settings in it

#! /usr/bin/env bash
echo "Setting user environment for workers ..."
export LD_LIBRARY_PATH=/afs/cern.ch/sw/lcg/external/qt/4.4.2/x86_64-slc5-gcc43-opt/lib:\
/afs/cern.ch/sw/lcg/external/Boost/1.44.0_python2.6/x86_64-slc5-gcc43-opt//lib:\
/afs/cern.ch/sw/lcg/app/releases/ROOT/5.30.01/x86_64-slc5-gcc43-opt/root/lib:\
/afs/cern.ch/sw/lcg/contrib/gcc/4.3.5/x86_64-slc5-gcc34-opt/lib64:\
/afs/cern.ch/sw/lcg/contrib/mpfr/2.3.1/x86_64-slc5-gcc34-opt/lib:\
/afs/cern.ch/sw/lcg/contrib/gmp/4.2.2/x86_64-slc5-gcc34-opt/lib
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"

END ----------

Using PROOF

We now have PROOF installed, but before doing any analysis we must first set up the correct environment. It is advised that after installing PROOF you start from a clean shell then do
cd myVH_research/trunk
source scripts/setup_lxplus.sh
source scripts/setup_pod_lxplus.sh
make
cd AnalysisWZorHbb
make

(Note: It is advisable to delete your ~/.proof and /tmp/$USER directories before starting PROOF.)

To start the PROOF server we can do pod-server start. You can now request worker nodes. To do this we use the pod-submit command. For example, to request 20 worker nodes we would do

pod-submit -r lsf -q 1nh -n 20

where 1nh is the name of the queue we are submitting the jobs to and lsf is the name of the resource management system.
Now that we have set up our environment and requested worker nodes we can use the nodes for our analysis. To run the analysis we use the same command as before: sframe_main config/proof_WZorHbb_config_mc_nu_2011.xml

(Note: It is advisable to use nohup sframe_main config/proof_WZorHbb_config_mc_nu_2011.xml &> output.txt & so that terminal crashes will not kill your job.)

However, our .xml file will look different from the one for our non-PROOF analysis. We must change the RunMode option to RunMode="PROOF", and the ProofServer option must also be changed. It must look like ProofServer="username@host:portnumber", where the port number, host and username are those output by the pod-server start command or by pod-info -c. For example, ProofServer="pmullen@lxplus438.cern.ch:21002"
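The two attribute changes described above can be applied to a copy of an existing config with sed. This is only a sketch: to_proof_config is a hypothetical helper, it assumes the attributes are written as Name="value" exactly as in the cycle examples on this page, and it rewrites every cycle in the file. The server string must not contain the characters | or &, which are special in the sed expressions used here.

```shell
# Hypothetical helper: print a copy of an SFrame config with RunMode and
# ProofServer rewritten for PROOF. The input file is left untouched.
#   $1 = input .xml file
#   $2 = server string of the form username@host:portnumber
to_proof_config() {
    sed -e 's|RunMode="[^"]*"|RunMode="PROOF"|' \
        -e "s|ProofServer=\"[^\"]*\"|ProofServer=\"$2\"|" "$1"
}
```

For example: to_proof_config config/WZorHbb_config_mc_nu_2011.xml pmullen@lxplus438.cern.ch:21002 > config/proof_WZorHbb_config_mc_nu_2011.xml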
Adding D3PD variables to our Algorithm

In order to use D3PD variables that are not already used by the CERN code you must first find the details of the variable. The easiest way is to load the ROOT file in CINT (root -l myfile.root) and then use ROOT's MakeClass function to produce a header file that could be used to read the tree. This can be done on the CINT command line by doing treename->MakeClass("MyClassName"). This will write out a MyClassName.C and a MyClassName.h file. The details of how to read the variable you want can then be found in these files.

Once you have the details of the variable, add it to the AnalysisBase/include/EventBase.h and AnalysisBase/include/Event.h files so that any code that inherits from this code is aware of the variable (the AnalysisWZorHbb code inherits from this, as does all the code that comes from the CERN SVN package).

Now that we have done this we can add our variable to our function. In the case of the AnalysisWZorHbb code we would go to /AnalysisWZorHbb/src/WZorHbb.cxx and add ConnectVariable("MyD3PDVariableName") to the BeginInputFile function.

For instance, for the jet_AntiKt4TopoEM_WIDTH variable we would find in MyClassName.h the line that declares it as a vector. We would then go to AnalysisBase/include/EventBase.h and add this line there. We would then go to /AnalysisWZorHbb/src/WZorHbb.cxx and add ConnectVariable("jet_WIDTH"). Note that we have removed the name of the algorithm from the variable name. This is the convention used; the algorithm name is added by the ConnectVariable function. We must also add the corresponding vector declaration to AnalysisBase/include/Event.h.

Adding the Variable to the Output Tree

For the variable to be written to the TTree in the SFrame section of the code we must create an output variable. To do this, first add a variable of the correct type to the algorithm's header file. Now go to the algorithm's source file and use DeclareVariable in the BeginInputData() function to declare a branch in the output tree. Make sure to also clear() the variable in the ResetNtupleVars() function. You can now fill this variable in the fillNTuple() function, probably using push_back.

As an example, let's say we want to output the jet_WIDTH variable from the previous section. We would first add a std::vector named pauls_jet_WIDTH in the header file. In the source file we would go to the BeginInputData() function and add DeclareVariable(pauls_jet_WIDTH, "pauls_jet_WIDTH", treename); where pauls_jet_WIDTH is the name of the branch in our output tree and treename is our output tree. Also add pauls_jet_WIDTH.clear(); in the ResetNtupleVars() function. To fill this variable we would then add pauls_jet_WIDTH.push_back(ev.jet_WIDTH->at(i)); to the fillNTuple() function.
Using RunPlotMaker

Making an Input File

To use RunPlotMaker to make an input file do the following. Get a file named listMCFiles.lst that lists all the Monte Carlo background input files to RunHistMaker. This can usually be found on EOS. (I got it from /eos/atlas/user/g/gfacini/Hbb/Pub2014/SFrame/2012/NTuple_20140119/listMCFiles.lst) Then run

source scripts/tar_area.sh SomeName_v0.0

This creates a directory of that name in ~/work. You then do

source scripts/submit2012.sh

This file can be edited to decide which systematics to include. For none, just set the systematic to Nominal. This will submit all the jobs to the batch system and will return the output to your EOS directory, named something like /eos/atlas/user/p/pmullen/Hbb/Pub2014/histMaker/2012. (Note: When the directory is tarred it creates a file called scripts/run_now.txt that has SomeName_v0.0 in it, and the submit2012.sh script looks in this text file for the name of the tarball to use for the submission.)

The next step is to calculate the MJ scale factors and scale the plots accordingly. To do this run

./RunPlotMaker -config ./configs/plot.config -MJScales

If you are making a file with full systematics you will want to run -MJScales multiple times, once for each of the systematics you want to scale. For example run

./RunPlotMaker -config ./configs/MyFirstSystematic.config -MJScales

where in the config file you replace the MCMJ, DataElMJ and DataMuMJ files with their systematic versions.

(Note: To make a basic set of plots now do ./RunPlotMaker -config ./configs/plot.config )
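Running -MJScales once per systematic can be scripted with a small loop. This is only a sketch: run_mj_scales is a hypothetical helper, and the config files passed to it are whichever systematic configs you have prepared yourself. It stops at the first run that fails.

```shell
# Hypothetical helper: run the -MJScales step once per config file.
#   $1          = path to the RunPlotMaker executable
#   $2, $3, ... = config files, one per systematic
run_mj_scales() {
    runner="$1"
    shift
    for cfg in "$@"; do
        "$runner" -config "$cfg" -MJScales || return 1
    done
}
```

For example: run_mj_scales ./RunPlotMaker ./configs/plot.config ./configs/MyFirstSystematic.config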
Once the scale factors have been calculated, run

./RunPlotMaker -config ./configs/plot.config -CheckForFiles

to make sure all the required files are present. (Note: This will not check that the files are of a reasonable size, so if something went wrong but a file was returned empty it will think nothing is wrong.) Next we do

source scripts/tar_area.sh SomeName_v0.0_LimitFile
source scripts/submitInputFile.sh plot.config datestamp_SomeName_v0.0_LimitFile

This tars the files and sends them off to the batch system to scale the plots using the calculated scale factors, writing a lot of files to your work directory to be used for making a limit file. (Note: You do not need configs/plot.config, just plot.config, as the bash script that untars the file on the batch will take care of prepending configs/.) When this is done you can do

./RunPlotMaker -config ./configs/plot.config -NominalLimitFile
./RunPlotMaker -config ./configs/plot.config -LimitFile

This will produce a limit file.

A little about the plot.config file

The plot.config file will look something like the following

VersionMC 20140515_LimitFile_v5.3
TagMJ True
CDI_CALIB ttbar_all
DataType Data12
YearName 2012
#ScaleFitRegion MJFit
ScaleFitRegion 2BTag
#ScaleFitRegion MJFit2BTag
ScaleFitVar met
#ScaleFitVar mtw
#ScaleFitVar lep1Pt
UseEOS True
#OutTag 2BTagScales
Blind True
# MergedLepton, Electron or Muon
LepFlavour MergedLepton
SplitCharge False
# Directory to store input plot outputs - this could be made nicer so you don't need to input the username
FitDirectory /afs/cern.ch/work/p/pmullen/analysis/FitInputs/
#DataMETFile
#DataMETMJFile
DataElFile AnalysisManager.data12_8TeV.OneLepton.Electron.Hists.root
DataElMJFile AnalysisManager.data12_8TeV.OneLepton.Electron.Hists.MJ.SysMJShapeDo.root
DataMuFile AnalysisManager.data12_8TeV.OneLepton.Muon.Hists.root
DataMuMJFile AnalysisManager.data12_8TeV.OneLepton.Muon.Hists.MJ.root
MCFile AnalysisManager.mc12_8TeV_p1328.OneLepton.Hists.root
MCMJFile AnalysisManager.mc12_8TeV_p1328.OneLepton.Hists.MJ.root
LimitFileVersion v5.3

If you want to look at the MJ template and its electroweak contamination, you can make DataElFile and DataElMJFile be the MJ template, comment out the MCMJ file and make the MC file the MC MJ template.
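Since -CheckForFiles does not look at file sizes, the input files named in plot.config can be checked by hand. The sketch below is not part of the package: both function names are hypothetical, it assumes plot.config holds one "Key Value" pair per line with # starting a comment, and it assumes the files sit in a local directory (not on EOS).

```shell
# Hypothetical helpers, not part of the analysis package.

# Print the value stored for key $1 in config file $2 (first match only).
config_value() {
    awk -v k="$1" '$1 == k { print $2; exit }' "$2"
}

# For every uncommented key ending in "File" in config file $1, report the
# entry if the named file is missing or empty in directory $2.
# Returns non-zero if any problem was found.
check_plot_inputs() {
    dir="$2"
    awk '$1 !~ /^#/ && $1 ~ /File$/ && NF >= 2 { print $1, $2 }' "$1" | (
        status=0
        while read -r key file; do
            # -s is true only for files that exist and are not empty.
            if [ ! -s "$dir/$file" ]; then
                echo "missing or empty: $key -> $dir/$file"
                status=1
            fi
        done
        exit "$status"
    )
}
```

For example, check_plot_inputs ./configs/plot.config $HOME/out lists any of the Data and MC histogram files that did not come back properly.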