Introduction
This page documents the setup of the Higgsbb analysis code used at
CERN.
An overview of the software framework can be found
here.
The CERN twiki page for
Diboson studies
may also be useful.
(
Note: If you encounter a problem at any stage while setting up an environment or building code, before trying anything else run
make clean
in every directory where you have run
make
and then open a clean shell and retry.)
Running Mode
The code can be run in two modes: locally, or using PROOF (Parallel ROOT Facility). PROOF allows a large number of ROOT files to be analysed in parallel using multiple machines or processor cores.
Running Locally
Code Checkout
The
code
can be
checked out
from the correct repository by doing:
mkdir myVH_research
cd myVH_research
svn co svn+ssh://svn.cern.ch/reps/atlasusr/mbellomo/ElectroweakBosons/trunk
Note: you may want to set up
Kerberos
first. (I did not need the
GSSAPITrustDNS yes line.)
Compiling And Using The Code
(
Note: for more complete instructions see the README file in the trunk subdirectory)
(1) To set up the environment:
cd myVH_research/trunk
source scripts/setup_lxplus.sh
This will set up the required environment variables, the GNU C++ compiler and ROOTCORE.
(2) To download and compile any extra software required, do
(
Note: If the extra code fails to compile because of the MET package, you need to remove the code and run the 2012 versions of all the scripts.)
./scripts/get_allcode.sh
cd SFrame
source setup.sh
cd ..
(3) Next compile the software by doing
make
(You can also clean up by doing
make clean
and
make distclean
)
(4) The code for analysing H->bb decays is stored in the
AnalysisWZorHbb/
directory. We need to compile this code before we can run the analysis.
cd AnalysisWZorHbb
make
Now we can run our analysis by doing
sframe_main config/WZorHbb_config_mc_nu_2011.xml
for example. The .xml file is where the input files and settings are all defined (e.g. which corrections to apply to the data); it can be modified or rewritten to suit the needs of the analysis. For example, an H->bb analysis involving zero leptons that produces flat ntuples needs the following lines.
<UserConfig>
&setup;
<Tool Name="BaselineZeroLepton" Class="WZorHbb">
&tool_base;
<Item Name="runNominal" Value = "True"/>
<Item Name="ntupleName" Value = "Ntuple"/>
</Tool>
</UserConfig>
(
Note: every time you open a shell to run the code you need to repeat steps (1), (3) and (4), i.e. set up the environment and make the code. It is also a good idea to run
make clean
before you recompile anything.)
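The per-shell steps above can be collected into one function. This is a minimal sketch, not part of the analysis code: the name vh_setup and the MAKE_CMD override are hypothetical, and the default path assumes the checkout lives in $HOME/myVH_research as in the checkout step.

```shell
#!/usr/bin/env bash
# vh_setup is a hypothetical helper; source this file, do not execute it,
# because the environment must be set up in the current shell.
vh_setup() {
    local trunk="${1:-$HOME/myVH_research/trunk}"
    local make_cmd="${MAKE_CMD:-make}"   # hypothetical override, useful for dry runs

    cd "$trunk" || { echo "cannot cd to $trunk" >&2; return 1; }

    # Step (1): set up the environment.
    [ -f scripts/setup_lxplus.sh ] || { echo "setup script not found" >&2; return 1; }
    source scripts/setup_lxplus.sh

    # Step (3): compile the common software.
    "$make_cmd" || return 1

    # Step (4): compile the H->bb analysis code.
    cd AnalysisWZorHbb || return 1
    "$make_cmd" || return 1
}
```

After sourcing the file, calling vh_setup with no arguments should leave the shell in the AnalysisWZorHbb directory, ready for sframe_main.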
Producing Plots
The output from running sframe is stored in .root files as common ntuples. To produce plots from these output files we need to use
HistMaker
by doing
cd macros
make
to make the required code. To run the code we use the command
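./RunHistMaker <path to config file> <runMJ> <path to files output by sframe> <output directory>
Here <runMJ> is true or false; setting it to true produces the multijet ("MJ") histogram files used later for stacked plots.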
e.g.
./RunHistMaker configs/zerolepton.config false $HOME/out/AnalysisManager.mc11_7TeV.ZeroLepton.Nominal.root ./
This will output a .root file containing histograms.
If
RunHistMaker crashes saying it did not find the cross-section (xsec) for a given number (corresponding to a dataset), you may need to edit the function
InitCrossSections2011()
in the file
macros/src/process.cpp
. You will need to add a line of the form
allXSecs.push_back(xsection(dataset, cross-section, k factor, 1, "sample name"));
(The meaning of the fourth argument, 1, has not yet been established.)
e.g.
allXSecs.push_back(xsection(109351, 4.0638E-05, 1, 1,"ZHnunu"));
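Before editing the function, it can help to check whether a dataset number already has an entry. The sketch below assumes entries are written as xsection(<dataset>, ...) as in the example above; has_xsec is a hypothetical helper name, not part of the analysis code.

```shell
#!/usr/bin/env bash
# has_xsec is a hypothetical helper: it succeeds if the given dataset number
# appears as the first argument of an xsection(...) call in the source file.
has_xsec() {
    local dataset="$1"
    local src="${2:-macros/src/process.cpp}"
    # Match "xsection(" followed by the exact dataset number and a comma.
    grep -Eq "xsection\([[:space:]]*${dataset}[[:space:]]*," "$src"
}
```

For example, has_xsec 109351 || echo "109351 needs an entry" could be run from the trunk directory.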
ATLAS twiki pages can be used to find the cross-sections for
7 TeV data,
8 TeV data
and for
general datasets.
For 2012 data,
InitCrossSection2012()
pulls the information from a file called
configs/2012_cross_sections.txt
, which can be edited to add the information for your particular dataset.
Stacking Plots
Stacked plots can be produced using the
RunPlotMaker.C program. (This uses the plotter.cpp class.) The current implementation of the program expects input files to follow a naming convention that contains "OneLepton" in the ROOT file name. This needs to be changed depending on what you had in your .xml file. It also has hard-coded links to a home directory that need to be changed. The systematics etc. that you want to run are chosen with command-line flags. My current version of the edited
RunPlotMaker.C can be found
here. It has been edited so that you can give it the path to a folder containing your files as a command-line argument. It can be run as follows:
./RunPlotMaker -Data12 -llbb -NominalLimitFile -Indir $HOME/out/
It expects to find the following six files in the $HOME/out/ folder:
AnalysisManager.data12_8TeV.OneLepton.Electron.Hists.MJ.root
AnalysisManager.data12_8TeV.OneLepton.Electron.Hists.root
AnalysisManager.data12_8TeV.OneLepton.Muon.Hists.MJ.root
AnalysisManager.data12_8TeV.OneLepton.Muon.Hists.root
AnalysisManager.mc12_8TeV.OneLepton.Nominal.Hists.MJ.root
AnalysisManager.mc12_8TeV.OneLepton.Nominal.Hists.root
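A quick check that all six files are present before running the plot maker can save a failed run. This is a sketch only; check_plotmaker_inputs is a hypothetical helper name and the default directory is the $HOME/out folder used in the example.

```shell
#!/usr/bin/env bash
# check_plotmaker_inputs is a hypothetical helper, not part of the analysis code.
# It prints each expected input file that is missing from the given directory
# and returns non-zero if any are missing.
check_plotmaker_inputs() {
    local indir="${1:-$HOME/out}"
    local missing=0
    local f
    for f in \
        AnalysisManager.data12_8TeV.OneLepton.Electron.Hists.MJ.root \
        AnalysisManager.data12_8TeV.OneLepton.Electron.Hists.root \
        AnalysisManager.data12_8TeV.OneLepton.Muon.Hists.MJ.root \
        AnalysisManager.data12_8TeV.OneLepton.Muon.Hists.root \
        AnalysisManager.mc12_8TeV.OneLepton.Nominal.Hists.MJ.root \
        AnalysisManager.mc12_8TeV.OneLepton.Nominal.Hists.root
    do
        if [ ! -f "$indir/$f" ]; then
            echo "missing: $indir/$f"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}
```

It could be chained with the plot maker, e.g. check_plotmaker_inputs "$HOME/out" && ./RunPlotMaker -Data12 -llbb -NominalLimitFile -Indir $HOME/out/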
Note the "OneLepton" in all the file names. This is because I have yet to change the hard-coded file name format in the script. The "MJ" (multijet) files were produced by the
RunHistMaker program by changing the
RunMJ
parameter to
True
. If you do not want to rename your files before using this program, simply edit your sframe .xml config file as follows:
<InputData Lumi="1.0" NEventsMax="-1" Type="mc12_8TeV" Version="OneLepton.Nominal" Cacheable="True" SkipValid="True">
<!-- input data files here -->
<InputTree Name="physics" />
<OutputTree Name="Ntuple" />
</InputData>
For the data files change the
Type
to "data12_8TeV" and the
Version
to "OneLepton.Muon" or "OneLepton.Electron". This does not change the analysis type, only the output file name.
Running Using PROOF on Demand (PoD)
PROOF is a framework for running many ROOT analyses in parallel. It can be run locally using PROOF-Lite, which utilises multiple cores on one machine, or on many machines using the
PoD framework.
Installing and Building
To install PROOF on Demand, we must first get and unpack the code.
cd $HOME
wget http://pod.gsi.de/releases/pod/3.10/PoD-3.10-Source.tar.gz
tar -xzvf PoD-3.10-Source.tar.gz
To build the code do
cd $HOME/PoD-3.10-Source
mkdir build
cd build
cmake -C ../BuildSetup.cmake ..
make
make install
cd
cd myVH_research/trunk
source scripts/setup_pod_lxplus.sh
cd
cd PoD/3.10/
source PoD_env.sh
To change the working directory that
PoD uses, along with many other settings, edit $HOME/.PoD/PoD.cfg
(
Note: Those using an AFS directory
need
to change the
[server] work_dir
parameter to something like /tmp/$USER/PoD.)
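The work_dir change can also be scripted. The sketch below assumes PoD.cfg is an INI-style file in which the [server] section contains a line of the form work_dir=...; that layout and the helper name set_pod_workdir are assumptions, so check your own PoD.cfg before relying on it.

```shell
#!/usr/bin/env bash
# set_pod_workdir is a hypothetical helper. It assumes an INI-style PoD.cfg
# with a "[server]" section that contains a "work_dir=..." line.
# The new directory must not contain the characters "|" or "&".
set_pod_workdir() {
    local new_dir="$1"
    local cfg="${2:-$HOME/.PoD/PoD.cfg}"
    local tmp
    tmp="$(mktemp)" || return 1
    # Rewrite work_dir only between "[server]" and the next section header,
    # so work_dir entries in other sections are left alone.
    sed "/^\[server\]/,/^\[/ s|^work_dir[[:space:]]*=.*|work_dir=${new_dir}|" "$cfg" > "$tmp" \
        && cat "$tmp" > "$cfg"
    local status=$?
    rm -f "$tmp"
    return "$status"
}
```

Passing the path in single quotes, e.g. set_pod_workdir '/tmp/$USER/PoD', writes $USER literally into the file rather than expanding it in the calling shell.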
Now create a file called
$HOME/.PoD/user_worker_env.sh
and put the line
source /afs/cern.ch/sw/lcg/contrib/gcc/4.3.5/x86_64-slc5-gcc43-opt/setup.sh
in it.
Then create a file called
$HOME/.PoD/user_xpd.cf
and put the line
xpd.rootd allow
in it.
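The two files described above can be created in one go. This is a minimal sketch; write_pod_user_files is a hypothetical helper name, and note that it overwrites any existing copies of the two files.

```shell
#!/usr/bin/env bash
# write_pod_user_files is a hypothetical helper. It writes the two one-line
# files described above into the PoD directory (default: $HOME/.PoD),
# overwriting any existing copies.
write_pod_user_files() {
    local pod_dir="${1:-$HOME/.PoD}"
    mkdir -p "$pod_dir" || return 1

    # Worker environment: set up the compiler on each worker node.
    printf '%s\n' \
        'source /afs/cern.ch/sw/lcg/contrib/gcc/4.3.5/x86_64-slc5-gcc43-opt/setup.sh' \
        > "$pod_dir/user_worker_env.sh" || return 1

    # Extra xproofd configuration line.
    printf '%s\n' 'xpd.rootd allow' > "$pod_dir/user_xpd.cf"
}
```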
DEPRECATED?----------------------------------------------------------------------------------------------------------
Now create a file called
$HOME/.PoD/user_worker_env.sh
and put the following settings in it
#! /usr/bin/env bash
echo "Setting user environment for workers ..."
export LD_LIBRARY_PATH=/afs/cern.ch/sw/lcg/external/qt/4.4.2/x86_64-slc5-gcc43-opt/lib:\
/afs/cern.ch/sw/lcg/external/Boost/1.44.0_python2.6/x86_64-slc5-gcc43-opt//lib:\
/afs/cern.ch/sw/lcg/app/releases/ROOT/5.30.01/x86_64-slc5-gcc43-opt/root/lib:\
/afs/cern.ch/sw/lcg/contrib/gcc/4.3.5/x86_64-slc5-gcc34-opt/lib64:\
/afs/cern.ch/sw/lcg/contrib/mpfr/2.3.1/x86_64-slc5-gcc34-opt/lib:\
/afs/cern.ch/sw/lcg/contrib/gmp/4.2.2/x86_64-slc5-gcc34-opt/lib
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
END---------------------------------------------------------------------------------------------------------------------------------------
Using PROOF
We now have PROOF installed, but before doing any analysis we must set up the correct environment. After installing PROOF, it is advised that you start from a clean shell and then do
cd myVH_research/trunk
source scripts/setup_lxplus.sh
source scripts/setup_pod_lxplus.sh
make
cd AnalysisWZorHbb
make
To start the PROOF server we can do
pod-server start
. You can now request worker nodes using the
pod-submit
command. For example, to request 20 worker nodes we would do
pod-submit -r lsf -q 1nh -n 20
where
1nh
is the name of the queue we are submitting the jobs to and
lsf
is the name of the resource management system.
Now that we have set up our environment and requested worker nodes, we can use the nodes for our analysis. To run the analysis we use the same command as before:
sframe_main config/proof_WZorHbb_config_mc_nu_2011.xml
(
Note: It is advisable to use
nohup sframe_main config/proof_WZorHbb_config_mc_nu_2011.xml &> output.txt &
so that a terminal crash will not kill your job.)
However, our .xml file will look different from the one used for the non-PROOF analysis. We must change the
RunMode
option to
RunMode="PROOF"
and the
ProofServer
option must also be changed. It must look like
ProofServer="username@host:portnumber"
where the username, host and port number are those output by the
pod-server start
command or by
pod-info -c
. For example
ProofServer="pmullen@lxplus438.cern.ch:21002"
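Rather than editing the config by hand for each new server, a PROOF copy of a local config can be generated. The sketch below assumes the input .xml already contains RunMode="..." and ProofServer="..." attributes in that exact form; make_proof_config is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# make_proof_config is a hypothetical helper. It copies an sframe .xml config,
# setting RunMode to PROOF and ProofServer to the given connection string.
# It assumes both attributes already exist in the input file, and that the
# server string does not contain the characters "|" or "&".
make_proof_config() {
    local in_xml="$1"
    local out_xml="$2"
    local server="$3"    # e.g. the value printed by pod-info -c
    sed -e 's|RunMode="[^"]*"|RunMode="PROOF"|' \
        -e "s|ProofServer=\"[^\"]*\"|ProofServer=\"${server}\"|" \
        "$in_xml" > "$out_xml"
}
```

One possible use is make_proof_config config/WZorHbb_config_mc_nu_2011.xml config/proof_WZorHbb_config_mc_nu_2011.xml "$(pod-info -c)", assuming pod-info -c prints only the connection string.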