Difference: HiggsAnalysisAtATLASUsingRooStats (36 vs. 37)

Revision 372012-01-18 - WilliamBreadenMadden

Line: 1 to 1
-- WilliamBreadenMadden - 2012-01-13
-- WilliamBreadenMadden - 2012-01-18
Line: 14 to 14
  RooStats is a project to create statistical tools built on top of the RooFit library, which is a data modelling toolkit. It is distributed in ROOT. Specifically, it has been distributed in the ROOT release since version 5.22 (December 2008). The latest version of ROOT is recommended as the RooStats project developes quickly.

Using the appropriate version of ROOT at Glasgow


Using ROOT on the Glasgow PPELX network

  There are instructions on how to use the different versions of ROOT at Glasgow here.
Line: 23 to 23
  root -v -b

--++ Using ROOT on the CERN LXPLUS network

Execute the following commands in order to set up ROOT version 5.32.00 on LXPLUS:

. /afs/cern.ch/sw/lcg/external/gcc/4.3.2/x86_64-slc5/setup.sh
cd /afs/cern.ch/sw/lcg/app/releases/ROOT/5.32.00/x86_64-slc5-gcc43-opt/root/
. bin/thisroot.sh

Setting up RooStats

There are three main options available for acquiring ROOT with RooStats included.

Line: 859 to 869
  ATLAS recommends the use of the profile likelihood as a test statistic.

Full significance calculation example, from histogram creation to model production (using hist2workspace), to RooStats calculations

Prepare a working area.

Make the main directory.

cd ~
mkdir test
cd test

Make the directory for the XML configuration files.

mkdir config

Make the directory for the input histogram ROOT files.

mkdir data

Make the directory for the workspace.

mkdir workspaces

Generate input histograms.

Change to the directory for the input histograms.

cd data

Run some code such as the following. For fun, we will create a simulated data signal at about 126 GeV at about three times the number of events of that which was expected.

Example file: make_test_histograms.c

   // Create the expected signal histogram.
      // Create the function used to describe the signal shape (a simple Gaussian shape).
         TF1 mySignalFunction("mySignalFunction", "(1/sqrt(2*pi*0.5^2))*2.718^(-(x-126)^2/(2*0.5^2))", 120, 130);
      // Create the histogram with 100 bins between 120 and 130 GeV.
         TH1F mySignal("mySignal", "mySignal", 100, 120, 130);
      // Fill the histogram using the signal function.
         mySignal.FillRandom("mySignalFunction", 10000000);
   // Create the background histogram.
      // Create the function used to describe the signal shape (a simple polynomial of the second order).
         TF1 myBackgroundFunction("myBackgroundFunction", "-2.3*10^(-6)*x^2+0.0007*x+0.4", 120, 130);
      // Create the histogram with 100 bins between 120 and 130 GeV.
         TH1F myBackground("myBackground", "myBackground", 100, 120, 130);
      // Fill the histogram using the background function.
         myBackground.FillRandom("myBackgroundFunction", 100000000000);
   // Create the (simulated) data histogram. This histogram represents what one might have as real data.
      // Create the histogram using by combining the signal histogram multiplied by 3 with the background histogram.
         TH1F myData=3*mySignal+myBackground;
      // Set the name of the histogram.
   // Save the histograms created to a ROOT file.
      TFile myFile("test_histograms.root", "RECREATE");

Create the workspace.

There are two approaches which might be used to create the RooFit workspace. The first method uses the hist2workspace program to run on XML configuration files and input histogram files. The second method explicitly builds the workspace using RooFit code. In general, most should opt for the first, XML-based approach. If you wish to understand the process of specifying the specifics of the models you want to create, you might take a look at the second approach.

Option 1: Create the workspace using hist2workspace and XML configuration files.

Change to the directory for the configuration XML files.

cd ~/test/config

Copy the HistFactory XML schema to the configuration directory.

cp $ROOTSYS/etc/HistFactorySchema.dtd .

Create XML configuration files in the configuration directory.

Example file: test_top-level.xml

<!DOCTYPE Combination SYSTEM 'HistFactorySchema.dtd'>

<!-- workspace output file prefix -->
<Combination OutputFilePrefix="./workspaces/test_workspace" Mode="comb" >

   <!-- channel XML file(s) -->

   <!-- measurement and bin range -->
   <Measurement Name="datastat" Lumi="1" LumiRelErr="0.037" BinLow="0" BinHigh="99" Mode="comb" ExportOnly="True">


Example file: test_channel.xml

<!DOCTYPE Channel SYSTEM 'HistFactorySchema.dtd'>

<!-- channel name and input file -->
<Channel Name="test" InputFile="./data/test_histograms.root" HistoName="">
   <!-- data -->
   <Data HistoName="myData" />

   <!-- signal -->
   <Sample Name="signal" HistoName="mySignal"
      NormalizeByTheory="True" >
      <NormFactor Name="SigXsecOverSM" Val="1.0" Low="0.0" High="100." Const="True" />

   <!-- backgrounds -->
   <!-- background -->
   <Sample Name="background" NormalizeByTheory="True" HistoName="myBackground">


As you can see, the model is very simple. There is the expected signal, the expected background and the actual data (in this case, the data is simulated). The point of interest relates to the signal and the expected data will be compared to the actual data.

Run hist2workspace.

hist2workspace config/test_top-level.xml

There should now be created 4 files in the workspaces directory:


The test_workspace_combined_datastat_model.root file is the ROOT file that contains the combined workspace object (is has constraints, weightings etc. properly incorporated). This is the file you are almost certainly interested in. The test_workspace_test_datastat_model.root file contains a workspace object without proper constraints, weightings etc. I don't know what the other two files are.

Use the ProfileLikelihoodCalculator to calculate the 95% condifence interval on the parameter of interest as specified in the ModelConfig.

Create a C++ program such as the following:

Example file: ProfileLikeliHoodCalculator_confidence_level.cpp

#include <iostream>
#include <fstream>
#include <sstream>
#include <stdio.h>
#include <string.h>
#include <cmath>
#include "TStyle.h"
#include "TROOT.h"
#include "TPluginManager.h"
#include "TSystem.h"
#include "TFile.h"
#include "TGaxis.h"
#include "TCanvas.h"
#include "TH1.h"
#include "TF1.h"
#include "TLine.h"
#include "TSpline.h"
#include "RooAbsData.h"
#include "RooDataHist.h"
#include "RooCategory.h"
#include "RooDataSet.h"
#include "RooRealVar.h"
#include "RooAbsPdf.h"
#include "RooSimultaneous.h"
#include "RooProdPdf.h"
#include "RooNLLVar.h"
#include "RooProfileLL.h"
#include "RooFitResult.h"
#include "RooPlot.h"
#include "RooRandom.h"
#include "RooMinuit.h"
#include "TRandom3.h"
#include "RooWorkspace.h"
#include "RooStats/RooStatsUtils.h"
#include "RooStats/ModelConfig.h"
#include "RooStats/ProfileLikelihoodCalculator.h"
#include "RooStats/LikelihoodInterval.h"
#include "RooStats/LikelihoodIntervalPlot.h"
#include "TStopwatch.h"
using namespace std;
using namespace RooFit;
using namespace RooStats;

int main(){
   // Access the inputs.
      // Open the ROOT workspace file.
         TString myInputFileName = "workspaces/test_workspace_combined_datastat_model.root";
         cout << "Opening file " << myInputFileName << "..." << endl;
         TFile *_file0 = TFile::Open(myInputFileName);
      // Access the workspace.
         cout << "Accessing workspace..." << endl;
         RooWorkspace* myWorkspace = (RooWorkspace*) _file0->Get("combined");
      // Access the ModelConfig
         cout << "Accessing ModelConfig..." << endl;      
         ModelConfig* myModelConfig = (ModelConfig*) myWorkspace->obj("ModelConfig");
      // Access the data.
         cout << "Accessing data..." << endl;
         RooAbsData* myData = myWorkspace->data("obsData");

   // Use the ProfileLikelihoodCalculator to calculate the 95% confidence interval on the parameter of interest as specified in the ModelConfig.
      cout << "Calculating profile likelihood...\n" << endl;
      ProfileLikelihoodCalculator myProfileLikelihood(*myData, *myModelConfig);
      LikelihoodInterval* myConfidenceInterval = myProfileLikelihood.GetInterval();
      // Access the confidence interval on the parameter of interest (POI).
         RooRealVar* myPOI = (RooRealVar*) myModelConfig->GetParametersOfInterest()->first();
   // Print the results.
      cout << "Printing results..." << endl;
      // Print the confidence interval on the POI.
         cout << "\n95% confidence interval on the point of interest " << myPOI->GetName()<<": ["<<
            myConfidenceInterval->LowerLimit(*myPOI) << ", "<<
            myConfidenceInterval->UpperLimit(*myPOI) <<"]\n"<<endl;
   return 0;

Compile this code using a Makefile such as the following:

Example file: Makefile

ProfileLikeliHoodCalculator_confidence_level.cpp : ProfileLikeliHoodCalculator_confidence_level.cpp
   g++ -g -O2 -fPIC -Wno-deprecated  -o ProfileLikeliHoodCalculator_confidence_level.cpp ProfileLikeliHoodCalculator_confidence_level.cpp `root-config --cflags --libs --ldflags` -lHistFactory -lXMLParser -lRooStats -lRooFit -lRooFitCore -lThread -lMinuit -lFoam -lHtml -lMathMore -I$ROOTSYS/include -L$ROOTSYS/lib

Compile the code.



The end result as displayed in the terminal output is the following:

95% confidence interval on the point of interest SigXsecOverSM: [2.99653, 3.00347]

Further information

Links for ROOT

This site is powered by the TWiki collaboration platform Powered by PerlCopyright © 2008-2023 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback