Difference: HiggsAnalysisAtATLASUsingRooStats (1 vs. 51)

Revision 51 2016-05-06 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Deleted:
<
<
-- Will Breaden Madden - 2016-01-25
 

Higgs analysis at ATLAS using RooStats

Changed:
<
<
This page contains basic information on getting started with Higgs analysis at ATLAS using RooStats.
>
>
This page contains basic information on getting started with Higgs analysis at ATLAS using RooStats. Only the most meager of attempts is made to keep this documentation current.
 
Line: 14 to 12
  RooStats is a project to create statistical tools built on top of the RooFit library, which is a data-modelling toolkit. It is distributed in ROOT. Specifically, it has been distributed in the ROOT release since version 5.22 (December 2008). The latest version of ROOT is recommended as the RooStats project develops quickly.
Deleted:
<
<

using ROOT on the Glasgow PPELX network

There are instructions on how to use the different versions of ROOT at Glasgow here.

Execute commands something like the following in order to set up ROOT version 5.32.00 on PPELX:

%CODE{"bash"}% export ROOTSYS=/data/ppe01/sl5x/x86_64/root/5.32.00 export PATH=$ROOTSYS/bin:$PATH export LD_LIBRARY_PATH=$ROOTSYS/lib/root:$LD_LIBRARY_PATH %ENDCODE%

using ROOT on the CERN LXPLUS network

Execute commands something like the following in order to set up ROOT version 5.32.00 on LXPLUS:

%CODE{"bash"}% . /afs/cern.ch/sw/lcg/external/gcc/4.3.2/x86_64-slc5/setup.sh cd /afs/cern.ch/sw/lcg/app/releases/ROOT/5.32.00/x86_64-slc5-gcc43-opt/root/ . bin/thisroot.sh %ENDCODE%

 

setting up RooStats

There are three main options available for acquiring ROOT with RooStats included.

Line: 208 to 184
  %CODE{"c++"}% // general form for defining a RooFit variable:
Changed:
<
<
RooRealVar x(<object name>, <object title>, <value>, <minimum value>, <maximum value>)
>
>
RooRealVar x(<object name>, <object title>, <value>, <minimum value>, <maximum value>)
 // specific example for defining a RooFit variable x with the value 5: RooRealVar x("x", "x observable", 5, -10, 10) %ENDCODE%
Line: 225 to 201
  What is the value of this? In a nutshell, it allows one to do Bayesian stuff very easily.
Changed:
<
<
Bayes' theorem holds that the probability of b given a is related to the probability of a given b. Normally, one might say that the mean of a Gaussian is 1 and that then gives a distribution for x. However, if one had a dataset for x, one might want to know the posterior for the mean. The RooFit ability to switch what the probability density function is of is very useful for Bayesian stuff. Because RooFit allows this composition, the thing that acts as x in the Gaussian could actually be a function of, say, x^{2}. A Jacobian factor is picked up in going from x to x^{2}, so the normalisation no longer makes sense. Sometimes RooFit can figure out what the Jacobian factor should be and sometimes it resorts to numerical integration.
>
>
Bayes' theorem relates the probability of b given a to the probability of a given b. Normally, one might fix the mean of a Gaussian at 1, which then gives a distribution for x. However, given a dataset for x, one might instead want to know the posterior for the mean. RooFit's ability to switch which variable the probability density function is a density of is very useful for Bayesian calculations of this kind. Because RooFit allows this composition, the thing that acts as x in the Gaussian could itself be a function of x, say x^{2}. A Jacobian factor is picked up in going from x to x^{2}, so the original normalisation no longer holds. Sometimes RooFit can work out what the Jacobian factor should be and sometimes it resorts to numerical integration.
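
The loss of normalisation described above can be illustrated without ROOT. The following is a minimal standard-library C++ sketch, not RooFit code: the function names (`gaussian`, `integrate`, `normInX`, `normInXSquared`) are illustrative, and the choice of mean 0, width 1 and integration range [-10, 10] is an assumption made for the example. It integrates a unit Gaussian over x, and then integrates the same Gaussian evaluated at x^{2}.

```cpp
#include <cmath>
#include <functional>

// Gaussian density in its argument u, with mean mu and width sigma.
double gaussian(double u, double mu, double sigma) {
    const double pi = std::acos(-1.0);
    const double z = (u - mu) / sigma;
    return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * pi));
}

// Composite Simpson rule on [a, b] using n intervals (n must be even).
double integrate(const std::function<double(double)>& f,
                 double a, double b, int n) {
    const double h = (b - a) / n;
    double sum = f(a) + f(b);
    for (int i = 1; i < n; ++i) {
        // Interior points alternate between weight 4 (odd) and 2 (even).
        sum += f(a + i * h) * (i % 2 == 0 ? 2.0 : 4.0);
    }
    return sum * h / 3.0;
}

// Integral over x of the Gaussian evaluated at x: properly normalised.
double normInX() {
    return integrate([](double x) { return gaussian(x, 0.0, 1.0); },
                     -10.0, 10.0, 2000);
}

// Integral over x of the same Gaussian evaluated at x^2: not normalised.
double normInXSquared() {
    return integrate([](double x) { return gaussian(x * x, 0.0, 1.0); },
                     -10.0, 10.0, 2000);
}
```

Here `normInX()` comes out at about 1, whereas `normInXSquared()` comes out at about 0.86, so the composed function has to be divided by that integral before it can be used as a density in x. This is the kind of renormalisation that RooFit takes care of when PDFs are composed.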
 

example code: create a Gaussian PDF using RooStats and plot it using the RooPlot class

Line: 240 to 216
  // Plot the PDF. RooPlot* xframe = x.frame(); gauss.plotOn(xframe);
Changed:
<
<
xframe->Draw();
>
>
xframe->Draw();
 } %ENDCODE%
Line: 284 to 260
 %CODE{"c++"}% RooPlot* myFrame = x.frame() ; myData.plotOn(myFrame, Binning(25)) ;
Changed:
<
<
myFrame->Draw()
>
>
myFrame->Draw()
 %ENDCODE%

importing data from ROOT trees (how to populate RooDataSets from TTrees)
Line: 306 to 282
  // Access the file. TFile* myFile = new TFile("myFile.root"); // Load the histogram.
Changed:
<
<
TH1* myHistogram = (TH1*) myFile->Get("myHistogram");
>
>
TH1* myHistogram = (TH1*) myFile->Get("myHistogram");
  // Draw the loaded histogram. myHistogram->Draw();
Line: 351 to 327
 %CODE{"c++"}% // Create a Gaussian PDF using the Workspace Factory (this is essentially the shorthand for creating a Gaussian). RooWorkspace* myWorkspace = new RooWorkspace("myWorkspace");
Changed:
<
<
myWorkspace->factory("Gaussian::g(x[-5, 5], mu[0], sigma[1]");
>
>
myWorkspace->factory("Gaussian::g(x[-5, 5], mu[0], sigma[1])");
 %ENDCODE%

What's in the RooFit workspace?

Line: 362 to 338
 // Open the appropriate ROOT file. root -l myFile.root // Import the workspace.
Changed:
<
<
myWorkspace = (RooWorkspace*) _file0->Get("myWorkspace");
>
>
myWorkspace = (RooWorkspace*) _file0->Get("myWorkspace");
 // Print the workspace contents. myWorkspace->Print(); // Example printout:
Line: 478 to 454
 // myFileName = "BR5_MSSM_signal90_combined_datastat_model.root" // TFile *myFile = TFile::Open(myFileName); // Import the workspace.
Changed:
<
<
RooWorkspace* myWorkspace = (RooWorkspace*) _file0->Get("combined");
>
>
RooWorkspace* myWorkspace = (RooWorkspace*) _file0->Get("combined");
 // Print the workspace contents.
Changed:
<
<
myWorkspace->Print();
>
>
myWorkspace->Print();
 // Import the PDF.
Changed:
<
<
RooAbsPdf* myPDF = myWorkspace->pdf("model_BR5_MSSM_signal90");
>
>
RooAbsPdf* myPDF = myWorkspace->pdf("model_BR5_MSSM_signal90");
 // Import the variable representing the observable.
Changed:
<
<
RooRealVar* myObservable = myWorkspace->var("obs");
>
>
RooRealVar* myObservable = myWorkspace->var("obs");
 // Create a RooPlot frame using the imported variable. RooPlot* myFrame = myObservable->frame(); // Plot the PDF on the created RooPlot frame. myPDF->plotOn(myFrame); // Draw the RooPlot.
Changed:
<
<
myFrame->Draw();
>
>
myFrame->Draw();
 %ENDCODE%

example code: accessing both data and PDF from a workspace stored in a file
Line: 502 to 478
 TFile myFile("myResults.root") ; RooWorkspace* myWorkspace = (RooWorkspace*) myFile.Get("myWorkspace") ; // Plot the data and PDF
Changed:
<
<
RooPlot* xframe = w->var("x")->frame() ; w->data("d")->plotOn(xframe) ; w->pdf("g")->plotOn(xframe) ;
>
>
RooPlot* xframe = myWorkspace->var("x")->frame() ; myWorkspace->data("d")->plotOn(xframe) ; myWorkspace->pdf("g")->plotOn(xframe) ;
 // Construct a likelihood and profile likelihood
Changed:
<
<
RooNLLVar nll("nll","nll",*myWorkspace->pdf("g"),*w->data("d")) ; RooProfileLL pll("pll","pll", nll,*myWorkspace->var("m")) ; RooPlot* myFrame = w->var("m")->frame(-1,1) ;
>
>
RooNLLVar nll("nll","nll",*myWorkspace->pdf("g"),*myWorkspace->data("d")) ; RooProfileLL pll("pll","pll", nll,*myWorkspace->var("m")) ; RooPlot* myFrame = myWorkspace->var("m")->frame(-1,1) ;
 pll.plotOn(myFrame) ;
Changed:
<
<
myFrame->Draw()
>
>
myFrame->Draw() ;
 %ENDCODE%

links for RooFit

Line: 535 to 511
  // A 95% confidence interval test is run using the ProfileLikelihoodCalculator of RooStats.

// Define a RooFit random seed in order to produce reproducible results.

Changed:
<
<
RooRandom::randomGenerator()->SetSeed(271);
>
>
RooRandom::randomGenerator()->SetSeed(271);
  // Make a simple model using the Workspace Factory.

// Create a new workspace. RooWorkspace* myWorkspace = new RooWorkspace(); // Create the PDF G(x|mu,1) and the variables x, mu and sigma in one command using the factory syntax.

Changed:
<
<
myWorkspace->factory("Gaussian::normal(x[-10,10], mu[-1,1], sigma[1])");
>
>
myWorkspace->factory("Gaussian::normal(x[-10,10], mu[-1,1], sigma[1])");
  // Define parameter sets for observables and parameters of interest.
Changed:
<
<
myWorkspace->defineSet("poi","mu"); myWorkspace->defineSet("obs","x");
>
>
myWorkspace->defineSet("poi","mu"); myWorkspace->defineSet("obs","x");
  // Print the workspace contents.
Changed:
<
<
myWorkspace->Print() ;
>
>
myWorkspace->Print() ;
  // Specify for the statistical tools the components of the defined model. // Create a new ModelConfig. ModelConfig* myModelConfig = new ModelConfig("my G(x|mu,1)"); // Specify the workspace.
Changed:
<
<
myModelConfig->SetWorkspace(*myWorkspace);
>
>
myModelConfig->SetWorkspace(*myWorkspace);
  // Specify the PDF.
Changed:
<
<
myModelConfig->SetPdf(*myWorkspace->pdf("normal"));
>
>
myModelConfig->SetPdf(*myWorkspace->pdf("normal"));
  // Specify the parameters of interest.
Changed:
<
<
myModelConfig->SetParametersOfInterest(*myWorkspace->set("poi"));
>
>
myModelConfig->SetParametersOfInterest(*myWorkspace->set("poi"));
  // Specify the observables.
Changed:
<
<
myModelConfig->SetObservables(*myWorkspace->set("obs"));
>
>
myModelConfig->SetObservables(*myWorkspace->set("obs"));
  // Create a toy dataset.

// Create a toy dataset of 100 measurements of the observables (x).

Changed:
<
<
RooDataSet* myData = myWorkspace->pdf("normal")->generate(*myWorkspace->set("obs"), 100); //myData->print();
>
>
RooDataSet* myData = myWorkspace->pdf("normal")->generate(*myWorkspace->set("obs"), 100); //myData->print();
  // Use the ProfileLikelihoodCalculator to obtain a 95% confidence interval.
Line: 580 to 556
  LikelihoodInterval* myProfileLikelihoodInterval = myProfileLikelihoodCalculator.GetInterval(); // Use this interval result. In this case, it makes sense to say what the lower and upper limits are. // Define the object variables for the purposes of the confidence interval.
Changed:
<
<
RooRealVar* x = myWorkspace->var("x"); RooRealVar* mu = myWorkspace->var("mu"); cout << "The profile likelihood calculator interval is [ "<< myProfileLikelihoodInterval->LowerLimit(*mu) << ", " << myProfileLikelihoodInterval->UpperLimit(*mu) << "] " << endl;
>
>
RooRealVar* x = myWorkspace->var("x"); RooRealVar* mu = myWorkspace->var("mu"); cout << "The profile likelihood calculator interval is [ "<< myProfileLikelihoodInterval->LowerLimit(*mu) << ", " << myProfileLikelihoodInterval->UpperLimit(*mu) << "] " << endl;
  // Set mu equal to zero.
Changed:
<
<
mu->setVal(0);
>
>
mu->setVal(0);
  // Is mu in the interval?
Changed:
<
<
cout << "Is mu = 0 in the interval?" << endl; if (myProfileLikelihoodInterval->IsInInterval(*mu) == 1){ cout << "Yes" << endl;
>
>
cout << "Is mu = 0 in the interval?" << endl; if (myProfileLikelihoodInterval->IsInInterval(*mu) == 1){ cout << "Yes" << endl;
  } else{
Changed:
<
<
cout << "No" << endl;
>
>
cout << "No" << endl;
  } } %ENDCODE%
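
For the simple model above (a Gaussian with known sigma and a single parameter of interest mu), the interval can be cross-checked by hand. The following is a minimal standard-library C++ sketch, not RooStats code: the names `sampleMean`, `minusTwoLogLambda` and `likelihoodInterval95` are illustrative, and it assumes the asymptotic (Wilks) threshold of about 3.84 for a 95% interval with one parameter, and ignores the [-1, 1] range placed on mu in the workspace. In that case -2 ln(L(mu)/L(muHat)) = N (muHat - mu)^2 / sigma^2, so the interval is muHat +/- 1.96 sigma / sqrt(N).

```cpp
#include <cmath>
#include <numeric>
#include <vector>

struct Interval {
    double lower;
    double upper;
};

// Maximum-likelihood estimate of the Gaussian mean: the sample mean.
double sampleMean(const std::vector<double>& data) {
    return std::accumulate(data.begin(), data.end(), 0.0) / data.size();
}

// -2 ln[L(mu) / L(muHat)] for a Gaussian with known width sigma.
double minusTwoLogLambda(const std::vector<double>& data,
                         double mu, double sigma) {
    const double muHat = sampleMean(data);
    double sum = 0.0;
    for (double x : data) {
        sum += (x - mu) * (x - mu) - (x - muHat) * (x - muHat);
    }
    return sum / (sigma * sigma);
}

// Values of mu for which -2 ln(lambda) stays below the 95% threshold.
Interval likelihoodInterval95(const std::vector<double>& data, double sigma) {
    // 95% quantile of a chi-squared distribution with one degree of freedom.
    const double threshold = 3.841458820694124;
    const double muHat = sampleMean(data);
    const double halfWidth = std::sqrt(threshold) * sigma /
                             std::sqrt(static_cast<double>(data.size()));
    return {muHat - halfWidth, muHat + halfWidth};
}
```

With 100 measurements and sigma = 1, as in the toy dataset above, the half-width is about 0.196, which gives a rough expectation for the interval printed by the RooStats example.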
Line: 996 to 972
 
example (early in project development): creation of the Measurement and a Channel and, thence, creation of the channel samples, including signal and backgrounds

%CODE{"c++"}%

Changed:
<
<
// Create the Measurement and a channel
>
>
// Create the Measurement and a channel.
  std::string myInputFile = "./data/myData.root"; std::string myChannel1Path = "";
Line: 1127 to 1103
 
details on the object measurement
Changed:
<
<
A measurement has several methods to configure its options, each of which are equivalent to their XML equivalents.
>
>
A measurement has several methods to configure its options, each of which is equivalent to its XML counterpart.
 
objective code
set the prefix for output files void SetOutputFilePrefix(const std::string& prefix);
Line: 1192 to 1168
 void AddShapeSys(std::string Name, Constraint::Type ConstraintType, std::string HistoName, std::string HistoFile, std::string HistoPath=""); %ENDCODE%
Changed:
<
<
A sample can be included in a channel's bin-by-bin statistical uncertainty fluctuations by "activating" the sample. There are two ways to do this. The first way is to use the default errors that are stored in the histogram's uncertainty array. The second way is to supply the errors using an external histogram (in the case where the desired errors differ from those stored by the HT1 histogram). These can be achieved using thw following methods:
>
>
A sample can be included in a channel's bin-by-bin statistical uncertainty fluctuations by "activating" the sample. There are two ways to do this. The first way is to use the default errors that are stored in the histogram's uncertainty array. The second way is to supply the errors using an external histogram (in the case where the desired errors differ from those stored by the TH1 histogram). These can be achieved using the following methods:
  %CODE{"c++"}% void ActivateStatError();
Line: 1228 to 1204
 

links for HistFactory

Changed:
<
<
>
>
 

Revision 50 2016-01-25 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- Will Breaden Madden - 2015-07-15
>
>
-- Will Breaden Madden - 2016-01-25
 
Line: 440 to 440
 
Model Inspector
Changed:
<
<
The Model Inspector is a GUI for examining the model contained in the RooFit workspace. The function and it's parameters are as follows:
>
>
The Model Inspector is a GUI for examining the model contained in the RooFit workspace. The function and its parameters are as follows:
  %CODE{"c++"}% void ModelInspector(const char* infile = "", const char* workspaceName = "combined", const char* modelConfigName = "ModelConfig", const char* dataName = "obsData")

Revision 49 2015-07-15 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- Will Breaden Madden - 2015-06-18
>
>
-- Will Breaden Madden - 2015-07-15
 
Line: 1281 to 1281
  Run some code such as the following. For fun, we will create a simulated data signal at about 126 GeV at about three times the number of events of that which was expected.
Changed:
<
<
example file: make_test_histograms.c
>
>
example file: make_test_histograms.cxx
  %CODE{"c++"}%
Changed:
<
<
{
>
>
#include #include #include #include <stdio.h> #include <string.h> #include #include "TStyle.h" #include "TROOT.h" #include "TPluginManager.h" #include "TSystem.h" #include "TFile.h" #include "TGaxis.h" #include "TCanvas.h" #include "TH1.h" #include "TF1.h" #include "TLine.h" #include "TSpline.h" #include "RooAbsData.h" #include "RooDataHist.h" #include "RooCategory.h" #include "RooDataSet.h" #include "RooRealVar.h" #include "RooAbsPdf.h" #include "RooSimultaneous.h" #include "RooProdPdf.h" #include "RooNLLVar.h" #include "RooProfileLL.h" #include "RooFitResult.h" #include "RooPlot.h" #include "RooRandom.h" #include "RooMinuit.h" #include "TRandom3.h" #include "RooWorkspace.h" #include "RooStats/RooStatsUtils.h" #include "RooStats/ModelConfig.h" #include "RooStats/ProfileLikelihoodCalculator.h" #include "RooStats/LikelihoodInterval.h" #include "RooStats/LikelihoodIntervalPlot.h" #include "TStopwatch.h" using namespace std; using namespace RooFit; using namespace RooStats;

int main() {

  // Create the expected signal histogram.
Added:
>
>
  // Create the function used to describe the signal shape (a simple Gaussian shape). TF1 mySignalFunction("mySignalFunction", "(1/sqrt(2*pi*0.5^2))*2.718^(-(x-126)^2/(2*0.5^2))", 120, 130); // Create the histogram with 100 bins between 120 and 130 GeV.
Line: 1299 to 1346
  TH1F myBackground("myBackground", "myBackground", 100, 120, 130); // Fill the histogram using the background function. myBackground.FillRandom("myBackgroundFunction", 100000000000);
Added:
>
>
  // Create the (simulated) data histogram. This histogram represents what one might have as real data.
Added:
>
>
  // Create the histogram by combining the signal histogram multiplied by 3 with the background histogram. TH1F myData=3*mySignal+myBackground; // Set the name of the histogram.
Changed:
<
<
myData->SetName("myData");
>
>
myData.SetName("myData");
  // Save the histograms created to a ROOT file. TFile myFile("test_histograms.root", "RECREATE");
Changed:
<
<
mySignal->Write(); myBackground->Write(); myData->Write();
>
>
mySignal.Write(); myBackground.Write(); myData.Write();
  myFile.Close();
Added:
>
>
return 0;
 } %ENDCODE%
Line: 1339 to 1390
 

<!-- workspace output file prefix -->
Changed:
<
<
OutputFilePrefix="./workspaces/test_workspace" Mode="comb" >
>
>
OutputFilePrefix="./workspaces/test_workspace">
 
<!-- channel XML file(s) -->
./config/test_channel.xml
Line: 1400 to 1451
  Create a C++ program such as the following:
Changed:
<
<
example file: ProfileLikeliHoodCalculator_confidence_level.cpp
>
>
example file: ProfileLikeliHoodCalculator_confidence_level.cxx
  %CODE{"c++"}% #include
Line: 1447 to 1498
 using namespace RooStats;

int main(){

Changed:
<
<
// Access the inputs.
>
>
  // Open the ROOT workspace file. TString myInputFileName = "workspaces/test_workspace_combined_datastat_model.root";
Changed:
<
<
cout << "Opening file " << myInputFileName << "..." << endl;
>
>
cout << "open file " << myInputFileName << endl;
  TFile *_file0 = TFile::Open(myInputFileName); // Access the workspace.
Changed:
<
<
cout << "Accessing workspace..." << endl;
>
>
cout << "access workspace" << endl;
  RooWorkspace* myWorkspace = (RooWorkspace*) _file0->Get("combined");
Changed:
<
<
// Access the ModelConfig cout << "Accessing ModelConfig..." << endl;
>
>
// Access the ModelConfig. cout << "access ModelConfig..." << endl;
  ModelConfig* myModelConfig = (ModelConfig*) myWorkspace->obj("ModelConfig"); // Access the data.
Changed:
<
<
cout << "Accessing data..." << endl;
>
>
cout << "accessing data" << endl;
  RooAbsData* myData = myWorkspace->data("obsData");
Changed:
<
<
// Use the ProfileLikelihoodCalculator to calculate the 95% confidence interval on the parameter of interest as specified in the ModelConfig. cout << "Calculating profile likelihood...\n" << endl;
>
>
// Use the ProfileLikelihoodCalculator to calculate the 95% confidence // interval on the parameter of interest as specified in the ModelConfig. cout << "calculate profile likelihood" << endl;
  ProfileLikelihoodCalculator myProfileLikelihood(*myData, *myModelConfig); myProfileLikelihood.SetConfidenceLevel(0.95); LikelihoodInterval* myConfidenceInterval = myProfileLikelihood.GetInterval(); // Access the confidence interval on the parameter of interest (POI). RooRealVar* myPOI = (RooRealVar*) myModelConfig->GetParametersOfInterest()->first();
Changed:
<
<
// Print the results. cout << "Printing results..." << endl;
>
>
// Print results. cout << "print results" << endl;
  // Print the confidence interval on the POI.
Changed:
<
<
cout << "\n95% confidence interval on the point of interest " << myPOI->GetName()<<": ["<<
>
>
cout << "\n95% confidence interval on the parameter of interest " << myPOI->GetName()<<": ["<<
  myConfidenceInterval->LowerLimit(*myPOI) << ", "<< myConfidenceInterval->UpperLimit(*myPOI) <<"]\n"<<endl; return 0;
Line: 1485 to 1539
 
example file: Makefile

%CODE{"bash"}%

Changed:
<
<
ProfileLikeliHoodCalculator_confidence_level.cpp : ProfileLikeliHoodCalculator_confidence_level.cpp g++ -g -O2 -fPIC -Wno-deprecated -o ProfileLikeliHoodCalculator_confidence_level.cpp ProfileLikeliHoodCalculator_confidence_level.cpp `root-config --cflags --libs --ldflags` -lHistFactory -lXMLParser -lRooStats -lRooFit -lRooFitCore -lThread -lMinuit -lFoam -lHtml -lMathMore -I$ROOTSYS/include -L$ROOTSYS/lib
>
>
ProfileLikeliHoodCalculator_confidence_level : ProfileLikeliHoodCalculator_confidence_level.cxx g++ -g -O2 -fPIC -Wno-deprecated -o ProfileLikeliHoodCalculator_confidence_level ProfileLikeliHoodCalculator_confidence_level.cxx `root-config --cflags --libs --ldflags` -lHistFactory -lXMLParser -lRooStats -lRooFit -lRooFitCore -lThread -lMinuit -lFoam -lHtml -lMathMore
 %ENDCODE%

Compile the code.

Revision 48 2015-06-18 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- Will Breaden Madden - 2015-06-16
>
>
-- Will Breaden Madden - 2015-06-18
 
Line: 769 to 769
 HistoNameHigh="myShapeSystematic_1_high" HistoNameLow="myShapeSystematic_1_low" />
Added:
>
>
Note that the order in which the various classes of systematics should be specified is overall systematics followed by shape systematics.
 
example file: $ROOTSYS/tutorials/histfactory/example_channel.xml

%CODE{"html"}%

Revision 47 2015-06-16 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
-- Will Breaden Madden - 2015-06-16
Line: 857 to 857
  HistoName="data" />

<!-- signal -->
Changed:
<
<
<Sample Name="signal" HistoName="ttH_m110" NormalizeByTheory="False" >
>
>
<Sample Name="signal" HistoName="ttH_m110" NormalizeByTheory="False" >
 
<!-- systematics: -->
<HistoSys Name="Lumi" HistoNameHigh="ttH_m110_sys_Lumi_up"
Line: 929 to 928
 # The #userSet hashtag is used to indicate places in the script where the user might modify # things.
Changed:
<
<
createXMLFiles() # Arguments: originalMass { originalMass=$1 newMass=$2
>
>
createXMLFiles(){

# arguments: originalMass newMass
originalMass=${1}
newMass=${2}

  # Set/specify the original XML configuration file names. originalChannelXMLFileName=ttH_m${originalMass}_channel.xml originalTopLevelXMLFileName=ttH_m${originalMass}_top-level.xml
Line: 940 to 941
  # Set/specify the new XML configuration file names. newChannelXMLFileName=ttH_m${newMass}_channel.xml newTopLevelXMLFileName=ttH_m${newMass}_top-level.xml
Added:
>
>
  # Duplicate the original XML configuration files while substituting the original mass # points with the new mass points in the new resulting file. # Here, the program sed is used to replace all occurrences of a specified pattern in # a specified file with another specified pattern. # The "s" corresponds to "substitute". # The "g" corresponds to "globally" (all instances in a line).
Changed:
<
<
sed "s/m$originalMass/m$newMass/g" $originalChannelXMLFileName > $newChannelXMLFileName sed "s/m$originalMass/m$newMass/g" $originalTopLevelXMLFileName > $newTopLevelXMLFileName
>
>
sed "s/m$originalMass/m$newMass/g" ${originalChannelXMLFileName} > ${newChannelXMLFileName} sed "s/m$originalMass/m$newMass/g" ${originalTopLevelXMLFileName} > ${newTopLevelXMLFileName}
 }

# Specify the original mass point. originalMass=110 #userSet

# Execute the function for creation of new XML configuration files for all required mass points.

Changed:
<
<
createXMLFiles $originalMass 115 #userSet createXMLFiles $originalMass 120 #userSet createXMLFiles $originalMass 125 #userSet createXMLFiles $originalMass 130 #userSet createXMLFiles $originalMass 140 #userSet
>
>
createXMLFiles ${originalMass} 115 #userSet createXMLFiles ${originalMass} 120 #userSet createXMLFiles ${originalMass} 125 #userSet createXMLFiles ${originalMass} 130 #userSet createXMLFiles ${originalMass} 140 #userSet
 %ENDCODE%

C++ approach to HistFactory

Line: 1219 to 1226
 

links for HistFactory

Changed:
<
<
HistFactory user guide, June 2012 (draft, under development)

HistFactory user guide, March 2012

HistFactory XML reference

XML example

Exotics Working Group statistics tutorial XML reference

Exotics Working Group statistics tutorial workspace examples

early description of C++ approach to HistFactory

description of building HistFactory models using C++ and Python

>
>
 

analysis!

Line: 1362 to 1362
  HistoName="myData" />

<!-- signal -->
Changed:
<
<
<Sample Name="signal" HistoName="mySignal" NormalizeByTheory="True" >
>
>
<Sample Name="signal" HistoName="mySignal" NormalizeByTheory="True" >
 
Line: 1506 to 1505
 

links for ROOT

Changed:
<
<
ROOT User's Guide
>
>
 

Revision 46 2015-06-16 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- WilliamBreadenMadden - 2013-08-27
>
>
-- Will Breaden Madden - 2015-06-16
 
Line: 12 to 12
 

What is RooStats?

Changed:
<
<
RooStats is a project to create statistical tools built on top of the RooFit library, which is a data modelling toolkit. It is distributed in ROOT. Specifically, it has been distributed in the ROOT release since version 5.22 (December 2008). The latest version of ROOT is recommended as the RooStats project developes quickly.
>
>
RooStats is a project to create statistical tools built on top of the RooFit library, which is a data-modelling toolkit. It is distributed in ROOT. Specifically, it has been distributed in the ROOT release since version 5.22 (December 2008). The latest version of ROOT is recommended as the RooStats project develops quickly.
 

using ROOT on the Glasgow PPELX network

There are instructions on how to use the different versions of ROOT at Glasgow here.

Changed:
<
<
Execute the following commands in order to set up ROOT version 5.32.00 on PPELX:
>
>
Execute commands something like the following in order to set up ROOT version 5.32.00 on PPELX:
  %CODE{"bash"}% export ROOTSYS=/data/ppe01/sl5x/x86_64/root/5.32.00
Line: 28 to 28
 

using ROOT on the CERN LXPLUS network

Changed:
<
<
Execute the following commands in order to set up ROOT version 5.32.00 on LXPLUS:
>
>
Execute commands something like the following in order to set up ROOT version 5.32.00 on LXPLUS:
  %CODE{"bash"}% . /afs/cern.ch/sw/lcg/external/gcc/4.3.2/x86_64-slc5/setup.sh
Line: 53 to 53
 %CODE{"bash"}% #!/bin/bash
Changed:
<
<
# This script builds the latest version of ROOT with RooFit and RooStats in Ubuntu.
>
>
################################################################################ # This script builds the latest version of ROOT in Ubuntu. Specifically, first # the ROOT prerequisites are installed, then the most common ROOT optional # packages are installed. Next, the latest version of ROOT in the CERN Git # repository is checked out. Finally, ROOT is compiled. After compiling is # complete, ROOT environment variables should be set up as necessary. ################################################################################

echo -e "\nstart ROOT installation\n" read -s -n 1 -p "Press any key to continue." echo

 
Changed:
<
<
# First, the ROOT prerequisites are installed, then, the most common ROOT optional packages are # installed. Next, the latest version of ROOT in the CERN Subversion repository is checked out. # Finally, ROOT is compiled.
>
>
# Specify the time. date
  # Install ROOT prerequisites.
Changed:
<
<
sudo apt-get install subversion sudo apt-get install make sudo apt-get install g++ sudo apt-get install gcc sudo apt-get install binutils sudo apt-get install libx11-dev sudo apt-get install libxpm-dev sudo apt-get install libxft-dev sudo apt-get install libxext-dev
>
>
echo "install ROOT prerequisites..." sudo apt-get -y install subversion sudo apt-get -y install dpkg-dev sudo apt-get -y install make sudo apt-get -y install g++ sudo apt-get -y install gcc sudo apt-get -y install binutils sudo apt-get -y install libx11-dev #sudo apt-get -y install libxpm-dev sudo apt-get -y install libgd2-xpm-dev sudo apt-get -y install libxft-dev sudo apt-get -y install libxext-dev
  # Install optional ROOT packages.
Changed:
<
<
sudo apt-get install gfortran sudo apt-get install ncurses-dev sudo apt-get install libpcre3-dev sudo apt-get install xlibmesa-glu-dev sudo apt-get install libglew1.5-dev sudo apt-get install libftgl-dev sudo apt-get install libmysqlclient-dev sudo apt-get install libfftw3-dev sudo apt-get install cfitsio-dev sudo apt-get install graphviz-dev sudo apt-get install libavahi-compat-libdnssd-dev sudo apt-get install libldap-dev sudo apt-get install python-dev sudo apt-get install libxml2-dev sudo apt-get install libssl-dev sudo apt-get install libgsl0-dev

# Check out the latest ROOT trunk. svn co http://root.cern.ch/svn/root/trunk /usr/local/root

# Configure the build. cd /usr/local/root # Configure for the system architecture and configure to build the libRooFit advanced fitting package as part of the compilation. ./configure linuxx8664gcc --enable-roofit

>
>
echo "install optional ROOT packages..."
sudo apt-get -y install gfortran
sudo apt-get -y install openssl-dev
sudo apt-get -y install libssl-dev
#sudo apt-get -y install ncurses-dev
sudo apt-get -y install libpcre3-dev
sudo apt-get -y install xlibmesa-glu-dev
sudo apt-get -y install libglew1.5-dev
sudo apt-get -y install libftgl-dev
sudo apt-get -y install libmysqlclient-dev
sudo apt-get -y install libfftw3-dev
sudo apt-get -y install cfitsio-dev
sudo apt-get -y install graphviz-dev
sudo apt-get -y install libavahi-compat-libdnssd-dev
#sudo apt-get -y install libldap-dev
sudo apt-get -y install libldap2-dev
sudo apt-get -y install python-dev
sudo apt-get -y install libxml2-dev
sudo apt-get -y install libkrb5-dev
sudo apt-get -y install libgsl0-dev
sudo apt-get -y install libqt4-dev

# Check out the latest ROOT trunk. Save the download in the ~/root directory. # This should take only a moment. echo "check out the latest ROOT trunk..." cd ~/ git clone http://root.cern.ch/git/root.git

# Configure for the compilation. Specifically, the system architecture is # defined and building of the libRooFit advanced fitting package is enabled. cd ~/root while true; do read -p "Specify the computer bit architecture you want to compile ROOT for (64/32): " computerArchitecture if [ "${computerArchitecture}" == "32" ]; then echo "configure ROOT compile for 32 bit computer architecture..." #./configure linux --enable-roofit --enable-minuit2 ./configure linux --enable-roofit --enable-minuit2 --enable-python --with-python-incdir=/usr/include/python2.6 --with-python-libdir=/usr/lib/i386-linux-gnu break elif [ "${computerArchitecture}" == "64" ]; then echo "configure ROOT compile for 64 bit computer architecture..." #./configure linuxx8664gcc --enable-roofit --enable-minuit2 ./configure linuxx8664gcc --enable-roofit --enable-minuit2 --enable-python --with-python-incdir=/usr/include/python2.6 --with-python-libdir=/usr/lib/x86_64-linux-gnu break fi echo "invalid input" done

  # See other possible configurations using the following command: ./configure --help
Added:
>
>
# Specify the time. date

while true; do read -p "Do you want to continue to compile ROOT now? (y/n): " yOrn if [ "$(echo "${yOrn}" | sed 's/\(.*\)/\L\1/')" == "y" ]; then break elif [ "$(echo "${yOrn}" | sed 's/\(.*\)/\L\1/')" == "n" ]; then echo "exit installation script..." exit 0 fi echo "invalid input" done

 # Compile.
Changed:
<
<
make

>
>
echo "compile ROOT..." time make
 
Changed:
<
<
On a MacBook Pro 7, 1 running Ubuntu 11.04, the compilation takes ~ 1 hour. Following the compilation, the ROOT environment variables can be set up. In Ubuntu or Scientific Linux, the following lines could be added to the ~/.bashrc file:

   export ROOTSYS=/usr/local/root
   export PATH=$ROOTSYS/bin:$PATH
   export LD_LIBRARY_PATH=$ROOTSYS/lib/root:$LD_LIBRARY_PATH

>
>
# Move ROOT to the install directory (e.g., /usr/local/) and set up ROOT environment variables in the specified configuration file (e.g., /etc/bash.bashrc, ~/.bashrc). installationDirectory="/usr/local" configurationFile="/etc/bash.bashrc" while true; do read -p "Do you want to continue to move ROOT to the directory "${installationDirectory}" and set up the ROOT environment variables in the file "${configurationFile}"? (y/n): " yOrn yOrnLowercase="$(echo "${yOrn}" | sed 's/\(.*\)/\L\1/')" if [ "${yOrnLowercase}" == "y" ]; then break elif [ "${yOrnLowercase}" == "n" ]; then echo "exit installation script..." exit 0 fi echo "invalid input" done echo "move ROOT to the directory "${installationDirectory}"..." sudo mv ~/root "${installationDirectory}"

echo "Set up ROOT environment variables in the file "${configurationFile}"..." echo -e "\n# ROOT environment variables" >> "${configurationFile}" echo "export ROOTSYS="${installationDirectory}"/root" >> "${configurationFile}" echo "export PATH=\$PATH:\$ROOTSYS/bin" >> "${configurationFile}" echo "export LD_LIBRARY_PATH=\$LD_LIBRARY_PATH:\$ROOTSYS/lib" >> "${configurationFile}"

# Specify the time. date

echo -e "\nROOT install complete\n"

 %ENDCODE%

option 3: Build the RooStats branch.

Changed:
<
<
Do this if you want the latest development of RooStats (that has not yet been incorporated into a ROOT version).

The necessary instructions can be found here.

>
>
The RooStats branch can be built in order to have the latest development of RooStats (that has not yet been incorporated into a ROOT version). Instructions can be found here.
 

RooFit

Line: 120 to 188
  The RooFit library provides a toolkit for modelling the expected distribution of events in a physics analysis. Models can be used to perform unbinned maximum likelihood fits, produce plots and generate "toy Monte Carlo" samples for various studies.
Changed:
<
<
The core functionality of RooFit is to enable the modelling of 'event data' distributions, in which each event is a discrete occurrence in time and has one or more measured observables associated with it. Experiments of this nature result in datasets obeying Poisson (or binomial) statistics. The natural modeling language for such distributions is probability density functions (PDFs), F(x;p), that describe the probability density of the distribution of observables x in terms of the function parameter p.
>
>
The core functionality of RooFit is to enable the modelling of 'event data' distributions, in which each event is a discrete occurrence in time and has one or more measured observables associated with it. Experiments of this nature result in datasets of Poisson (or binomial) statistics. The natural modeling language for such distributions is probability density functions (PDFs), F(x;p), that describe the probability density of the distribution of observables x in terms of the function parameter p.
 
Changed:
<
<
In RooFit, every variable, data point, function and PDF is represented in a C++ object. So, for example, in constructing a RooFit model, the mathematical components of the model map to separate C++ objects. Objects are classified by the data or function type that they represent, not by their respective role in a particular setup. All objects are self-documenting. The name of an object is a unique identifier for the object while the title of an object is a more elaborate description of the object.
>
>
In RooFit, every variable, data point, function and PDF is represented by a C++ object. So, for example, in constructing a RooFit model, the mathematical components of the model map to separate C++ objects. Objects are classified by the data or function type that they represent, not by their respective role in a particular setup. All objects are self-documenting. The name of an object is a unique identifier for the object while the title of an object is a more elaborate description of the object.
  Here are a few examples of mathematical concepts that correspond to various RooFit classes:
Line: 147 to 215
 

RooPlot

Changed:
<
<
A RooPlot is essentially an empty frame that is capable of holding anything plotted verses its variable.
>
>
A RooPlot is essentially an empty frame that is capable of holding anything plotted against its variable.
 

PDFs

Changed:
<
<
One of the things that makes the RooFit PDFs nice and flexible (but perhaps counterintuitive) is that they have no idea what is considered the observable. For example, a Gaussian has x, the mean and sigma while the PDF is always normalised to unity. What is one integrating in order to get 1? For the Gaussian, it is x, by convention; the mean and sigma are parameters of the model. RooFit doesn't know that x is special; x, the mean and sigma are all on equal footing. You can tell RooFit that x is the variable to normalise over.
>
>
One of the things that makes the RooFit PDFs nice and flexible (but perhaps counterintuitive) is that they have no idea what is considered the observable. For example, a Gaussian has x, the mean and sigma while the PDF is always normalised to unity. What is one integrating in order to get 1? For the Gaussian, it is x, by convention; the mean and sigma are parameters of the model. RooFit doesn't know that x is special; x, the mean and sigma are all on equal footing. You can tell RooFit that x is the variable to normalise over.
  So, RooGaussian has no intrinsic notion of distinction between observables and parameters. The choice of observables (for unit normalisation) is always passed to gauss.getVal().
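The point that the normalisation variable is a choice can be illustrated without RooFit. The following is a minimal sketch in plain C++ (standard library only); the function names `gaussShape`, `integrate`, `valueNormalisedOverX` and `valueNormalisedOverSigma` are hypothetical and are not part of RooFit. The same unnormalised Gaussian shape is divided by its integral over x in one case and by its integral over sigma in the other, which mimics the difference between treating x or sigma as the observable.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Unnormalised Gaussian shape: x, mean and sigma are on an equal footing.
double gaussShape(double x, double mean, double sigma) {
    assert(sigma > 0.0);
    const double z = (x - mean) / sigma;
    return std::exp(-0.5 * z * z);
}

// Midpoint-rule integral of f over [lo, hi] using n steps.
double integrate(const std::function<double(double)>& f, double lo, double hi, int n = 10000) {
    assert(n > 0 && hi > lo);
    const double h = (hi - lo) / n;
    double sum = 0.0;
    for (int i = 0; i < n; ++i) {
        sum += f(lo + (i + 0.5) * h);
    }
    return sum * h;
}

// Value normalised over x in [xMin, xMax]: x plays the role of the observable.
double valueNormalisedOverX(double x, double mean, double sigma, double xMin, double xMax) {
    const double norm = integrate([&](double t) { return gaussShape(t, mean, sigma); }, xMin, xMax);
    return gaussShape(x, mean, sigma) / norm;
}

// Value normalised over sigma in [sMin, sMax]: sigma plays the role of the observable.
double valueNormalisedOverSigma(double x, double mean, double sigma, double sMin, double sMax) {
    const double norm = integrate([&](double s) { return gaussShape(x, mean, s); }, sMin, sMax);
    return gaussShape(x, mean, sigma) / norm;
}
```

At the same point (x, mean, sigma) the two functions generally return different numbers, because the denominators are integrals over different variables. Where no analytic integral is available, RooFit falls back to numerical integration in a similar spirit.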

What is the value of this? In a nutshell, it allows one to do Bayesian stuff very easily.

Changed:
<
<
Bayes' theorem holds that the probability of b given a is related to the probability of a given b. Normally, one might say that the mean of a Gaussian is 1 and that then gives a distribution for x. However, if one had a dataset for x, one might want to know the posterior for the mean. The RooFit ability to switch what the probability density function is of is very useful for Bayesian stuff. Because RooFit allows this composition, the thing that acts as x in the Gaussian could actually be a function of, say, x^2. A Jacobian factor is picked up in going from x to x^2, so the normalisation no longer makes sense. Sometimes RooFit can figure out what the Jacobian factor should be and sometimes it resorts to numerical integration.
>
>
Bayes' theorem holds that the probability of b given a is related to the probability of a given b. Normally, one might say that the mean of a Gaussian is 1 and that then gives a distribution for x. However, if one had a dataset for x, one might want to know the posterior for the mean. The RooFit ability to switch what the probability density function is of is very useful for Bayesian stuff. Because RooFit allows this composition, the thing that acts as x in the Gaussian could actually be a function of, say, x^{2}. A Jacobian factor is picked up in going from x to x^{2}, so the normalisation no longer makes sense. Sometimes RooFit can figure out what the Jacobian factor should be and sometimes it resorts to numerical integration.
 

example code: create a Gaussian PDF using RooStats and plot it using the RooPlot class

Line: 181 to 249
 %CODE{"c++"}% // not normalised (i.e., this is not a PDF): gauss.getVal();
Changed:
<
<
// Hey, RooFit! This is the thing to normalise over (i.e., guarantees Int[xmin, xmax] Gauss(x, m, s)dx == 1):
>
>
// Hey, RooFit! This is the thing over which you should normalise (i.e., guarantees Int[xmin, xmax] Gauss(x, m, s)dx == 1):
  gauss.getVal(x); // What is the value if sigma is considered the observable? (i.e., guarantees Int[smin, smax] Gauss(x, m, s)ds == 1): %ENDCODE%
Line: 190 to 258
 

general description

Changed:
<
<
A dataset is a collection of points in N-dimensional space. In RooFit, data can be stored in an unbinned or binned manner. Unbinned data is stored using the RooDataSet class while binned data is stored using the RooDataHist class. The user can define how many bins there are in a variable. For the purposes of plotting using the RooPlot class, a RooDataSet is binned into a histogram.
>
>
A dataset is a collection of points in N-dimensional space. In RooFit, data can be stored in an unbinned or binned manner. Unbinned data is stored using the RooDataSet class while binned data is stored using the RooDataHist class. The user can define how many bins there are in a variable. For the purposes of plotting using the RooPlot class, a RooDataSet is binned into a histogram.
  In general, working in RooFit with binned and unbinned data is very similar, as both the RooDataSet (for unbinned data) and RooDataHist (for binned data) classes inherit from a common base class, RooAbsData, which defines the interface for a generic abstract data sample. With few exceptions, all RooFit methods take abstract datasets as input arguments, allowing for the interchangeable use of binned and unbinned data.
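The relationship between unbinned and binned data can be sketched in plain C++ without ROOT. The helper `binData` below is hypothetical (it is not a RooFit function); it assumes equal-width bins and ignores values outside the range, which is roughly what happens when an unbinned dataset is histogrammed for plotting.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Bin unbinned values into nBins equal-width bins on [lo, hi).
// Values outside the range are ignored.
std::vector<int> binData(const std::vector<double>& values, double lo, double hi, std::size_t nBins) {
    assert(nBins > 0 && hi > lo);
    std::vector<int> counts(nBins, 0);
    const double width = (hi - lo) / static_cast<double>(nBins);
    for (double v : values) {
        if (v < lo || v >= hi) {
            continue;
        }
        std::size_t bin = static_cast<std::size_t>((v - lo) / width);
        if (bin >= nBins) {
            bin = nBins - 1;  // guard against floating-point rounding at the upper edge
        }
        ++counts[bin];
    }
    return counts;
}
```

The unbinned container keeps every value, whereas the binned container keeps only the counts per bin, so information about where each value sits inside its bin is lost.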
Line: 225 to 293
 

RooDataHist (binned data)

Changed:
<
<
Importing data from ROOT TH histogram objects (take a histogram and map it to a binned data set) (how to populate RooDataHists from histograms)
>
>
importing data from ROOT TH histogram objects (take a histogram and map it to a binned data set) (how to populate RooDataHists from histograms)
 
Changed:
<
<
In RooFit, binned data is represented by the RooDataHist class. The contents of a ROOT histogram can be imported into a RooDataHist object. In importing a ROOT histogram, the binning of the original histogram is imported as well. A RooDataHist associates the histogram with a RooFit variable object of type RooRealVar. In this way it is always known what kind of data is stored in the histogram.
>
>
In RooFit, binned data is represented by the RooDataHist class. The contents of a ROOT histogram can be imported to a RooDataHist object. In importing a ROOT histogram, the binning of the original histogram is imported as well. A RooDataHist associates the histogram with a RooFit variable object of type RooRealVar. In this way it is always known what kind of data is stored in the histogram.
  In displaying the data, RooFit, by default, shows the 68% confidence interval for Poisson statistics.
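As an illustration of what a 68% Poisson interval on a bin count means, the following plain C++ sketch computes a central interval by bisection. It assumes the common convention of placing a tail probability of about 0.1587 on each side; this is an assumption made for illustration and is not necessarily the exact procedure used inside RooFit. The names `poissonCdf`, `poissonLower` and `poissonUpper` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Cumulative Poisson probability P(X <= n | mu).
double poissonCdf(int n, double mu) {
    assert(n >= 0 && mu >= 0.0);
    double term = std::exp(-mu);  // P(X = 0)
    double sum = term;
    for (int k = 1; k <= n; ++k) {
        term *= mu / k;
        sum += term;
    }
    return sum;
}

// Upper edge: the mu for which P(X <= n | mu) equals the tail probability.
double poissonUpper(int n, double tail = 0.158655) {
    double lo = 0.0;
    double hi = n + 10.0 * std::sqrt(n + 1.0) + 10.0;
    for (int i = 0; i < 200; ++i) {
        const double mid = 0.5 * (lo + hi);
        if (poissonCdf(n, mid) > tail) {
            lo = mid;  // the CDF decreases with mu, so move up
        } else {
            hi = mid;
        }
    }
    return 0.5 * (lo + hi);
}

// Lower edge: the mu for which P(X >= n | mu) equals the tail probability.
// For n == 0 the lower edge is zero.
double poissonLower(int n, double tail = 0.158655) {
    if (n == 0) {
        return 0.0;
    }
    double lo = 0.0;
    double hi = static_cast<double>(n);
    for (int i = 0; i < 200; ++i) {
        const double mid = 0.5 * (lo + hi);
        // P(X >= n | mu) = 1 - P(X <= n - 1 | mu), which increases with mu.
        if (1.0 - poissonCdf(n - 1, mid) < tail) {
            lo = mid;
        } else {
            hi = mid;
        }
    }
    return 0.5 * (lo + hi);
}
```

The resulting error bars are asymmetric at low counts, which is why plotted data points with few entries have a longer upper bar than lower bar.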
Line: 259 to 327
 

fitting a model to data

Changed:
<
<
Fitting a model to data can be done in many ways. The most common methods are the χ2 fit and the -log(L) fit. The default fitting method in ROOT is the χ2 method, while the default method in RooFit is the -log(L) method. The -log(L) method is often preferred because it is more robust for low statistics fits and because it can also be performed on unbinned data.
>
>
Fitting a model to data can be done in many ways. The most common methods are the \chi^{2} fit and the -log(L) fit. The default fitting method in ROOT is the \chi^{2} method, while the default method in RooFit is the -log(L) method. The -log(L) method is often preferred because it is more robust for low statistics fits and because it can also be performed on unbinned data.
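To make the -log(L) method concrete, here is a minimal plain C++ sketch of an unbinned negative log-likelihood for a Gaussian model, with a simple grid scan over the mean. It does not use RooFit or Minuit; the names `gaussianNll` and `scanBestMean` are hypothetical, and a real fit would use a proper minimiser rather than a grid scan.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Unbinned negative log-likelihood of a Gaussian model for the given events.
double gaussianNll(const std::vector<double>& events, double mean, double sigma) {
    assert(sigma > 0.0);
    const double kPi = 3.14159265358979323846;
    double nll = 0.0;
    for (double x : events) {
        const double z = (x - mean) / sigma;
        nll += 0.5 * z * z + std::log(sigma) + 0.5 * std::log(2.0 * kPi);
    }
    return nll;
}

// Scan the mean on a regular grid and return the value with the smallest -log(L).
double scanBestMean(const std::vector<double>& events, double sigma, double lo, double hi, int steps) {
    assert(steps > 0 && hi > lo);
    double bestMean = lo;
    double bestNll = gaussianNll(events, lo, sigma);
    for (int i = 1; i <= steps; ++i) {
        const double m = lo + (hi - lo) * i / steps;
        const double nll = gaussianNll(events, m, sigma);
        if (nll < bestNll) {
            bestNll = nll;
            bestMean = m;
        }
    }
    return bestMean;
}
```

Because every event enters the sum individually, no binning is needed, which is the property that makes this method usable on unbinned data and well behaved when there are few events.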
 

fitting a PDF to unbinned data

Line: 334 to 402
  RooRealVar mu("mu", "mu", 150); RooRealVar sigma("sigma", "sigma", 5, 0, 20); RooGaussian myGaussianPDF("myGaussianPDF", "Gaussian PDF", x, mu, sigma);
Added:
>
>
  // Create a Graphviz DOT file with a representation of the object tree. myGaussianPDF.graphVizTree("myGaussianPDFTree.dot"); // This produced DOT file can be converted to some graphical representation:
Changed:
<
<
// Convert the DOT file to a 'top-to-bottom graph' using UNIX commands:
>
>
// - Convert the DOT file to a 'top-to-bottom graph' using UNIX commands:
  // dot -Tgif -o myGaussianPDF_top-to-bottom_graph.gif myGaussianPDFTree.dot
Changed:
<
<
// Convert the DOT file to a 'spring-model graph' using UNIX commands:
>
>
// - Convert the DOT file to a 'spring-model graph' using UNIX commands:
  // fdp -Tgif -o myGaussianPDF_spring-model_graph.gif myGaussianPDFTree.dot
Added:
>
>
  // Print the PDF contents. myGaussianPDF.Print();
Changed:
<
<
// Example output:
>
>
// example output:
  // RooGaussian::G[ x=x mean=mu sigma=sigma ] = 1 // sigma.Print() // RooRealVar::sigma = 5 L(0 - 20)
Line: 347 to 417
  // RooGaussian::G[ x=x mean=mu sigma=sigma ] = 1 // sigma.Print() // RooRealVar::sigma = 5 L(0 - 20)
Added:
>
>
  // Print the PDF contents in a detailed manner. myGaussianPDF.Print("verbose");
Added:
>
>
  // Print the PDF contents to stdout. myGaussianPDF.Print("t");
Changed:
<
<
// Example output:
>
>
// example output:
  // 0x166eab0 RooGaussian::G = 1 [Auto] // 0x15f7fe0/V- RooRealVar::x = 150 // 0x1487090/V- RooRealVar::mu = 150
Line: 356 to 428
  // 0x15f7fe0/V- RooRealVar::x = 150 // 0x1487090/V- RooRealVar::mu = 150 // 0x1487bc0/V- RooRealVar::sigma = 5
Added:
>
>
  // Print the PDF contents to a file. myGaussianPDF.printCompactTree("", "myGaussianPDFTree.txt");
Changed:
<
<
// Example output file contents:
>
>
// example output file contents:
  // 0x166eab0 RooGaussian::G = 1 [Auto] // 0x15f7fe0/V- RooRealVar::x = 150 // 0x1487090/V- RooRealVar::mu = 150
Line: 392 to 465
 
links
Changed:
<
<
Video illustrating the usage of the Model Inspector
>
>
 

accessing the RooFit workspace

Line: 402 to 475
  // Open the appropriate ROOT file. root -l BR5_MSSM_signal90_combined_datastat_model.root // Alternatively, you could open the file in a manner such as the following:
Changed:
<
<
myFileName = "BR5_MSSM_signal90_combined_datastat_model.root" TFile *myFile = TFile::Open(myFileName);
>
>
// myFileName = "BR5_MSSM_signal90_combined_datastat_model.root" // TFile *myFile = TFile::Open(myFileName);
  // Import the workspace. RooWorkspace* myWorkspace = (RooWorkspace*) _file0->Get("combined"); // Print the workspace contents.
Line: 442 to 515
 

links for RooFit

Changed:
<
<
user's manual

tutorials

>
>
 

RooStats

Line: 520 to 599
 

links for RooStats

Changed:
<
<
wiki

RooStats user's guide

tutorials

E-group for support with ATLAS-sensitive information: atlas-phys-stat-root@cern.ch

E-mail support for software issues, bugs etc.: roostats-development@cern.ch

>
>
 

ModelConfig

Revision 452013-12-17 - AndrewPickford

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
-- WilliamBreadenMadden - 2013-08-27
Line: 16 to 16
 

using ROOT on the Glasgow PPELX network

Changed:
<
<
There are instructions on how to use the different versions of ROOT at Glasgow here.
>
>
There are instructions on how to use the different versions of ROOT at Glasgow here.
  Execute the following commands in order to set up ROOT version 5.32.00 on PPELX:
Line: 850 to 850
 #!/bin/bash

# This script is designed for use with the naming conventions described in the following page:

Changed:
<
<
# https://ppes8.physics.gla.ac.uk/twiki/bin/view/ATLAS/HiggsAnalysisAtATLASUsingRooStats
>
>
# ATLAS/HiggsAnalysisAtATLASUsingRooStats
 # The #userSet hashtag is used to indicate places in the script where the user might modify # things.

Revision 442013-08-27 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- WilliamBreadenMadden - 2013-08-05
>
>
-- WilliamBreadenMadden - 2013-08-27
 
Line: 695 to 695
 

example file: $ROOTSYS/tutorials/histfactory/example_channel.xml
Changed:
<
<
<!DOCTYPE Channel  SYSTEM 'HistFactorySchema.dtd'>
>
>
%CODE{"html"}%
  InputFile="./data/example.root" HistoName="" > HistoName="data" HistoPath="" />
Line: 712 to 713
 
<-- HistoPathHigh="" HistoPathLow="histForSyst4"/>-->
Changed:
<
<
>
>
%ENDCODE%
 
caveats
Line: 721 to 721
  A slash must be added to the end of the HistoPath string attribute of the Channel, Data and Sample tags when referencing directories other than the root directory in ROOT files, in such a manner as follows:
Changed:
<
<


>
>
%CODE{"html"}%
 HistoName="myDataHistogram" HistoPath="myDirectory/" />
Changed:
<
<
>
>
%ENDCODE%
 
colon characters in Name attributes
Line: 775 to 772
  example file: ttH_m110_channel.xml
Changed:
<
<
<!DOCTYPE Channel SYSTEM 'HistFactorySchema.dtd'>
>
>
%CODE{"html"}%
 
<-- channel name and input file -->
InputFile="data/ttH_histograms.root" HistoName="">
Line: 820 to 817
 

Changed:
<
<
>
>
%ENDCODE%
  example file: ttH_m110_top-level.xml
Changed:
<
<
<!DOCTYPE Combination SYSTEM 'HistFactorySchema.dtd'>
>
>
%CODE{"html"}%
 
<-- workspace output file prefix -->
OutputFilePrefix="workspaces/ttH_m110_workspace" Mode="comb" >
Line: 839 to 836
 

Changed:
<
<
>
>
%ENDCODE%
 
create XML configuration files automatically
Line: 1066 to 1063
  %CODE{"c++"}% void AddPreprocessFunction(std::string name, std::string expression, std::string dependencies);
Changed:
<
<
>
>
%ENDCODE%
 
create a class representing a preprocess function and add it to a measurement directly using the constructor PreprocessFunction and a method of a measurement object
Changed:
<
<

>
>
%CODE{"c++"}%
 PreprocessFunction::PreprocessFunction(std::string Name, std::string Expression, std::string Dependents); void AddPreprocessFunction(const std::string& function); %ENDCODE%
Line: 1172 to 1170
  Make the main directory.
Changed:
<
<
>
>
%CODE{"bash"}%
 cd ~ mkdir test cd test
Changed:
<
<
>
>
%ENDCODE%
  Make the directory for the XML configuration files.
Changed:
<
<
>
>
%CODE{"bash"}%
 mkdir config
Changed:
<
<
>
>
%ENDCODE%
  Make the directory for the input histogram ROOT files.
Changed:
<
<
>
>
%CODE{"bash"}%
 mkdir data
Changed:
<
<
>
>
%ENDCODE%
  Make the directory for the workspace.
Changed:
<
<
>
>
%CODE{"bash"}%
 mkdir workspaces
Changed:
<
<
>
>
%ENDCODE%
 

Generate input histograms.

Change to the directory for the input histograms.

Changed:
<
<
>
>
%CODE{"bash"}%
 cd data
Changed:
<
<
>
>
%ENDCODE%
  Run some code such as the following. For fun, we will create a simulated data signal at about 126 GeV with about three times the expected number of events.

example file: make_test_histograms.c
Changed:
<
<

>
>
%CODE{"c++"}%
 { // Create the expected signal histogram. // Create the function used to describe the signal shape (a simple Gaussian shape).
Line: 1237 to 1234
  myData->Write(); myFile.Close(); }
Changed:
<
<
>
>
%ENDCODE%
 

Create the workspace.

Line: 1247 to 1244
  Change to the directory for the configuration XML files.
Changed:
<
<
>
>
%CODE{"bash"}%
 cd ~/test/config
Changed:
<
<
>
>
%ENDCODE%
  Copy the HistFactory XML schema to the configuration directory.
Changed:
<
<
>
>
%CODE{"bash"}%
 cp $ROOTSYS/etc/HistFactorySchema.dtd .
Changed:
<
<
>
>
%ENDCODE%
  Create XML configuration files in the configuration directory.

example file: test_top-level.xml
Changed:
<
<

>
>
%CODE{"html"}%
 

<-- workspace output file prefix -->
Line: 1277 to 1273
 

Changed:
<
<
>
>
%ENDCODE%
 
example file: test_channel.xml
Changed:
<
<

>
>
%CODE{"html"}%
 

<-- channel name and input file -->
Line: 1304 to 1299
 

Changed:
<
<
>
>
%ENDCODE%
  As you can see, the model is very simple. There is the expected signal, the expected background and the actual data (in this case, the data is simulated). The point of interest relates to the signal and the expected data will be compared to the actual data.
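The structure of such a simple model can be sketched in plain C++. The sketch assumes the usual form in which the expected count in each bin is a signal strength multiplied by the expected signal plus the expected background, and compares it with the data through a binned Poisson likelihood. The names `expectedCounts` and `binnedNll` and the variable `mu` are hypothetical and are not HistFactory functions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Expected events per bin for signal strength mu: mu * signal + background.
std::vector<double> expectedCounts(const std::vector<double>& signal,
                                   const std::vector<double>& background,
                                   double mu) {
    assert(signal.size() == background.size());
    std::vector<double> expected(signal.size());
    for (std::size_t i = 0; i < signal.size(); ++i) {
        expected[i] = mu * signal[i] + background[i];
    }
    return expected;
}

// Binned Poisson -log(L), dropping the log(n!) terms that do not depend on mu.
double binnedNll(const std::vector<double>& data, const std::vector<double>& expected) {
    assert(data.size() == expected.size());
    double nll = 0.0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        assert(expected[i] > 0.0);
        nll += expected[i] - data[i] * std::log(expected[i]);
    }
    return nll;
}
```

If the data contain about three times the expected signal on top of the background, the likelihood is smallest near a signal strength of 3, which is the behaviour the simulated example here is constructed to show.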

Run hist2workspace.

Changed:
<
<
>
>
%CODE{"bash"}%
 hist2workspace config/test_top-level.xml
Changed:
<
<
>
>
%ENDCODE%
  There should now be four files in the workspaces directory:
Changed:
<
<
>
>
%CODE{"bash"}%
 test_workspace_combined_datastat_model.root test_workspace_datastat.root test_workspace_results.table test_workspace_test_datastat_model.root
Changed:
<
<
>
>
%ENDCODE%
  The test_workspace_combined_datastat_model.root file is the ROOT file that contains the combined workspace object (it has constraints, weightings etc. properly incorporated). This is the file you are almost certainly interested in. The test_workspace_test_datastat_model.root file contains a workspace object without proper constraints, weightings etc. I don't know what the other two files are.
Line: 1331 to 1326
 
example file: ProfileLikeliHoodCalculator_confidence_level.cpp
Changed:
<
<

>
>
%CODE{"c++"}%
 #include #include #include
Line: 1408 to 1402
  myConfidenceInterval->UpperLimit(*myPOI) <<"]\n"<<endl; return 0; }
Changed:
<
<
>
>
%ENDCODE%
  Compile this code using a Makefile such as the following:

example file: Makefile
Changed:
<
<

>
>
%CODE{"bash"}%
ProfileLikeliHoodCalculator_confidence_level : ProfileLikeliHoodCalculator_confidence_level.cpp
	g++ -g -O2 -fPIC -Wno-deprecated -o ProfileLikeliHoodCalculator_confidence_level ProfileLikeliHoodCalculator_confidence_level.cpp `root-config --cflags --libs --ldflags` -lHistFactory -lXMLParser -lRooStats -lRooFit -lRooFitCore -lThread -lMinuit -lFoam -lHtml -lMathMore -I$ROOTSYS/include -L$ROOTSYS/lib
Changed:
<
<
>
>
%ENDCODE%
  Compile the code.
Changed:
<
<
>
>
%CODE{"bash"}%
 make
Changed:
<
<
>
>
%ENDCODE%
 

results

The end result as displayed in the terminal output is the following:

Changed:
<
<
>
>
%CODE{"bash"}%
 95% confidence interval on the point of interest SigXsecOverSM: [2.99653, 3.00347]
Changed:
<
<
>
>
%ENDCODE%
 

further information

Revision 432013-08-07 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
-- WilliamBreadenMadden - 2013-08-05
Line: 20 to 20
  Execute the following commands in order to set up ROOT version 5.32.00 on PPELX:
Changed:
<
<
>
>
%CODE{"bash"}%
 export ROOTSYS=/data/ppe01/sl5x/x86_64/root/5.32.00 export PATH=$ROOTSYS/bin:$PATH export LD_LIBRARY_PATH=$ROOTSYS/lib/root:$LD_LIBRARY_PATH
Changed:
<
<
>
>
%ENDCODE%
 

using ROOT on the CERN LXPLUS network

Execute the following commands in order to set up ROOT version 5.32.00 on LXPLUS:

Changed:
<
<
>
>
%CODE{"bash"}%
 . /afs/cern.ch/sw/lcg/external/gcc/4.3.2/x86_64-slc5/setup.sh cd /afs/cern.ch/sw/lcg/app/releases/ROOT/5.32.00/x86_64-slc5-gcc43-opt/root/ . bin/thisroot.sh
Changed:
<
<
>
>
%ENDCODE%
 

setting up RooStats

Line: 49 to 49
 Follow the appropriate instructions here to build the ROOT trunk.

shell script: building ROOT with RooFit and RooStats

Changed:
<
<

>
>
%CODE{"bash"}%
 #!/bin/bash

# This script builds the latest version of ROOT with RooFit and RooStats in Ubuntu.

Line: 105 to 106
  export ROOTSYS=/usr/local/root export PATH=$ROOTSYS/bin:$PATH export LD_LIBRARY_PATH=$ROOTSYS/lib/root:$LD_LIBRARY_PATH
Changed:
<
<

>
>
%ENDCODE%
 

option 3: Build the RooStats branch.

Line: 136 to 137
 Composite functions correspond to composite objects. The RooArgList class is ordered (it depends on argument order) while the RooArgSet class is not.

example code: defining a RooFit variable

Changed:
<
<

general form for defining a RooFit variable:

>
>
%CODE{"c++"}% // general form for defining a RooFit variable:
  RooRealVar x(<object name>, <object title>, <value>, <minimum value>, <maximum value>)
Changed:
<
<
specific example for defining a RooFit variable x with the value 5:
>
>
// specific example for defining a RooFit variable x with the value 5:
  RooRealVar x("x", "x observable", 5, -10, 10)
Changed:
<
<

>
>
%ENDCODE%
 

RooPlot

Line: 158 to 160
 Bayes' theorem holds that the probability of b given a is related to the probability of a given b. Normally, one might say that the mean of a Gaussian is 1 and that then gives a distribution for x. However, if one had a dataset for x, one might want to know the posterior for the mean. The RooFit ability to switch what the probability density function is of is very useful for Bayesian stuff. Because RooFit allows this composition, the thing that acts as x in the Gaussian could actually be a function of, say, x^2. A Jacobian factor is picked up in going from x to x^2, so the normalisation no longer makes sense. Sometimes RooFit can figure out what the Jacobian factor should be and sometimes it resorts to numerical integration.

example code: create a Gaussian PDF using RooStats and plot it using the RooPlot class

Changed:
<
<

>
>
%CODE{"c++"}%
  { // Build a Gaussian PDF. RooRealVar x("x", "x", -10, 10);
Line: 171 to 174
  gauss.plotOn(xframe); xframe->Draw(); }
Changed:
<
<

>
>
%ENDCODE%
 

example code: telling a RooFit PDF what to normalise over

Changed:
<
<

not normalised (i.e., this is not a PDF):

>
>
%CODE{"c++"}% // not normalised (i.e., this is not a PDF):
  gauss.getVal();
Changed:
<
<
Hey, RooFit! This is the thing to normalise over (i.e., guarantees Int[xmin, xmax] Gauss(x, m, s)dx == 1):
>
>
// Hey, RooFit! This is the thing to normalise over (i.e., guarantees Int[xmin, xmax] Gauss(x, m, s)dx == 1):
  gauss.getVal(x);
Changed:
<
<
What is the value if sigma is considered the observable? (i.e., guarantees Int[smin, smax] Gauss(x, m, s)ds == 1):

>
>
// What is the value if sigma is considered the observable? (i.e., guarantees Int[smin, smax] Gauss(x, m, s)ds == 1): %ENDCODE%
 

datasets

Line: 193 to 197
 

RooDataSet (unbinned data)

example code: generating toy Monte Carlo, storing it as unbinned data and then plotting it
Changed:
<
<

>
>
%CODE{"c++"}%
  { // Create a RooDataSet and fill it with generated toy Monte Carlo data: RooDataSet* myData = gauss.generate(x, 100);
Line: 202 to 207
  myData->plotOn(myFrame); myFrame->Draw(); }
Changed:
<
<

>
>
%ENDCODE%
  Plotting unbinned data is similar to plotting binned data with the exception that one can display it in some preferred binning.

example code: plotting unbinned data (a RooDataSet) using a specified binning
Changed:
<
<

>
>
%CODE{"c++"}%
  RooPlot* myFrame = x.frame(); myData->plotOn(myFrame, Binning(25)); myFrame->Draw();
Changed:
<
<

>
>
%ENDCODE%
 
importing data from ROOT trees (how to populate RooDataSets from TTrees)
Line: 226 to 232
 In displaying the data, RooFit, by default, shows the 68% confidence interval for Poisson statistics.

example code: import a ROOT histogram into a RooDataHist (a RooFit binned dataset)
Changed:
<
<

>
>
%CODE{"c++"}%
  { // Access the file. TFile* myFile = new TFile("myFile.root");
Line: 246 to 253
  myData.plotOn(myFrame); myFrame->Draw(); }
Changed:
<
<

>
>
%ENDCODE%
 

fitting

Line: 257 to 264
 

fitting a PDF to unbinned data

example code: fit a Gaussian PDF to data
Changed:
<
<

>
>
%CODE{"c++"}%
  // Fit gauss to unbinned data gauss.fitTo(*myData);
Changed:
<
<

>
>
%ENDCODE%
 

The RooFit workspace

Line: 271 to 279
 Consider a Gaussian PDF. One might create this Gaussian PDF and then import it into a workspace. All of the dependencies of the Gaussian would be drawn in and owned by the workspace (there are no nightmarish ownership problems). Alternatively, one might create simply the Gaussian inside the workspace using the "Workspace Factory".

example code: using the Workspace Factory to create a Gaussian PDF

Changed:
<
<

>
>
%CODE{"c++"}%
  // Create a Gaussian PDF using the Workspace Factory (this is essentially the shorthand for creating a Gaussian). RooWorkspace* myWorkspace = new RooWorkspace("myWorkspace"); myWorkspace->factory("Gaussian::g(x[-5, 5], mu[0], sigma[1])");
Changed:
<
<

>
>
%ENDCODE%
 

What's in the RooFit workspace?

example code: What's in the workspace?

Changed:
<
<

>
>
%CODE{"c++"}%
  // Open the appropriate ROOT file. root -l myFile.root // Import the workspace.
Line: 309 to 319
  RooAbsData* myData = myWorkspace->data("d"); // Import the ModelConfig saved as m. ModelConfig* myModelConfig = (ModelConfig*) myWorkspace->obj("m");
Changed:
<
<

>
>
%ENDCODE%
 

visual representations of the model/PDF contents

Line: 318 to 328
 Graphviz consists of a graph description language called the DOT language and a set of tools that can generate and/or process DOT files.

example code: examining PDFs and creating graphical representations of them
Changed:
<
<

>
>
%CODE{"c++"}%
  // Create variables and a PDF using those variables. RooRealVar mu("mu", "mu", 150); RooRealVar sigma("sigma", "sigma", 5, 0, 20);
Line: 352 to 363
  // 0x15f7fe0/V- RooRealVar::x = 150 // 0x1487090/V- RooRealVar::mu = 150 // 0x1487bc0/V- RooRealVar::sigma = 5
Changed:
<
<

>
>
%ENDCODE%
 
Model Inspector

The Model Inspector is a GUI for examining the model contained in the RooFit workspace. The function and its parameters are as follows:

Changed:
<
<


>
>
%CODE{"c++"}%
 void ModelInspector(const char* infile = "", const char* workspaceName = "combined", const char* modelConfigName = "ModelConfig", const char* dataName = "obsData")
Changed:
<
<

>
>
%ENDCODE%
  If the workspace(s) were made using hist2workspace, the names have a standard form (as shown above).

using the Model Inspector
Changed:
<
<
>
>
%CODE{"c++"}%
  // Load the macro (at the ROOT prompt). .L ModelInspector.C++ // Run the macro on the appropriate ROOT workspace file. ModelInspector("results/my_combined_example_model.root")
Changed:
<
<
>
>
%ENDCODE%
  The Model Inspector GUI should appear. The GUI consists of a number of plots, corresponding to the various channels in the model, and a few sliders, corresponding to the parameters of interest and the nuisance parameters in the model. The initial plots are based on the values of the parameters in the workspace. There is a little "Fit" button which fits the model to the data points (while also printing the standard terminal output detailing the fitting). After fitting, a yellow band is shown around the best fit model indicating the uncertainty from propagating the uncertainty of the fit through the model. On the plots, there is a red line shown (corresponding to the fit for each of the parameters at their nominal value pushed up by 1 sigma).
Line: 385 to 397
 

accessing the RooFit workspace

example code: accessing the workspace
Changed:
<
<

>
>
%CODE{"c++"}%
  // Open the appropriate ROOT file. root -l BR5_MSSM_signal90_combined_datastat_model.root // Alternatively, you could open the file in a manner such as the following:
Line: 405 to 418
  myPDF.plotOn(myFrame); // Draw the RooPlot. myFrame->Draw();
Changed:
<
<

>
>
%ENDCODE%
 
example code: accessing both data and PDF from a workspace stored in a file
Changed:
<
<

>
>
%CODE{"c++"}%
  // Note that the following code is independent of the actual PDF in the file. So, for example, a full Higgs combination could work with identical code.

// Open a file and import the workspace.

Line: 424 to 438
  RooPlot* myFrame = w->var("m")->frame(-1,1) ; pll.plotOn(myFrame) ; myFrame->Draw()
Changed:
<
<

>
>
%ENDCODE%
 

links for RooFit

Line: 441 to 455
 The main goal is to standardise the interface for major statistical procedures so that they can work on an arbitrary RooFit model and dataset and handle any parameters of interest and nuisance parameters. Another goal is to implement most accepted techniques from frequentist, Bayesian and likelihood based approaches. A further goal is to provide utilities to perform combined measurements.

example code: create a simple model using the RooFit Workspace Factory. Specify parts of the model using ModelConfig. Create a simple dataset. Complete a confidence interval test using the ProfileLikelihoodCalculator of RooStats

Changed:
<
<

>
>
%CODE{"c++"}%
  { // In this script, a simple model is created using the Workspace Factory in RooFit. // ModelConfig is used to specify the parts of the model necessary for the statistical tools of RooStats.
Line: 501 to 516
  cout << "No" << endl; } }
Changed:
<
<

>
>
%ENDCODE%
 

links for RooStats

Line: 540 to 554
  The hist2workspace executable is run using the top-level XML configuration file as an argument in the following manner:
Changed:
<
<
>
>
%CODE{"bash"}%
  hist2workspace top_level.xml
Changed:
<
<
>
>
%ENDCODE%
  hist2workspace builds the model and saves input histograms in the output ROOT file. The measurement's configuration class is made to persist as well. This measurement class has a member function that can write XML configuration files which point to the histograms saved in the (output) ROOT file (e.g.: GaussExample->WriteToXML).
Line: 612 to 626
 There are several parts in the top-level XML file. Initially, the XML schema is specified. The output is specified next. Specifically, the prefix for any output files (ROOT files containing workspaces) is given. Next, the channel XML files (containing, e.g., signal, background and systematics information) are given using the Input tag. Then, the details of each measurement are defined using the Measurement tag. Which bins to use are specified inclusively using the Measurement tag attributes BinLow and BinHigh. Within the Measurement tag, the POI tag is used to specify the point of interest, i.e. the parameter of the model on which inference is to be made (not the test statistic).

example file: $ROOTSYS/tutorials/histfactory/example.xml
Changed:
<
<
<!DOCTYPE Combination  SYSTEM 'HistFactorySchema.dtd'>
>
>
%CODE{"html"}%
  OutputFilePrefix="./results/example" Mode="comb" >
Line: 643 to 658
 

Changed:
<
<
>
>
%ENDCODE%
 
channel XML files
Line: 835 to 849
  example file: createXMLFiles.sh
Changed:
<
<

>
>
%CODE{"bash"}%
 #!/bin/bash

# This script is designed for use with the naming conventions described in the following page:

Line: 874 to 887
  createXMLFiles $originalMass 125 #userSet createXMLFiles $originalMass 130 #userSet createXMLFiles $originalMass 140 #userSet
Changed:
<
<
>
>
%ENDCODE%
 

C++ approach to HistFactory

Line: 906 to 919
 
example (early in project development): creation of the Measurement and a Channel and, thence, creation of the channel samples, including signal and backgrounds
Changed:
<
<

>
>
%CODE{"c++"}%
 // Create the Measurement and a channel std::string myInputFile = "./data/myData.root"; std::string myChannel1Path = "";
Line: 946 to 958
  myChannel.samples.push_back(myBackground2); // Add the samples to the measurement. myMeasurement.channels.push_back(myChannel);
Changed:
<
<
>
>
%ENDCODE%
 

example HistFactory model construction using C++

A HistFactory model is to be created. It shall consist of a single channel that shall have only signal and background. Uncertainties on the luminosity, statistical uncertainties from Monte Carlo and separate systematics on signal and background shall be included.

Changed:
<
<

>
>
%CODE{"c++"}%
 #include "RooStats/HistFactory/Measurement.h"
 void MakeSimpleModel() {
    // This function creates a simple model with one channel.
Line: 1029 to 1040
  // Run the measurement (this is equivalent to an execution of
  // the program hist2workspace).
  MakeModelAndMeasurementFast(measurement_1);
Changed:
<
<
>
>
%ENDCODE%
 

details on HistFactory usage in C++

Line: 1053 to 1064
 
add a preprocessed function by giving the function a name, a functional expression and a string with a bracketed list of dependencies (e.g. "SigXsecOverSM[0,3]")
Changed:
<
<

>
>
%CODE{"c++"}%
 void AddPreprocessFunction(std::string name, std::string expression, std::string dependencies);
create a class representing a preprocess function and add it to a measurement directly using the constructor PreprocessFunction and a method of a measurement object
Line: 1062 to 1072
 
PreprocessFunction::PreprocessFunction(std::string Name, std::string Expression, std::string Dependents);
void AddPreprocessFunction(const std::string& function);
Changed:
<
<
>
>
%ENDCODE%
 
details on object channel
Line: 1091 to 1101
  A sample object can have many different types of systematic uncertainties defined.
Changed:
<
<

>
>
%CODE{"c++"}%
 void AddOverallSys(std::string Name, Double_t Low, Double_t High);
 void AddNormFactor(std::string Name, Double_t Val, Double_t Low, Double_t High, bool Const=false);
 void AddHistoSys(std::string Name, std::string HistoNameLow, std::string HistoFileLow, std::string HistoPathLow, std::string HistoNameHigh, std::string HistoFileHigh, std::string HistoPathHigh);
 void AddHistoFactor(std::string Name, std::string HistoNameLow, std::string HistoFileLow, std::string HistoPathLow, std::string HistoNameHigh, std::string HistoFileHigh, std::string HistoPathHigh);
 void AddShapeFactor(std::string Name);
 void AddShapeSys(std::string Name, Constraint::Type ConstraintType, std::string HistoName, std::string HistoFile, std::string HistoPath="");
Changed:
<
<
>
>
%ENDCODE%
  A sample can be included in a channel's bin-by-bin statistical uncertainty fluctuations by "activating" the sample. There are two ways to do this. The first way is to use the default errors that are stored in the histogram's uncertainty array. The second way is to supply the errors using an external histogram (in the case where the desired errors differ from those stored by the TH1 histogram). These can be achieved using the following methods:
Changed:
<
<

>
>
%CODE{"c++"}%
 void ActivateStatError();
 void ActivateStatError(std::string HistoName, std::string InputFile, std::string HistoPath="");
Changed:
<
<
>
>
%ENDCODE%
 
combining

Once each channel has been created, filled with data and samples and added to the overall measurement, the analysis can begin. The first step is to generate the RooFit model from the measurement object. This model is stored in a RooFit workspace object. There are two ways to do this. The first way is used by the program hist2workspace. It builds the workspace, fits it, creates output plots and saves the workspace and plots to files.

Changed:
<
<

>
>
%CODE{"c++"}%
 RooWorkspace* MakeModelAndMeasurementFast(RooStats::HistFactory::Measurement& measurement);
Changed:
<
<
>
>
%ENDCODE%
  The second way is to build the workspace in memory and return a pointer to the workspace object.
Changed:
<
<

>
>
%CODE{"c++"}%
 static RooWorkspace* MakeCombinedModel(Measurement& measurement);
Changed:
<
<
>
>
%ENDCODE%
 A workspace can be created for only a single channel of a model:
Changed:
<
<

>
>
%CODE{"c++"}%
 RooWorkspace* MakeSingleChannelModel(Measurement& measurement, Channel& channel);
Changed:
<
<
>
>
%ENDCODE%
  A function can be used to fit a model.
Changed:
<
<

>
>
%CODE{"c++"}%
 void FitModel(RooWorkspace *, std::string data_name="obsData");
Changed:
<
<
>
>
%ENDCODE%
  The model-building methods above return a pointer to the newly created workspace object (FitModel, which returns void, is the exception). The workspace can be analysed directly, for example using RooStats scripts, or it can be saved to an output file for later analysis or publication.

Revision 42 2013-08-05 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- WilliamBreadenMadden - 2012-05-14
>
>
-- WilliamBreadenMadden - 2013-08-05
 
Line: 14 to 14
  RooStats is a project to create statistical tools built on top of the RooFit library, which is a data modelling toolkit. It is distributed in ROOT. Specifically, it has been distributed in the ROOT release since version 5.22 (December 2008). The latest version of ROOT is recommended as the RooStats project develops quickly.
Changed:
<
<

Using ROOT on the Glasgow PPELX network

>
>

using ROOT on the Glasgow PPELX network

  There are instructions on how to use the different versions of ROOT at Glasgow here.
Line: 26 to 26
 export LD_LIBRARY_PATH=$ROOTSYS/lib/root:$LD_LIBRARY_PATH
Changed:
<
<

Using ROOT on the CERN LXPLUS network

>
>

using ROOT on the CERN LXPLUS network

  Execute the following commands in order to set up ROOT version 5.32.00 on LXPLUS:
Line: 36 to 36
 . bin/thisroot.sh
Changed:
<
<

Setting up RooStats

>
>

setting up RooStats

  There are three main options available for acquiring ROOT with RooStats included.
Changed:
<
<

Option 1: Download the latest ROOT release binaries.

>
>

option 1: Download the latest ROOT release binaries.

  The latest ROOT binaries for various operating systems are accessible here.
Changed:
<
<

Option 2: Build the ROOT trunk from source.

>
>

option 2: Build the ROOT trunk from source.

  Follow the appropriate instructions here to build the ROOT trunk.
Changed:
<
<

Shell script: building ROOT with RooFit and RooStats

>
>

shell script: building ROOT with RooFit and RooStats

 

#!/bin/bash # This script builds the latest version of ROOT with RooFit and RooStats in Ubuntu.

Changed:
<
<
# First, the ROOT prerequisites are installed, # then, the most common ROOT optional packages are installed. # Next, the latest version of ROOT in the CERN Subversion repository is checked out.
>
>
# First, the ROOT prerequisites are installed, then, the most common ROOT optional packages are # installed. Next, the latest version of ROOT in the CERN Subversion repository is checked out.
 # Finally, ROOT is compiled.

# Install ROOT prerequisites.

Line: 108 to 107
  export LD_LIBRARY_PATH=$ROOTSYS/lib/root:$LD_LIBRARY_PATH

Changed:
<
<

Option 3: Build the RooStats branch.

>
>

option 3: Build the RooStats branch.

  Do this if you want the latest development of RooStats (that has not yet been incorporated into a ROOT version).
Line: 116 to 115
 

RooFit

Changed:
<
<

General description

>
>

general description

  The RooFit library provides a toolkit for modelling the expected distribution of events in a physics analysis. Models can be used to perform unbinned maximum likelihood fits, produce plots and generate "toy Monte Carlo" samples for various studies.
Line: 126 to 125
  Here are a few examples of mathematical concepts that correspond to various RooFit classes:
Changed:
<
<
Mathematical concept RooFit class
>
>
mathematical concept RooFit class
 
variable RooRealVar
function RooAbsReal
PDF RooAbsPdf
Line: 136 to 135
  Composite functions correspond to composite objects. The RooArgList class is dependent on argument order while the RooArgSet class is not.
Changed:
<
<

Example code: defining a RooFit variable

>
>

example code: defining a RooFit variable

 

Changed:
<
<
General form for defining a RooFit variable:
>
>
general form for defining a RooFit variable:
  RooRealVar x(<object name>, <object title>, <value>, <minimum value>, <maximum value>)
Changed:
<
<
Specific example for defining a RooFit variable x with the value 5:
>
>
specific example for defining a RooFit variable x with the value 5:
  RooRealVar x("x", "x observable", 5, -10, 10)

Line: 158 to 157
  Bayes' theorem holds that the probability of b given a is related to the probability of a given b. Normally, one might say that the mean of a Gaussian is 1 and that this then gives a distribution for x. However, if one had a dataset for x, one might want to know the posterior for the mean. The ability of RooFit to switch which variable a probability density function is normalised over is very useful for Bayesian analysis. Because RooFit allows this composition, the thing that acts as x in the Gaussian could actually be a function of x, say, x^2. A Jacobian factor is picked up in going from x to x^2, so the original normalisation no longer holds. Sometimes RooFit can figure out what the Jacobian factor should be and sometimes it resorts to numerical integration.
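The idea of normalising a shape over a chosen variable can be illustrated without ROOT. The following is a minimal sketch in plain C++, not the RooFit implementation: the function names (gaussShape, integrateOverX, gaussNormalisedOverX) are hypothetical, and the trapezoidal rule stands in for whatever numerical integrator RooFit actually uses.

```cpp
#include <cassert>
#include <cmath>

// Unnormalised Gaussian shape in x for a given mean and sigma.
double gaussShape(double x, double mean, double sigma) {
    const double z = (x - mean) / sigma;
    return std::exp(-0.5 * z * z);
}

// Integral of the shape over x in [xmin, xmax] using the trapezoidal rule.
double integrateOverX(double mean, double sigma, double xmin, double xmax,
                      int steps = 10000) {
    assert(sigma > 0.0 && xmax > xmin && steps > 0);
    const double h = (xmax - xmin) / steps;
    double sum = 0.5 * (gaussShape(xmin, mean, sigma) + gaussShape(xmax, mean, sigma));
    for (int i = 1; i < steps; ++i) {
        sum += gaussShape(xmin + i * h, mean, sigma);
    }
    return sum * h;
}

// Value of the Gaussian normalised over x, so that its integral
// over [xmin, xmax] is 1.
double gaussNormalisedOverX(double x, double mean, double sigma,
                            double xmin, double xmax) {
    return gaussShape(x, mean, sigma) / integrateOverX(mean, sigma, xmin, xmax);
}
```

Narrowing the range of x raises the normalised value at every point, because the same shape must integrate to 1 over a smaller interval. Normalising over sigma instead would mean integrating the same shape over a range of sigma values while holding x fixed.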
Changed:
<
<

Example code: create a Gaussian PDF using RooStats and plot it using the RooPlot class

>
>

example code: create a Gaussian PDF using RooStats and plot it using the RooPlot class

 

{ // Build a Gaussian PDF.

Line: 174 to 173
  }

Changed:
<
<

Example code: telling a RooFit PDF what to normalise over

>
>

example code: telling a RooFit PDF what to normalise over

 

Changed:
<
<
Not normalised (i.e., this is not a PDF):
>
>
not normalised (i.e., this is not a PDF):
  gauss.getVal();

Hey, RooFit! This is the thing to normalise over (i.e., guarantees Int[xmin, xmax] Gauss(x, m, s)dx == 1):

  gauss.getVal(x);

What is the value if sigma is considered the observable? (i.e., guarantees Int[smin, smax] Gauss(x, m, s)ds == 1):

Changed:
<
<

Datasets

>
>

datasets

 
Changed:
<
<

General description

>
>

general description

  A dataset is a collection of points in N-dimensional space. In RooFit, data can be stored in an unbinned or binned manner. Unbinned data is stored using the RooDataSet class while binned data is stored using the RooDataHist class. The user can define how many bins there are in a variable. For the purposes of plotting using the RooPlot class, a RooDataSet is binned into a histogram.
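To make the distinction between unbinned and binned storage concrete, here is a small sketch in plain C++ (not RooFit code) that bins a list of unbinned values into a fixed number of equal-width bins, as happens conceptually when unbinned data is plotted. The function name binValues and the choice to drop out-of-range values are assumptions made for this illustration.

```cpp
#include <cassert>
#include <vector>

// Count how many values fall into each of nBins equal-width bins on [low, high).
// Values outside the range are ignored in this sketch.
std::vector<int> binValues(const std::vector<double>& values,
                           double low, double high, int nBins) {
    assert(high > low && nBins > 0);
    std::vector<int> counts(nBins, 0);
    const double width = (high - low) / nBins;
    for (double v : values) {
        if (v < low || v >= high) continue;
        int bin = static_cast<int>((v - low) / width);
        if (bin >= nBins) bin = nBins - 1;  // guard against rounding at the upper edge
        ++counts[bin];
    }
    return counts;
}
```

The unbinned list keeps every point exactly, while the binned counts keep only the number of points per bin; the binning is therefore a choice made at plotting time rather than a property of the unbinned data.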
Line: 193 to 192
 

RooDataSet (unbinned data)

Changed:
<
<
Example code: generating toy Monte Carlo, storing it as unbinned data and then plotting it
>
>
example code: generating toy Monte Carlo, storing it as unbinned data and then plotting it
 

} // Create a RooDataSet and fill it with generated toy Monte Carlo data:

Line: 207 to 206
  Plotting unbinned data is similar to plotting binned data with the exception that one can display it in some preferred binning.
Changed:
<
<
Example code: plotting unbinned data (a RooDataSet) using a specified binning
>
>
example code: plotting unbinned data (a RooDataSet) using a specified binning
 

RooPlot* myFrame = x.frame() ; myData.plotOn(myFrame, Binning(25)) ; myFrame->Draw()

Changed:
<
<
Importing data from ROOT trees (how to populate RooDataSets from TTrees)
>
>
importing data from ROOT trees (how to populate RooDataSets from TTrees)
  * to be completed *
Line: 226 to 225
  In displaying the data, RooFit, by default, shows the 68% confidence interval for Poisson statistics.
Changed:
<
<
Example code: import a ROOT histogram into a RooDataHist (a RooFit binned dataset)
>
>
example code: import a ROOT histogram into a RooDataHist (a RooFit binned dataset)
 

{ // Access the file.

Line: 249 to 248
  }

Changed:
<
<

Fitting

>
>

fitting

 
Changed:
<
<

Fitting a model to data

>
>

fitting a model to data

  Fitting a model to data can be done in many ways. The most common methods are the χ2 fit and the -log(L) fit. The default fitting method in ROOT is the χ2 method, while the default method in RooFit is the -log(L) method. The -log(L) method is often preferred because it is more robust for low statistics fits and because it can also be performed on unbinned data.
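The two fit statistics can be compared on binned counts with a short sketch in plain C++. This is an illustration only, not the ROOT or RooFit implementation: the function names are hypothetical, the χ2 shown is the Pearson form (expected counts in the denominator), and the -log(L) shown is the binned Poisson likelihood.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Pearson chi-square: sum over bins of (observed - expected)^2 / expected.
double pearsonChiSquare(const std::vector<double>& observed,
                        const std::vector<double>& expected) {
    assert(observed.size() == expected.size());
    double chi2 = 0.0;
    for (std::size_t i = 0; i < observed.size(); ++i) {
        assert(expected[i] > 0.0);
        const double d = observed[i] - expected[i];
        chi2 += d * d / expected[i];
    }
    return chi2;
}

// Binned Poisson negative log-likelihood:
// sum over bins of expected - observed * log(expected) + log(observed!).
double poissonNll(const std::vector<double>& observed,
                  const std::vector<double>& expected) {
    assert(observed.size() == expected.size());
    double nll = 0.0;
    for (std::size_t i = 0; i < observed.size(); ++i) {
        assert(expected[i] > 0.0 && observed[i] >= 0.0);
        nll += expected[i] - observed[i] * std::log(expected[i])
               + std::lgamma(observed[i] + 1.0);
    }
    return nll;
}
```

The Poisson form remains well defined for bins with zero or very few observed events, which is one way to see why the -log(L) method is preferred for low statistics fits.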
Changed:
<
<

Fitting a PDF to unbinned data

>
>

fitting a PDF to unbinned data

 
Changed:
<
<
Example code: fit a Gaussian PDF to data
>
>
example code: fit a Gaussian PDF to data
 

// Fit gauss to unbinned data gauss.fitTo(*myData);

Line: 265 to 264
 

The RooFit workspace

Changed:
<
<

General description

>
>

general description

  The RooFit "workspace" provides the ability to store the full likelihood model, any desired priors and the minimal data necessary to reproduce the likelihood function in a ROOT file. Thus, the workspace is needed for combinations and has potential for digitally publishing results (PhyStats agreed to publish likelihood functions). The RooFit workspace can be used for this.

Consider a Gaussian PDF. One might create this Gaussian PDF and then import it into a workspace. All of the dependencies of the Gaussian would be drawn in and owned by the workspace (there are no nightmarish ownership problems). Alternatively, one might create simply the Gaussian inside the workspace using the "Workspace Factory".

Changed:
<
<

Example code: using the Workspace Factory to create a Gaussian PDF

>
>

example code: using the Workspace Factory to create a Gaussian PDF

 

// Create a Gaussian PDF using the Workspace Factory (this is essentially the shorthand for creating a Gaussian). RooWorkspace* myWorkspace = new RooWorkspace("myWorkspace");

Line: 280 to 279
 

What's in the RooFit workspace?

Changed:
<
<

Example code: What's in the workspace?

>
>

example code: What's in the workspace?

 

// Open the appropriate ROOT file. root -l myFile.root

Line: 312 to 311
  ModelConfig* myModelConfig = (ModelConfig*) myWorkspace.obj("m");

Changed:
<
<

Visual representations of the model/PDF contents

>
>

visual representations of the model/PDF contents

 
Graphviz

Graphviz consists of a graph description language called the DOT language and a set of tools that can generate and/or process DOT files.

Changed:
<
<
Example code: examining PDFs and creating graphical representations of them
>
>
example code: examining PDFs and creating graphical representations of them
 

// Create variables and a PDF using those variables. RooRealVar mu("mu", "mu", 150);

Line: 364 to 363
  If the workspace(s) were made using hist2workspace, the names have a standard form (as shown above).
Changed:
<
<
Using the Model Inspector
>
>
using the Model Inspector
 
   // Load the macro.
Line: 379 to 378
  You can get the ModelInspector.C from the Statistics Forum RooStats tools here.
Changed:
<
<
Links
>
>
links
  Video illustrating the usage of the Model Inspector
Changed:
<
<

Accessing the RooFit workspace

>
>

accessing the RooFit workspace

 
Changed:
<
<
Example code: accessing the workspace
>
>
example code: accessing the workspace
 

// Open the appropriate ROOT file. root -l BR5_MSSM_signal90_combined_datastat_model.root

Line: 408 to 407
  myFrame->Draw();

Changed:
<
<
Example code: accessing both data and PDF from a workspace stored in a file
>
>
example code: accessing both data and PDF from a workspace stored in a file
 

// Note that the following code is independent of actual PDF in the file. So, for example, a full Higgs combination could work with identical code.

Line: 427 to 426
  myFrame->Draw()

Changed:
<
<

Links for RooFit

>
>

links for RooFit

 
Changed:
<
<
User's Manual
>
>
user's manual
 
Changed:
<
<
Tutorials
>
>
tutorials
 

RooStats

Changed:
<
<

General description

>
>

general description

  RooStats provides tools for high-level statistics questions in ROOT. It builds on RooFit, which provides basic building blocks for statistical questions.

The main goal is to standardise the interface for major statistical procedures so that they can work on an arbitrary RooFit model and dataset and handle any parameters of interest and nuisance parameters. Another goal is to implement most accepted techniques from frequentist, Bayesian and likelihood based approaches. A further goal is to provide utilities to perform combined measurements.

Changed:
<
<

Example code: create a simple model using the RooFit Workspace Factory. Specify parts of the model using ModelConfig. Create a simple dataset. Complete a confidence interval test using the ProfileLikelihoodCalculator of RooStats

>
>

example code: create a simple model using the RooFit Workspace Factory. Specify parts of the model using ModelConfig. Create a simple dataset. Complete a confidence interval test using the ProfileLikelihoodCalculator of RooStats

 

{ // In this script, a simple model is created using the Workspace Factory in RooFit.

Line: 505 to 504
 

Changed:
<
<

Links for RooStats

>
>

links for RooStats

 
Changed:
<
<
Wiki
>
>
wiki
 
Changed:
<
<
RooStats User's Guide
>
>
RooStats user's guide
 
Changed:
<
<
Tutorials
>
>
tutorials
  E-group for support with ATLAS-sensitive information: atlas-phys-stat-root@cern.ch
Line: 523 to 522
 

HistFactory

Changed:
<
<

General description

>
>

general description

  The HistFactory can be used to run an analysis without being a RooFit/RooStats expert. There are two main ways to interact with HistFactory: through C++ code and through XML configuration files. In the XML configuration approach, in a nutshell, ROOT files containing input histograms are set up and XML configuration files are set up for those input ROOT files. The XML configuration files specify details on the histograms, how RooFit should interpret the information in the files and how histograms are to be used in calculations. A little executable contained in ROOT called hist2workspace is used to import the histograms into a RooFit workspace.
Line: 549 to 548
 

XML files

Changed:
<
<
General description
>
>
general description
  A minimum of two XML files are required for configuration. The "top-level" XML configuration file defines the measurement and contains a list of channels that contribute to this measurement. The "channel" XML configuration files are used to describe each channel in detail. For each contributing channel, there is a separate XML configuration file.
Changed:
<
<
Conventions
>
>
conventions
  The nominal and variational histograms should all have the same normalisation convention. There are a few conventions possible:
Changed:
<
<
  • Option 1:
>
>
  • option 1:
 
    • Lumi="XXX" is in the main XML's element, where XXX is in fb^-1.
    • Histograms have units of fb/bin.
    • Some samples have NormFactors that are all relative to prediction (e.g. 1 is the nominal prediction).
Changed:
<
<
  • Option 2:
>
>
  • option 2:
 
    • Lumi="1" is in the main XML's element.
    • Histograms are normalized to unity.
    • Each sample has a NormFactor that is the expected numbers of events in data.
Changed:
<
<
  • Option 3:
>
>
  • option 3:
 
    • Lumi="1" is in the main XML's element.
    • Histograms have units of (numbers of events)/bin expected in data.
    • Some samples have NormFactors that are all relative to prediction (e.g. 1 is the nominal prediction).
    • It's up to you. In the end, the expected number is a product: N=Lumi*BinContent*NormFactor(s). See the user's guide for more precise equations.
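The product rule above can be checked with a few lines of plain C++. This is a sketch for a single bin with hypothetical numbers; the function name expectedEvents is an assumption for this illustration and is not part of HistFactory.

```cpp
#include <cassert>
#include <vector>

// Expected number of events in one bin:
// N = Lumi * BinContent * (product of all NormFactors).
double expectedEvents(double lumi, double binContent,
                      const std::vector<double>& normFactors) {
    assert(lumi >= 0.0 && binContent >= 0.0);
    double n = lumi * binContent;
    for (double factor : normFactors) {
        n *= factor;
    }
    return n;
}
```

For example, the same expectation of 50 events can be written as Lumi=20 with a bin content of 2.5 fb and a NormFactor of 1 (option 1), as Lumi=1 with a unit-normalised bin content of 0.125 and a NormFactor of 400 (option 2), or as Lumi=1 with a bin content of 50 events and a NormFactor of 1 (option 3).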
Changed:
<
<
Top-Level XML file
>
>
top-level XML file
 
Changed:
<
<
General description
>
>
general description
  This file specifies a top level 'Combination' that is composed of:
  • several 'Channels', which are described in separate XML configuration files.
Line: 608 to 607
  Const: this has a specific value, not set here, but where the param is defined.
Changed:
<
<
Specific instructions
>
>
specific instructions
  There are several parts in the top-level XML file. First, the XML schema is specified. The output is specified next: the prefix for any output files (ROOT files containing workspaces) is given. Then, the channel XML files (containing, e.g., signal, background and systematics information) are given for each measurement using the Input tag. Next, the details for each measurement are defined using the Measurement tag. Which bins to use are specified inclusively using the Measurement tag attributes BinLow and BinHigh. Within the Measurement tag, the POI tag is used to specify the parameter of interest, i.e. the parameter that is to be estimated (e.g., SigXsecOverSM).
Changed:
<
<
Example file: $ROOTSYS/tutorials/histfactory/example.xml
>
>
example file: $ROOTSYS/tutorials/histfactory/example.xml
 
<!DOCTYPE Combination  SYSTEM 'HistFactorySchema.dtd'>
Line: 647 to 646
 
Changed:
<
<
Channel XML files
>
>
channel XML files
 
Changed:
<
<
General description
>
>
general description
  These files specify for each channel
  • Name, a name for the channel.
Line: 669 to 668
 
      • a name (which can be shared with the OverallSyst if correlated).
      • +/- 1 sigma variational histograms.
Changed:
<
<
Specific instructions
>
>
specific instructions
  First, the channel file specifies the XML schema. Then, the channel is defined and named. The location of the data histogram for the channel is defined. For each background, a Sample is defined. It is specified whether it is normalised to luminosity (i.e., the histograms should be per inverse picobarn and will be scaled). To enable normalisation to luminosity, set the tag attribute NormalizeByTheory to "True". For external normalisation, a data-driven background measurement is fixed to the lumi of the dataset. In this case, set the tag attribute NormalizeByTheory to "False". For the normalisation factor, (e.g., "SigXsecOverSM"), the tag attribute "Name" should match the POI specified in the top level XML configuration file.
Changed:
<
<
Systematic uncertainties
>
>
systematic uncertainties
  For an overall relative rate systematic, the "OverallSys" tag is used with its appropriate tag attributes. For a shape systematic (a systematic that affects the shape of a histogram), the "HistoSys" tag is used with its appropriate tag attributes. Specifically, for the HistoSys tag attributes "HistoNameHigh" and "HistoNameLow", the respective histograms for the upper and lower shape systematic uncertainties are specified in a manner such as the following:
Line: 681 to 680
 HistoNameHigh="myShapeSystematic_1_high" HistoNameLow="myShapeSystematic_1_low" />
Changed:
<
<
Example file: $ROOTSYS/tutorials/histfactory/example_channel.xml
>
>
example file: $ROOTSYS/tutorials/histfactory/example_channel.xml
 
<!DOCTYPE Channel  SYSTEM 'HistFactorySchema.dtd'>
Line: 702 to 701
 
Changed:
<
<
Caveats
>
>
caveats
 
Changed:
<
<
Slash suffix in HistoPath attribute
>
>
slash suffix in HistoPath attribute
  A slash must be added to the end of the HistoPath string attribute of the Channel, Data and Sample tags when referencing directories other than the root directory in ROOT files, in such a manner as follows:
Line: 715 to 714
 
Changed:
<
<
Colon characters in Name attributes
>
>
colon characters in Name attributes
  Sometimes colon characters are unusable in the Name attribute of the Channel, Data and Sample tags. It is wise to avoid using colons in this context.
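When XML configuration files are generated by code, both caveats can be checked before the attributes are written. The following plain C++ sketch assumes hypothetical helper names (isSafeName, withTrailingSlash, sampleAttributes); none of them belong to HistFactory.

```cpp
#include <cassert>
#include <string>

// A name is treated as safe here if it is non-empty and contains no colon.
bool isSafeName(const std::string& name) {
    return !name.empty() && name.find(':') == std::string::npos;
}

// Append a slash to a non-empty HistoPath that lacks one.
// An empty path (the root directory of the ROOT file) is left unchanged.
std::string withTrailingSlash(const std::string& histoPath) {
    if (histoPath.empty() || histoPath.back() == '/') {
        return histoPath;
    }
    return histoPath + "/";
}

// Build the Name and HistoPath attributes after applying both checks.
std::string sampleAttributes(const std::string& name, const std::string& histoPath) {
    assert(isSafeName(name));
    return "Name=\"" + name + "\" HistoPath=\"" + withTrailingSlash(histoPath) + "\"";
}
```

A call such as sampleAttributes("signal1", "myDirectory") yields Name="signal1" HistoPath="myDirectory/", while a name such as "signal:1" is rejected by isSafeName.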
Changed:
<
<
Guidance in writing the XML configuration files
>
>
guidance in writing the XML configuration files
 
Changed:
<
<
Histograms
>
>
histograms
 
Changed:
<
<
Units
>
>
units
  The units of the input histogram and the luminosity specified in the top-level XML configuration file should be compatible; for example, input histograms might be in \textrm{pb} while the luminosity might be in \textrm{pb}^{-1} or input histograms might be in \textrm{fb} while the luminosity might be in \textrm{fb}^{-1}.
Changed:
<
<
Naming convention
>
>
naming convention
  The input histogram names are of no special significance; however, it is often preferable to devise a good naming convention. One might consider the order in which one has information in the XML and aim to have histogram names appear in a similar order when listed alphabetically.
Changed:
<
<
Generic example
Type of histogram Histogram naming convention
>
>
generic example
type of histogram histogram naming convention
 
Phenomenon histogram <phenomenon name>_m<mass point>
Phenomenon upward systematic histogram <phenomenon name>_m<mass point>_sys_<systematic name>_up
Phenomenon downward systematic histogram <phenomenon name>_m<mass point>_sys_<systematic name>_do

The reason for having an "m" character preceding the mass point is that it allows for easy search and replace of the mass point value when automatically producing multiple XML configuration files corresponding to multiple mass points.
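The generic convention can be expressed as a small helper that assembles the names. This is a plain C++ sketch; the function name histogramName and its parameters are hypothetical and serve only to illustrate the pattern in the table above.

```cpp
#include <cassert>
#include <string>

// Build a histogram name of the form <phenomenon>_m<mass point>, optionally
// followed by _sys_<systematic>_up or _sys_<systematic>_do.
std::string histogramName(const std::string& phenomenon, int massPoint,
                          const std::string& systematic = "", bool up = true) {
    assert(!phenomenon.empty());
    std::string name = phenomenon + "_m" + std::to_string(massPoint);
    if (!systematic.empty()) {
        name += "_sys_" + systematic + (up ? "_up" : "_do");
    }
    return name;
}
```

For instance, histogramName("ttH", 110) gives ttH_m110 and histogramName("ttH", 110, "Lumi", false) gives ttH_m110_sys_Lumi_do.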

Changed:
<
<
Specific example
Type of histogram Histogram name
>
>
specific example
type of histogram histogram name
 
ttH histogram ttH_m110
ttH upward luminosity systematic histogram ttH_m110_sys_Lumi_up
ttH downward luminosity systematic histogram ttH_m110_sys_Lumi_do
Line: 751 to 750
 
WW Herwig 105987 upward JES systematic histogram WW_Herwig_105987_m110_sys_JES_up
WW Herwig 105987 downward JES systematic histogram WW_Herwig_105987_m110_sys_JES_do
Changed:
<
<
Samples
>
>
samples
  The sample names are of no special significance; however, it is often preferable to have very short names (e.g., signal1) for the purposes of clarity when, for example, printing the contents of the workspace.
Line: 760 to 759
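  The signal(s) are specified as such through the use of the POI tag (in the top-level XML configuration file); the POI name should match the name of a normalisation factor of the signal sample in the channel XML configuration file.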
Changed:
<
<
Example file: ttH_m110_channel.xml
>
>
example file: ttH_m110_channel.xml
 
<!DOCTYPE Channel SYSTEM 'HistFactorySchema.dtd'>
Line: 809 to 808
 
Changed:
<
<
Example file: ttH_m110_top-level.xml
>
>
example file: ttH_m110_top-level.xml
 
<!DOCTYPE Combination SYSTEM 'HistFactorySchema.dtd'>
Line: 828 to 827
 
Changed:
<
<
Create XML configuration files automatically
>
>
create XML configuration files automatically
  Once the XML configuration files for a certain mass point are written, these files can be used to automatically create further XML configuration files for all remaining mass points.

If the described naming convention is used (i.e., mass points are prefixed with an "m"), the following shell script might be used to create all required XML configuration files for all mass points.
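The search-and-replace step that the naming convention enables can be sketched in plain C++ as follows. The function name replaceMassPoint is hypothetical, and the sketch assumes that mass points always appear as "_m" followed by the mass value, as in the example histogram names; it is not a transcription of the shell script.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Replace every occurrence of _m<originalMass> with _m<newMass>, skipping
// matches that are followed by a further digit (e.g. _m1100 when replacing _m110).
std::string replaceMassPoint(std::string text, int originalMass, int newMass) {
    assert(originalMass > 0 && newMass > 0);
    const std::string from = "_m" + std::to_string(originalMass);
    const std::string to = "_m" + std::to_string(newMass);
    std::size_t pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        const std::size_t end = pos + from.size();
        const bool digitFollows =
            end < text.size() && std::isdigit(static_cast<unsigned char>(text[end]));
        if (digitFollows) {
            pos = end;  // part of a longer number, leave untouched
            continue;
        }
        text.replace(pos, from.size(), to);
        pos += to.size();
    }
    return text;
}
```

Applying the function to the full contents of the XML files for one mass point, once per target mass, yields the contents for each remaining mass point. The "m" prefix is what keeps unrelated numbers in the file from being replaced.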

Changed:
<
<
Example file: createXMLFiles.sh
>
>
example file: createXMLFiles.sh
 

Line: 879 to 878
 

C++ approach to HistFactory

Changed:
<
<
This approach has not been committed to the RooStats branch yet, though it soon will be. There are a few stylistic changes to be made to the latest code, though verification of the code is complete (changes reproduce exactly results from ROOT 5.32). Code examples will be written in due course.

The C++ approach to HistFactory allows one to interact with HistFactory (creating models etc.) using C++ and Python code. The configuration part of HistFactory is based on a C++ class structure. Since the configuration state of HistFactory is based on C++, one can bypass XML. All functions used in the executable hist2workspace are available for use in C++ code. These C++ classes mirror the structure of the input XML tags. These classes are automatically saved in the output ROOT file along with the workspace. They can then be browsed in CINT or imported in code.

>
>
The C++ approach to HistFactory allows one to interact with HistFactory (creating models etc.) using C++ and Python code. The configuration part of HistFactory is based on a C++ class structure. Since the configuration state of HistFactory is based on C++, one can bypass XML. All functions used in the executable hist2workspace are available for use in C++ code. These C++ classes mirror the structure of the input XML tags. These classes are saved automatically in the output ROOT file along with the workspace. They can then be browsed in CINT or imported in code.
  The main advantage of using the C++ approach to HistFactory is that it allows one to easily create many similar models based on an initial template. In the XML approach, one must create many similar XML configuration files and execute hist2workspace many times (though this can be done in an automated manner).
Added:
>
>
Since ROOT version 5.34, HistFactory has had the capability to accept C++ code to build models. Previously, all HistFactory models had to be created using static XML configuration files. Essentially, HistFactory models are created in C++ by defining a node/tree structure of the statistical model. A HistFactory model is a very structured object, consisting of one or several channels, each with data and various samples. Each sample can have different types of systematic uncertainties or can have freely floating parameters. Each of these objects (or nodes of the tree) is represented by a class that can be constructed and added. This C++ approach is relatively new and code examples will be written in due course.
 

HistFactory class tree structure

  • Measurement
Line: 905 to 904
 

HistFactory configuration in C++

Changed:
<
<
Example: creation of the Measurement and a Channel and, thence, creation of the channel samples, including signal and backgrounds
>
>
example (early in project development): creation of the Measurement and a Channel and, thence, creation of the channel samples, including signal and backgrounds
 

Line: 949 to 948
  myMeasurement.channels.push_back(myChannel);
Changed:
<
<

Links for HistFactory

>
>

example HistFactory model construction using C++

A HistFactory model is to be created. It shall consist of a single channel that shall have only signal and background. Uncertainties on the luminosity, statistical uncertainties from Monte Carlo and separate systematics on signal and background shall be included.

#include "RooStats/HistFactory/Measurement.h"
void MakeSimpleModel() {
   // This function creates a simple model with one channel.

   // Create the measurement object.
      RooStats::HistFactory::Measurement measurement_1("measurement_1", "measurement_1");
      // Configure the measurement.
         // Set the output files' prefix.
            measurement_1.SetOutputPrefix("results/measurement_1");
         // Set ExportOnly to false, meaning that further
         // activities (such as fitting and plotting) shall be
         // carried out beyond simple exporting to the workspace.
            measurement_1.SetExportOnly(False);
         // Set the parameter of interest.
            measurement_1.SetPOI("SigXsecOverSM");
         // Set the luminosity.
            // It is assumed that all histograms have been
            // scaled by luminosity.
            measurement_1.SetLumi(1.0);
            // Set the uncertainty.
               measurement_1.SetLumiRelErr(0.10);

      // Create a channel.
         RooStats::HistFactory::Channel channel_1("channel_1");
         // Configure the channel.
            // Set the data.
               // The data is a histogram representing
               // the measured distribution. It can
               // have one or many bins.
               // The name of the ROOT file containing
               // the data and the name to attribute to
               // the data are specified.
               channel_1.SetData("data", "data/example.root");
         // Create a sample (signal).
            // Samples describe the various processes tha
            // are used to model the data. In this case,
            // they consist only of a signal process and a
            // single background process.
            RooStats::HistFactory::Sample signal_1("signal_1", "signal_1", "data/example.root");
            // Configure the sample.
               // The cross section scaling parameter
               // is added.
                  signal_1.AddNormFactor("SigXsecOverSM", 1, 0, 3);
               // A systematic uncertainty of 5% is
               // added.
                  signal_1.AddOverallSys("systematic_1",  0.95, 1.05);
            // Add the sample to the channel.
               channel_1.AddSample(signal_1);
         // Create a sample (background).
            RooStats::HistFactory::Sample background1("background_1", "background_1", "data/example.root");
            // Configure the sample.
               // Add a statistical uncertainty.
                  background_1.ActivateStatError("background_1_statistical_uncertainty", InputFile);
               // A systematic uncertainty of 5% is added.
                  background_1.AddOverallSys("systematic_uncertainty_2", 0.95, 1.05 );
            // Add the sample to the channel.
               chan.AddSample(background_1);
         // Create a sample (background).
            RooStats::HistFactory::Sample background_2("background_2", "background_2", "data/example.root");
               // Add a statistical uncertainty.
                  background_2.ActivateStatError();
               // Add a systematic uncertainty.
                  background_2.AddOverallSys("systematic_uncertainty_3", 0.95, 1.05 );
            // Add the sample to the channel.
               channel_1.AddSample(background_2);
         // Add the channel to the measurement.
            measurement_1.AddChannel(channel_1);
      // Access the specified data and collect, copy and store the
      // histograms.
         measurement_1.CollectHistograms();
      // Print a text representation of the model.
         measurement_1.PrintTree();
      // Run the measurement (this is equivalent to an execution of
      // the program hist2workspace).
         MakeModelAndMeasurementFast(measurement_1);

details on HistFactory usage in C++

details on the object measurement

A measurement has several methods to configure its options, each of which corresponds to an option in the XML configuration.

objective code
set the prefix for output files void SetOutputFilePrefix(const std::string& prefix);
set the parameter of interest for the measurement void SetPOI(const std::string& POI);
set a parameter in the model to be constant void AddConstantParam(const std::string& param);
set the value of a parameter in the model void SetParamValue(const std::string& param, double val);
set the low and high bins for all observables void SetBinLow(int BinLow); void SetBinHigh(int BinHigh);
set the luminosity and its relative error void SetLumi(double Lumi); void SetLumiRelErr(double LumiRelErr);
set whether the model should save plots and tables or should export the workspace void SetExportOnly(bool ExportOnly);
add a channel object to a model (a measurement) void AddChannel(RooStats::HistFactory::Channel chan);
open all specified ROOT files and copy and save all necessary histograms void CollectHistograms();
save a measurement (a model) to a ROOT file (for possible future modification and use in creating new models) void writeToFile(TFile* file);

HistFactory supports parameters that are functions of other parameters. Such parameters are made in the initial mode and then are converted into dynamic functions during a processing pass over the model. Such parameters can be created using additional methods of a measurement object.

add a preprocessed function by giving the function a name, a functional expression and a string with a bracketed list of dependencies (e.g. "SigXsecOverSM[0,3]")

void AddPreprocessFunction(std::string name, std::string expression, std::string dependencies);
create a class representing a preprocess function and add it to a measurement directly using the constructor PreprocessFunction and a method of a measurement object
PreprocessFunction::PreprocessFunction(std::string Name, std::string Expression, std::string Dependents);
void AddPreprocessFunction(const std::string& function);

details on object channel

Measurements contain collections of channels.

objective code
create a channel and give it a name Channel::Channel(const std::string& name);
set the channel histogram using the name and path of a histogram in a ROOT file void SetData(std::string HistoName, std::string InputFile, std::string HistoPath="");
set the value of the single bin of a channel with only one bin (creating a 1 bin histogram) void SetData(double value_1);
create or load a histogram in memory by supplying a pointer to the histogram as the data void SetData(TH1* data_1);
create a HistFactory data object and load it directly (useful in configuring an object and using it multiple times) void SetData(const RooStats::HistFactory::Data& data);

details on object sample

Each channel has several samples which describe the data. These samples can represent both signals and backgrounds. Samples are constructed, configured and then added to a channel. Each sample has a histogram describing its shape and a list of systematic uncertainties describing how that shape transforms based on a number of parameters.

objective code
create a sample object specifying the name Sample(std::string Name);
create a sample object specifying the name, the histogram name, the histogram file and the histogram path in the file Sample(std::string Name, std::string HistoName, std::string InputFile, std::string HistoPath="");
add a sample object to a channel object void AddSample(RooStats::HistFactory::Sample sample);
independently set a histogram object void SetHisto(TH1* histogram_1);
independently set a value void SetValue(Double_t value_1);
set a sample to be "normalised by theory" (its normalisation scales with luminosity) void SetNormalizeByTheory(bool norm);

systematic uncertainties

A sample object can have many different types of systematic uncertainties defined.

void AddOverallSys(std::string Name, Double_t Low, Double_t High);
void AddNormFactor(std::string Name, Double_t Val, Double_t Low, Double_t High, bool Const=false);
void AddHistoSys(std::string Name, std::string HistoNameLow, std::string HistoFileLow,  std::string HistoPathLow, std::string HistoNameHigh, std::string HistoFileHigh, std::string HistoPathHigh);
void AddHistoFactor(std::string Name, std::string HistoNameLow, std::string HistoFileLow,  std::string HistoPathLow, std::string HistoNameHigh, std::string HistoFileHigh, std::string HistoPathHigh);
void AddShapeFactor(std::string Name);
void AddShapeSys(std::string Name, Constraint::Type ConstraintType, std::string HistoName, std::string HistoFile, std::string HistoPath="");
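The Low and High arguments of AddOverallSys are scale factors, so the 5% uncertainty used in the example above is written as 0.95 and 1.05. The following is a minimal sketch of that conversion for a symmetric relative uncertainty; the type OverallSysRange and the function symmetricOverallSys are hypothetical helper names and are not part of HistFactory.

```cpp
#include <cassert>

// Hypothetical helper type (not part of HistFactory): the Low and High
// scale factors that would be passed to AddOverallSys.
struct OverallSysRange {
    double low;
    double high;
};

// Convert a symmetric relative uncertainty (e.g. 0.05 for 5%) into the
// Low and High scale factors 1 - r and 1 + r.
OverallSysRange symmetricOverallSys(double relativeUncertainty) {
    assert(relativeUncertainty >= 0.0 && relativeUncertainty < 1.0);
    OverallSysRange range;
    range.low = 1.0 - relativeUncertainty;
    range.high = 1.0 + relativeUncertainty;
    return range;
}
```

The result could then be passed on in a call such as sample.AddOverallSys(name, range.low, range.high).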

A sample can be included in a channel's bin-by-bin statistical uncertainty fluctuations by "activating" the sample. There are two ways to do this. The first way is to use the default errors that are stored in the histogram's uncertainty array. The second way is to supply the errors using an external histogram (in the case where the desired errors differ from those stored by the TH1 histogram). These can be achieved using the following methods:

void ActivateStatError();
void ActivateStatError(std::string HistoName, std::string InputFile, std::string HistoPath="");    

combining

Once each channel has been created, filled with data and samples and added to the overall measurement, the analysis can begin. The first step is to generate the RooFit model from the measurement object. This model is stored in a RooFit workspace object. There are two ways to do this. The first way is used by the program hist2workspace. It builds the workspace, fits it, creates output plots and saves the workspace and plots to files.

RooWorkspace* MakeModelAndMeasurementFast(RooStats::HistFactory::Measurement& measurement);

The second way is to build the workspace in memory and return a pointer to the workspace object.

static RooWorkspace* MakeCombinedModel(Measurement& measurement);

A workspace can be created for only a single channel of a model:

RooWorkspace* MakeSingleChannelModel(Measurement& measurement, Channel& channel);

A function can be used to fit a model.

 
Changed:
<
<
HistFactory manual
>
>
void FitModel(RooWorkspace *, std::string data_name="obsData");

Each of the model-building functions above returns a pointer to the newly created workspace object. The workspace can be analysed directly, for example using RooStats scripts, or it can be saved to an output file for later analysis or publication.

links for HistFactory

HistFactory user guide, June 2012 (draft, under development)

HistFactory user guide, March 2012

  HistFactory XML reference
Line: 961 to 1155
  Exotics Working Group statistics tutorial workspace examples
Changed:
<
<
Early description of C++ approach to HistFactory
>
>
early description of C++ approach to HistFactory

description of building HistFactory models using C++ and Python

 
Changed:
<
<

Analysis!

>
>

analysis!

  ATLAS recommends the use of the profile likelihood as a test statistic.
Changed:
<
<

Full significance calculation example, from histogram creation to model production (using hist2workspace), to RooStats calculations

>
>

full significance calculation example, from histogram creation to model production (using hist2workspace), to RooStats calculations

 

Prepare a working area.

Line: 1007 to 1203
  Run some code such as the following. For fun, we will create a simulated data signal at about 126 GeV with about three times the expected number of events.
Changed:
<
<
Example file: make_test_histograms.c
>
>
example file: make_test_histograms.c
 

Line: 1044 to 1240
  There are two approaches which might be used to create the RooFit workspace. The first method uses the hist2workspace program to run on XML configuration files and input histogram files. The second method explicitly builds the workspace using RooFit code. In general, most should opt for the first, XML-based approach. If you wish to understand how the details of the models are specified, you might take a look at the second approach.
Changed:
<
<
Option 1: Create the workspace using hist2workspace and XML configuration files.
>
>
option 1: Create the workspace using hist2workspace and XML configuration files.
  Change to the directory for the configuration XML files.
Line: 1060 to 1256
  Create XML configuration files in the configuration directory.
Changed:
<
<
Example file: test_top-level.xml
>
>
example file: test_top-level.xml
 

Line: 1080 to 1276
 
Changed:
<
<
Example file: test_channel.xml
>
>
example file: test_channel.xml
 

Line: 1130 to 1326
  Create a C++ program such as the following:
Changed:
<
<
Example file: ProfileLikeliHoodCalculator_confidence_level.cpp
>
>
example file: ProfileLikeliHoodCalculator_confidence_level.cpp
 

Line: 1213 to 1409
  Compile this code using a Makefile such as the following:
Changed:
<
<
Example file: Makefile
>
>
example file: Makefile
 

Line: 1227 to 1423
 make
Changed:
<
<

Results

>
>

results

  The end result as displayed in the terminal output is the following:
Line: 1235 to 1431
 95% confidence interval on the point of interest SigXsecOverSM: [2.99653, 3.00347]
Changed:
<
<

Further information

>
>

further information

 
Changed:
<
<

Links for ROOT

>
>

links for ROOT

  ROOT User's Guide

Revision 412012-05-14 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- WilliamBreadenMadden - 2012-03-26
>
>
-- WilliamBreadenMadden - 2012-05-14
 
Line: 702 to 702
 
Added:
>
>
Caveats

Slash suffix in HistoPath attribute

A slash must be added to the end of the HistoPath string attribute of the Channel, Data and Sample tags when referencing directories other than the root directory in ROOT files, in such a manner as follows:


<Data  HistoName="myDataHistogram" HistoPath="myDirectory/" />
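The slash rule above can be applied mechanically before a path is written into an XML file. The following is a minimal sketch; normaliseHistoPath is a hypothetical helper name and is not part of HistFactory, and it assumes that an empty string denotes the root directory of the ROOT file.

```cpp
#include <string>

// Hypothetical helper (not part of HistFactory): return a HistoPath value
// that ends with a slash, unless it is empty (assumed here to mean the
// root directory of the ROOT file).
std::string normaliseHistoPath(const std::string& path) {
    if (path.empty()) {
        return path;  // root directory: leave unchanged
    }
    if (path[path.size() - 1] == '/') {
        return path;  // the slash suffix is already present
    }
    return path + "/";
}
```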

Colon characters in Name attributes

Sometimes colon characters are unusable in the Name attribute of the Channel, Data and Sample tags. It is wise to avoid using colons in this context.

 
Guidance in writing the XML configuration files

Histograms

Units

Changed:
<
<
The units of the input histogram and the luminosity specified in the top-level XML configuration file should be compatible; for example, input histograms might be in pb while the luminosity might be in pb^{-1} or input histograms might be in fb while the luminosity might be in fb^{-1}.
>
>
The units of the input histogram and the luminosity specified in the top-level XML configuration file should be compatible; for example, input histograms might be in \textrm{pb} while the luminosity might be in \textrm{pb}^{-1} or input histograms might be in \textrm{fb} while the luminosity might be in \textrm{fb}^{-1}.
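As a small worked sketch of this compatibility requirement, the functions below convert between femtobarn-based and picobarn-based units and multiply a cross section by a luminosity to give an event count. All function names are hypothetical and are introduced only for illustration.

```cpp
// Hypothetical helpers for illustration only.

// 1 pb = 1000 fb, so a cross section in fb is divided by 1000 to give pb.
double femtobarnToPicobarn(double crossSectionFb) {
    return crossSectionFb / 1000.0;
}

// 1 fb^-1 = 1000 pb^-1, so a luminosity in fb^-1 is multiplied by 1000.
double inverseFemtobarnToInversePicobarn(double luminosityInvFb) {
    return luminosityInvFb * 1000.0;
}

// The expected number of events is the cross section multiplied by the
// luminosity, provided that both use the same base unit (pb with pb^-1).
double expectedEvents(double crossSectionPb, double luminosityInvPb) {
    return crossSectionPb * luminosityInvPb;
}
```

For example, 2 fb with 5 fb^-1 and the converted values 0.002 pb with 5000 pb^-1 both correspond to 10 events, whereas mixing fb with pb^-1 would be wrong by a factor of 1000.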
  Naming convention
Line: 1109 to 1126
  The test_workspace_combined_datastat_model.root file is the ROOT file that contains the combined workspace object (it has constraints, weightings etc. properly incorporated). This is the file you are almost certainly interested in. The test_workspace_test_datastat_model.root file contains a workspace object without proper constraints, weightings etc. I don't know what the other two files are.
Changed:
<
<

Use the ProfileLikelihoodCalculator to calculate the 95% condifence interval on the parameter of interest as specified in the ModelConfig.

>
>

Use the ProfileLikelihoodCalculator to calculate the 95% confidence interval on the parameter of interest as specified in the ModelConfig.

  Create a C++ program such as the following:

Revision 402012-03-26 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- WilliamBreadenMadden - 2012-02-27
>
>
-- WilliamBreadenMadden - 2012-03-26
 
Line: 525 to 525
 

General description

Changed:
<
<
The HistFactory can be used to run an analysis without being a RooFit/RooStats expert. In a nutshell, ROOT files containing input histograms are set up and XML configuration files are set up for those input ROOT files. The XML configuration files specify details on the histograms, how RooFit should interpret the information in the files and how histograms are to be used in calculations. A little executable contained in ROOT called hist2workspace is used to import the histograms into a RooFit workspace.
>
>
The HistFactory can be used to run an analysis without being a RooFit/RooStats expert. There are two main ways to interact with HistFactory: through C++ code and through XML configuration files. In the XML configuration approach, in a nutshell, ROOT files containing input histograms are set up and XML configuration files are set up for those input ROOT files. The XML configuration files specify details on the histograms, how RooFit should interpret the information in the files and how histograms are to be used in calculations. A little executable contained in ROOT called hist2workspace is used to import the histograms into a RooFit workspace.
 
Changed:
<
<

prepareHistFactory

>
>

XML approach to HistFactory

prepareHistFactory

  The ROOT release ships with a script called prepareHistFactory in the $ROOTSYS/bin directory. The prepareHistFactory script prepares a working area. Specifically, it creates results/, data/ and config/ directories. It then copies the HistFactorySchema.dtd and example XML files from the ROOT tutorials directory into the config/ directory. It also copies a ROOT file into the data/ directory for use with the example XML configuration files.
Changed:
<
<

HistFactorySchema.dtd

>
>

HistFactorySchema.dtd

  The HistFactorySchema.dtd file, located in $ROOTSYS/etc/, specifies the XML schema. It is typically placed in the config/ directory of a working area together with the top-level XML file and the individual channel XML files. The user should not modify this file. The HistFactorySchema.dtd is commented to specify exactly the meaning of the various options.
Changed:
<
<

hist2workspace

>
>

hist2workspace

  The hist2workspace executable is run using the top-level XML configuration file as an argument in the following manner:
Line: 543 to 545
  hist2workspace top_level.xml
Changed:
<
<

XML files

>
>
hist2workspace builds the model and saves input histograms in the output ROOT file. The measurement's configuration class is made to persist as well. This measurement class has a member function that can write XML configuration files which point to the histograms saved in the (output) ROOT file (e.g.: GaussExample->WriteToXML).
 
Changed:
<
<

General description

>
>

XML files

General description
  A minimum of two XML files are required for configuration. The "top-level" XML configuration file defines the measurement and contains a list of channels that contribute to this measurement. The "channel" XML configuration files are used to describe each channel in detail. For each contributing channel, there is a separate XML configuration file.
Changed:
<
<

Conventions

>
>
Conventions
  The nominal and variational histograms should all have the same normalisation convention. There are a few conventions possible:
Line: 569 to 573
 
    • Some samples have NormFactors that are all relative to prediction (e.g. 1 is the nominal prediction).
    • It's up to you. In the end, the expected number is a product: N=Lumi*BinContent*NormFactor(s). See the user's guide for more precise equations.
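The product quoted in the last point can be sketched as follows. This is an illustration of the simplified formula N=Lumi*BinContent*NormFactor(s) only, not of the full HistFactory model (see the user's guide for the precise equations); the function name expectedBinCount is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: expected number of events in one bin according to
// the simplified product N = Lumi * BinContent * NormFactor(s).
double expectedBinCount(double lumi, double binContent,
                        const std::vector<double>& normFactors) {
    double n = lumi * binContent;
    for (std::size_t i = 0; i < normFactors.size(); ++i) {
        n *= normFactors[i];  // each NormFactor scales the prediction
    }
    return n;
}
```

With no NormFactors the result is simply Lumi*BinContent, which is why the units of the histogram and of the luminosity must be chosen consistently.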
Changed:
<
<

Top-Level XML file

>
>
Top-Level XML file
 
Changed:
<
<
General description
>
>
General description
  This file specifies a top level 'Combination' that is composed of:
  • several 'Channels', which are described in separate XML configuration files.
Line: 604 to 608
  Const: this has a specific value, not set here, but where the param is defined.
Changed:
<
<
Specific instructions
>
>
Specific instructions
  There are several parts in the top-level XML file. Initially, the XML schema is specified. The output is specified next. Specifically, the prefix for any output files (ROOT files containing workspaces) is given. Now, the channel XML files are given for each measurement (e.g., signal, background, systematics information) using the Input tag. Next, the details for each channel are defined using the Measurement tag. Which bins to use are specified inclusively using the Measurement tag attributes BinLow and BinHigh. Within the Measurement tag, the POI tag is used to specify the parameter of interest for the measurement, i.e. the model parameter that is to be measured (e.g. SigXsecOverSM).
Changed:
<
<
Example file: $ROOTSYS/tutorials/histfactory/example.xml
>
>
Example file: $ROOTSYS/tutorials/histfactory/example.xml
 
<!DOCTYPE Combination  SYSTEM 'HistFactorySchema.dtd'>
Line: 643 to 647
 
Changed:
<
<

Channel XML files

>
>
Channel XML files
 
Changed:
<
<
General description
>
>
General description
  These files specify for each channel
  • Name, a name for the channel.
Line: 665 to 669
 
      • a name (which can be shared with the OverallSyst if correlated).
      • +/- 1 sigma variational histograms.
Changed:
<
<
Specific instructions
>
>
Specific instructions
  First, the channel file specifies the XML schema. Then, the channel is defined and named. The location of the data histogram for the channel is defined. For each background, a Sample is defined. It is specified whether it is normalised to luminosity (i.e., the histograms should be per inverse picobarn and will be scaled). To enable normalisation to luminosity, set the tag attribute NormalizeByTheory to "True". For external normalisation, a data-driven background measurement is fixed to the lumi of the dataset. In this case, set the tag attribute NormalizeByTheory to "False". For the normalisation factor, (e.g., "SigXsecOverSM"), the tag attribute "Name" should match the POI specified in the top level XML configuration file.
Changed:
<
<
Systematic uncertainties
>
>
Systematic uncertainties
  For an overall relative rate systematic, the "OverallSys" tag is used with its appropriate tag attributes. For a shape systematic (a systematic that affects the shape of a histogram), the "HistoSys" tag is used with its appropriate tag attributes. Specifically, for the HistoSys tag attributes "HistoNameHigh" and "HistoNameLow", the respective histograms for the upper and lower shape systematic uncertainties are specified in a manner such as the following:
Line: 677 to 681
 HistoNameHigh="myShapeSystematic_1_high" HistoNameLow="myShapeSystematic_1_low" />
Changed:
<
<
Example file: $ROOTSYS/tutorials/histfactory/example_channel.xml
>
>
Example file: $ROOTSYS/tutorials/histfactory/example_channel.xml
 
<!DOCTYPE Channel  SYSTEM 'HistFactorySchema.dtd'>
Line: 698 to 702
 
Changed:
<
<

Guidance in writing the XML configuration files

>
>
Guidance in writing the XML configuration files
 
Changed:
<
<
Histograms
>
>
Histograms
 
Changed:
<
<
Units
>
>
Units
  The units of the input histogram and the luminosity specified in the top-level XML configuration file should be compatible; for example, input histograms might be in pb while the luminosity might be in pb^{-1} or input histograms might be in fb while the luminosity might be in fb^{-1}.
Changed:
<
<
Naming convention
>
>
Naming convention
  The input histogram names are of no special significance, however, it is often preferable to have a good naming convention devised. One might consider the order in which one has information in the XML and aim to have histogram names appear in a similar order when listed alphabetically.
Line: 730 to 734
 
WW Herwig 105987 upward JES systematic histogram WW_Herwig_105987_m110_sys_JES_up
WW Herwig 105987 downward JES systematic histogram WW_Herwig_105987_m110_sys_JES_do
Changed:
<
<
Samples
>
>
Samples
  The sample names are of no special significance, however, it is often preferable to have very short names (e.g., signal1) for the purposes of clarity when, for example, printing the contents of the workspace.
Line: 739 to 743
  The signal(s) are specified as such through the use of the POI tag (in the channel XML configuration files).
Changed:
<
<
Example file: ttH_m110_channel.xml
>
>
Example file: ttH_m110_channel.xml
 
<!DOCTYPE Channel SYSTEM 'HistFactorySchema.dtd'>
Line: 788 to 792
 
Changed:
<
<
Example file: ttH_m110_top-level.xml
>
>
Example file: ttH_m110_top-level.xml
 
<!DOCTYPE Combination SYSTEM 'HistFactorySchema.dtd'>
Line: 807 to 811
 
Changed:
<
<
Create XML configuration files automatically
>
>
Create XML configuration files automatically
  Once the XML configuration files for a certain mass point are written, these files can be used to automatically create further XML configuration files for all remaining mass points.

If the described naming convention is used (i.e., mass points are prefixed with an "m"), the following shell script might be used to create all required XML configuration files for all mass points.
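The kind of substitution that this relies on can be sketched as follows: every occurrence of the original mass-point token (e.g. "m110") in the text of a configuration file is replaced with the new token (e.g. "m115"). This is an illustrative sketch and not a translation of the shell script; the function name replaceMassPoint is hypothetical. Note that a plain substring replacement would also match longer tokens that begin with the same characters (e.g. "m1100"), so it assumes that such names do not occur.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: replace every occurrence of "m<originalMass>" in
// the given text with "m<newMass>".
std::string replaceMassPoint(std::string text, int originalMass, int newMass) {
    const std::string from = "m" + std::to_string(originalMass);
    const std::string to = "m" + std::to_string(newMass);
    std::size_t position = 0;
    while ((position = text.find(from, position)) != std::string::npos) {
        text.replace(position, from.size(), to);
        position += to.size();  // continue after the inserted token
    }
    return text;
}
```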

Changed:
<
<
Example file: createXMLFiles.sh
>
>
Example file: createXMLFiles.sh
 

Line: 856 to 860
  createXMLFiles $originalMass 140 #userSet
Added:
>
>

C++ approach to HistFactory

This approach has not been committed to the RooStats branch yet, though it soon will be. There are a few stylistic changes to be made to the latest code, though verification of the code is complete (changes reproduce exactly results from ROOT 5.32). Code examples will be written in due course.

The C++ approach to HistFactory allows one to interact with HistFactory (creating models etc.) using C++ and Python code. The configuration part of HistFactory is based on a C++ class structure. Since the configuration state of HistFactory is based on C++, one can bypass XML. All functions used in the executable hist2workspace are available for use in C++ code. These C++ classes mirror the structure of the input XML tags. These classes are automatically saved in the output ROOT file along with the workspace. They can then be browsed in CINT or imported in code.

The main advantage of using the C++ approach to HistFactory is that it allows one to easily create many similar models based on an initial template. In the XML approach, one must create many similar XML configuration files and execute hist2workspace many times (though this can be done in an automated manner).

HistFactory class tree structure

  • Measurement
    • std::string POI
    • double Lumi
    • ... etc.
    • std::vector<Channels>
      • Channel
        • Data
          • TH1* Observed
        • StatErrorConfig
        • std::vector<Samples>
          • Sample
            • TH1* Nominal
            • std::vector<NormFactor>
            • std::vector<OverallSys>
            • std::vector<HistoSys>

HistFactory configuration in C++

Example: creation of the Measurement and a Channel and, thence, creation of the channel samples, including signal and backgrounds

// Create the Measurement and a channel
   std::string myInputFile = "./data/myData.root";
   std::string myChannel1Path = "";
   
   // Create the measurement.
      Measurement myMeasurement("myMeasurement", "myMeasurement");
      myMeasurement.OutputFilePrefix = "./workspaces/myWorkspace";
      myMeasurement.POI = "SigXsecOverSM";
      myMeasurement.constantParams.push_back("alpha_syst1");
      myMeasurement.constantParams.push_back("Lumi");
      myMeasurement.Lumi = 1.0;
      myMeasurement.LumiRelErr = 0.10;
      myMeasurement.ExportOnly = false;
      myMeasurement.BinHigh = 2;
   
   // Create a channel.
      Channel myChannel("myChannel1");
      myChannel.SetData("myData", myInputFile);
      myChannel.SetStatErrorConfig(0.05, "Poisson");

// Create the channel samples, including signal and background.
   // Create the signal sample.
      Sample mySignal("mySignal", "mySignal", myInputFile);
      mySignal.AddOverallSys("mySystematic1", 0.95, 1.05);
      mySignal.AddNormFactor("SigXsecOverSM", 1, 0, 3);
      myChannel.samples.push_back(mySignal);
   // Create the background 1 sample.
      Sample myBackground1("myBackground1", "myBackground1", myInputFile);
      myBackground1.ActivateStatError("myBackground1StatisticalUncertainty", myInputFile);
      myBackground1.AddOverallSys("mySystematic2", 0.95, 1.05);
      myChannel.samples.push_back(myBackground1);
   // Create the background 2 sample.
      Sample myBackground2("myBackground2", "myBackground2", myInputFile);
      myBackground2.ActivateStatError();
      myBackground2.AddOverallSys("mySystematic3", 0.95, 1.05);
      myChannel.samples.push_back(myBackground2);
   // Add the channel to the measurement.
      myMeasurement.channels.push_back(myChannel);
 

Links for HistFactory

HistFactory manual

Line: 868 to 944
  Exotics Working Group statistics tutorial workspace examples
Added:
>
>
Early description of C++ approach to HistFactory
 

Analysis!

ATLAS recommends the use of the profile likelihood as a test statistic.

Revision 392012-02-27 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- WilliamBreadenMadden - 2012-01-20
>
>
-- WilliamBreadenMadden - 2012-02-27
 
Line: 753 to 753
 
<!-- signal -->
<Sample Name="signal" HistoName="ttH_m110" NormalizeByTheory="False" >
Deleted:
<
<
 
<!-- systematics: -->
<HistoSys Name="Lumi" HistoNameHigh="ttH_m110_sys_Lumi_up"
Line: 764 to 763
  HistoNameLow="ttH_m110_sys_JES_do" />
Added:
>
>
 

<!-- backgrounds -->
Line: 821 to 821
  # This script is designed for use with the naming conventions described in the following page:
  # https://ppes8.physics.gla.ac.uk/twiki/bin/view/ATLAS/HiggsAnalysisAtATLASUsingRooStats
Changed:
<
<
# The #userInput hashtag is used to indicate places in the script where the user might modify
>
>
# The #userSet hashtag is used to indicate places in the script where the user might modify
 # things.

createXMLFiles()

Line: 846 to 846
 }

# Specify the original mass point.

Changed:
<
<
originalMass=110 #userInput
>
>
originalMass=110 #userSet
  # Execute the function for creation of new XML configuration files for all required mass points.
Changed:
<
<
createXMLFiles $originalMass 115 #userInput
createXMLFiles $originalMass 120 #userInput
createXMLFiles $originalMass 125 #userInput
createXMLFiles $originalMass 130 #userInput
createXMLFiles $originalMass 140 #userInput
>
>
createXMLFiles $originalMass 115 #userSet
createXMLFiles $originalMass 120 #userSet
createXMLFiles $originalMass 125 #userSet
createXMLFiles $originalMass 130 #userSet
createXMLFiles $originalMass 140 #userSet
 

Links for HistFactory

Revision 382012-01-20 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- WilliamBreadenMadden - 2012-01-18
>
>
-- WilliamBreadenMadden - 2012-01-20
 
Line: 18 to 18
  There are instructions on how to use the different versions of ROOT at Glasgow here.
Changed:
<
<
To check what version of ROOT is running, use the following command:

root -v -b

>
>
Execute the following commands in order to set up ROOT version 5.32.00 on PPELX:

export ROOTSYS=/data/ppe01/sl5x/x86_64/root/5.32.00
export PATH=$ROOTSYS/bin:$PATH
export LD_LIBRARY_PATH=$ROOTSYS/lib/root:$LD_LIBRARY_PATH
 
Changed:
<
<
--++ Using ROOT on the CERN LXPLUS network
>
>

Using ROOT on the CERN LXPLUS network

  Execute the following commands in order to set up ROOT version 5.32.00 on LXPLUS:

Revision 372012-01-18 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- WilliamBreadenMadden - 2012-01-13
>
>
-- WilliamBreadenMadden - 2012-01-18
 
Line: 14 to 14
  RooStats is a project to create statistical tools built on top of the RooFit library, which is a data modelling toolkit. It is distributed in ROOT. Specifically, it has been distributed in the ROOT release since version 5.22 (December 2008). The latest version of ROOT is recommended as the RooStats project develops quickly.
Changed:
<
<

Using the appropriate version of ROOT at Glasgow

>
>

Using ROOT on the Glasgow PPELX network

  There are instructions on how to use the different versions of ROOT at Glasgow here.
Line: 23 to 23
  root -v -b

Added:
>
>
--++ Using ROOT on the CERN LXPLUS network

Execute the following commands in order to set up ROOT version 5.32.00 on LXPLUS:

. /afs/cern.ch/sw/lcg/external/gcc/4.3.2/x86_64-slc5/setup.sh
cd /afs/cern.ch/sw/lcg/app/releases/ROOT/5.32.00/x86_64-slc5-gcc43-opt/root/
. bin/thisroot.sh
 

Setting up RooStats

There are three main options available for acquiring ROOT with RooStats included.

Line: 859 to 869
  ATLAS recommends the use of the profile likelihood as a test statistic.
Added:
>
>

Full significance calculation example, from histogram creation to model production (using hist2workspace), to RooStats calculations

Prepare a working area.

Make the main directory.

cd ~
mkdir test
cd test

Make the directory for the XML configuration files.

mkdir config

Make the directory for the input histogram ROOT files.

mkdir data

Make the directory for the workspace.

mkdir workspaces

Generate input histograms.

Change to the directory for the input histograms.

cd data

Run some code such as the following. For fun, we will create a simulated data signal at about 126 GeV with about three times the expected number of events.

Example file: make_test_histograms.c

{
   // Create the expected signal histogram.
      // Create the function used to describe the signal shape (a simple Gaussian shape).
         TF1 mySignalFunction("mySignalFunction", "(1/sqrt(2*pi*0.5^2))*2.718^(-(x-126)^2/(2*0.5^2))", 120, 130);
      // Create the histogram with 100 bins between 120 and 130 GeV.
         TH1F mySignal("mySignal", "mySignal", 100, 120, 130);
      // Fill the histogram using the signal function.
         mySignal.FillRandom("mySignalFunction", 10000000);
   // Create the background histogram.
      // Create the function used to describe the background shape (a simple polynomial of the second order).
         TF1 myBackgroundFunction("myBackgroundFunction", "-2.3*10^(-6)*x^2+0.0007*x+0.4", 120, 130);
      // Create the histogram with 100 bins between 120 and 130 GeV.
         TH1F myBackground("myBackground", "myBackground", 100, 120, 130);
      // Fill the histogram using the background function.
      // Note: FillRandom takes the number of entries as an Int_t, and 100000000000 exceeds the Int_t range.
         myBackground.FillRandom("myBackgroundFunction", 100000000000);
   // Create the (simulated) data histogram. This histogram represents what one might have as real data.
      // Create the histogram by combining the signal histogram multiplied by 3 with the background histogram.
         TH1F myData=3*mySignal+myBackground;
      // Set the name of the histogram.
         myData.SetName("myData");
   // Save the histograms created to a ROOT file.
      TFile myFile("test_histograms.root", "RECREATE");
      mySignal.Write();
      myBackground.Write();
      myData.Write();
      myFile.Close();
}

Create the workspace.

There are two approaches which might be used to create the RooFit workspace. The first method uses the hist2workspace program to run on XML configuration files and input histogram files. The second method explicitly builds the workspace using RooFit code. In general, most should opt for the first, XML-based approach. If you wish to understand how the details of the models are specified, you might take a look at the second approach.

Option 1: Create the workspace using hist2workspace and XML configuration files.

Change to the directory for the configuration XML files.

cd ~/test/config

Copy the HistFactory XML schema to the configuration directory.

cp $ROOTSYS/etc/HistFactorySchema.dtd .

Create XML configuration files in the configuration directory.

Example file: test_top-level.xml

<!DOCTYPE Combination SYSTEM 'HistFactorySchema.dtd'>

<!-- workspace output file prefix -->
<Combination OutputFilePrefix="./workspaces/test_workspace" Mode="comb" >

   <!-- channel XML file(s) -->
   <Input>./config/test_channel.xml</Input>

   <!-- measurement and bin range -->
   <Measurement Name="datastat" Lumi="1" LumiRelErr="0.037" BinLow="0" BinHigh="99" Mode="comb" ExportOnly="True">
      <POI>SigXsecOverSM</POI>
   </Measurement>

</Combination>

Example file: test_channel.xml

<!DOCTYPE Channel SYSTEM 'HistFactorySchema.dtd'>

<!-- channel name and input file -->
<Channel Name="test" InputFile="./data/test_histograms.root" HistoName="">
   
   <!-- data -->
   <Data HistoName="myData" />

   <!-- signal -->
   <Sample Name="signal" HistoName="mySignal"
      NormalizeByTheory="True" >
      <NormFactor Name="SigXsecOverSM" Val="1.0" Low="0.0" High="100." Const="True" />
   </Sample>

   <!-- backgrounds -->
   
   <!-- background -->
   <Sample Name="background" NormalizeByTheory="True" HistoName="myBackground">
   </Sample>

</Channel>

As you can see, the model is very simple. There is the expected signal, the expected background and the actual data (in this case, the data is simulated). The parameter of interest relates to the signal, and the expected data will be compared to the actual data.

Run hist2workspace.

hist2workspace config/test_top-level.xml

Four files should now have been created in the workspaces directory:

test_workspace_combined_datastat_model.root
test_workspace_datastat.root
test_workspace_results.table
test_workspace_test_datastat_model.root

The test_workspace_combined_datastat_model.root file is the ROOT file that contains the combined workspace object (it has constraints, weightings etc. properly incorporated). This is the file you are almost certainly interested in. The test_workspace_test_datastat_model.root file contains a workspace object without proper constraints, weightings etc. The roles of the other two files are not documented here.

Use the ProfileLikelihoodCalculator to calculate the 95% confidence interval on the parameter of interest as specified in the ModelConfig.

Create a C++ program such as the following:

Example file: ProfileLikeliHoodCalculator_confidence_level.cpp

#include <iostream>
#include <fstream>
#include <sstream>
#include <stdio.h>
#include <string.h>
#include <cmath>
#include "TStyle.h"
#include "TROOT.h"
#include "TPluginManager.h"
#include "TSystem.h"
#include "TFile.h"
#include "TGaxis.h"
#include "TCanvas.h"
#include "TH1.h"
#include "TF1.h"
#include "TLine.h"
#include "TSpline.h"
#include "RooAbsData.h"
#include "RooDataHist.h"
#include "RooCategory.h"
#include "RooDataSet.h"
#include "RooRealVar.h"
#include "RooAbsPdf.h"
#include "RooSimultaneous.h"
#include "RooProdPdf.h"
#include "RooNLLVar.h"
#include "RooProfileLL.h"
#include "RooFitResult.h"
#include "RooPlot.h"
#include "RooRandom.h"
#include "RooMinuit.h"
#include "TRandom3.h"
#include "RooWorkspace.h"
#include "RooStats/RooStatsUtils.h"
#include "RooStats/ModelConfig.h"
#include "RooStats/ProfileLikelihoodCalculator.h"
#include "RooStats/LikelihoodInterval.h"
#include "RooStats/LikelihoodIntervalPlot.h"
#include "TStopwatch.h"
using namespace std;
using namespace RooFit;
using namespace RooStats;

int main(){
   // Access the inputs.
      // Open the ROOT workspace file.
         TString myInputFileName = "workspaces/test_workspace_combined_datastat_model.root";
         cout << "Opening file " << myInputFileName << "..." << endl;
         TFile *_file0 = TFile::Open(myInputFileName);
      // Access the workspace.
         cout << "Accessing workspace..." << endl;
         RooWorkspace* myWorkspace = (RooWorkspace*) _file0->Get("combined");
      // Access the ModelConfig
         cout << "Accessing ModelConfig..." << endl;      
         ModelConfig* myModelConfig = (ModelConfig*) myWorkspace->obj("ModelConfig");
      // Access the data.
         cout << "Accessing data..." << endl;
         RooAbsData* myData = myWorkspace->data("obsData");

   // Use the ProfileLikelihoodCalculator to calculate the 95% confidence interval on the parameter of interest as specified in the ModelConfig.
      cout << "Calculating profile likelihood...\n" << endl;
      ProfileLikelihoodCalculator myProfileLikelihood(*myData, *myModelConfig);
      myProfileLikelihood.SetConfidenceLevel(0.95);
      LikelihoodInterval* myConfidenceInterval = myProfileLikelihood.GetInterval();
      // Access the confidence interval on the parameter of interest (POI).
         RooRealVar* myPOI = (RooRealVar*) myModelConfig->GetParametersOfInterest()->first();
   
   // Print the results.
      cout << "Printing results..." << endl;
      // Print the confidence interval on the POI.
         cout << "\n95% confidence interval on the point of interest " << myPOI->GetName()<<": ["<<
            myConfidenceInterval->LowerLimit(*myPOI) << ", "<<
            myConfidenceInterval->UpperLimit(*myPOI) <<"]\n"<<endl;
   return 0;
  }
  

Compile this code using a Makefile such as the following:

Example file: Makefile

ProfileLikeliHoodCalculator_confidence_level : ProfileLikeliHoodCalculator_confidence_level.cpp
	g++ -g -O2 -fPIC -Wno-deprecated -o ProfileLikeliHoodCalculator_confidence_level ProfileLikeliHoodCalculator_confidence_level.cpp `root-config --cflags --libs --ldflags` -lHistFactory -lXMLParser -lRooStats -lRooFit -lRooFitCore -lThread -lMinuit -lFoam -lHtml -lMathMore -I$ROOTSYS/include -L$ROOTSYS/lib

Compile the code.

make

Results

The end result as displayed in the terminal output is the following:

95% confidence interval on the point of interest SigXsecOverSM: [2.99653, 3.00347]
 

Further information

Links for ROOT

Revision 362012-01-13 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- WilliamBreadenMadden - 2012-01-12
>
>
-- WilliamBreadenMadden - 2012-01-13
 
Line: 512 to 512
 

General description

Changed:
<
<
The HistFactory can be used to run an analysis without being a RooFit/RooStats expert. In a nutshell, ROOT files containing input histograms are set up and XML files are set up for those input ROOT files. The XML files specify details on the histograms and specify how RooFit should interpret the information in the files. A little executable contained in ROOT called hist2workspace is used to import the histograms into a RooFit workspace appropriately.
>
>
The HistFactory can be used to run an analysis without being a RooFit/RooStats expert. In a nutshell, ROOT files containing input histograms are set up and XML configuration files are set up for those input ROOT files. The XML configuration files specify details on the histograms, how RooFit should interpret the information in the files and how histograms are to be used in calculations. A little executable contained in ROOT called hist2workspace is used to import the histograms into a RooFit workspace.
 

prepareHistFactory

Changed:
<
<
The ROOT release ships with a script called prepareHistFactory and a binary file called hist2workspace in the $ROOTSYS/bin directory. The prepareHistFactory script prepares a working area. Specifically, it creates results/, data/ and config/ directories. It then copies the HistFactorySchema.dtd and example XML files from the ROOT tutorials directory into the config/ directory. It also copies a ROOT file into the data/ directory for use with the example XML configuration files.
>
>
The ROOT release ships with a script called prepareHistFactory in the $ROOTSYS/bin directory. The prepareHistFactory script prepares a working area. Specifically, it creates results/, data/ and config/ directories. It then copies the HistFactorySchema.dtd and example XML files from the ROOT tutorials directory into the config/ directory. It also copies a ROOT file into the data/ directory for use with the example XML configuration files.
 

HistFactorySchema.dtd

Changed:
<
<
This file, located in $ROOTSYS/etc/, specifies the XML schema. It is typically placed in the config/ directory of a working area together with the top-level XML file and the individual channel XML files. The user should not modify this file. The HistFactorySchema.dtd is commented to specify exactly the meaning of the various options.
>
>
The HistFactorySchema.dtd file, located in $ROOTSYS/etc/, specifies the XML schema. It is typically placed in the config/ directory of a working area together with the top-level XML file and the individual channel XML files. The user should not modify this file. The HistFactorySchema.dtd is commented to specify exactly the meaning of the various options.
 

hist2workspace

Line: 589 to 589
  Val allows for the specification of the specific value.
Changed:
<
<
Const: this has a specific value, not set here, but where the param is defined
>
>
Const: this has a specific value, not set here, but where the param is defined.
 
Specific instructions
Line: 697 to 697
  The input histogram names are of no special significance, however, it is often preferable to have a good naming convention devised. One might consider the order in which one has information in the XML and aim to have histogram names appear in a similar order when listed alphabetically.
Changed:
<
<
Generic example:
>
>
Generic example
 
Type of histogram Histogram naming convention
Phenomenon histogram <phenomenon name>_m<mass point>
Phenomenon upward systematic histogram <phenomenon name>_m<mass point>_sys_<systematic name>_up
Line: 706 to 705
  The reason for having an "m" character preceding the mass point is that it allows for easy search and replace of the mass point value when automatically producing multiple XML configuration files corresponding to multiple mass points.

Changed:
<
<
Specific example:
>
>
Specific example
 
Type of histogram Histogram name
ttH histogram ttH_m110
ttH upward luminosity systematic histogram ttH_m110_sys_Lumi_up
Line: 719 to 717
 
WW Herwig 105987 upward JES systematic histogram WW_Herwig_105987_m110_sys_JES_up
WW Herwig 105987 downward JES systematic histogram WW_Herwig_105987_m110_sys_JES_do
Changed:
<
<
Samples
>
>
Samples
  The sample names are of no special significance, however, it is often preferable to have very short names (e.g., signal1) for the purposes of clarity when, for example, printing the contents of the workspace.

The NormalizeByTheory attribute should be "True" (as opposed to "TRUE" or "true") for all non-data-driven backgrounds. If the Data tag is removed, expected data shall be used in calculations.

Changed:
<
<
The signal(s) are specified as such through the use of the POI tag.
>
>
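The signal(s) are specified as such through the use of the POI tag in the top-level XML configuration file; the POI names a NormFactor that is attached to the signal sample in the channel XML configuration file.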
 
Example file: ttH_m110_channel.xml
Line: 796 to 794
 
Added:
>
>
Create XML configuration files automatically

Once the XML configuration files for a certain mass point are written, these files can be used to automatically create further XML configuration files for all remaining mass points.

If the described naming convention is used (i.e., mass points are prefixed with an "m"), the following shell script might be used to create all required XML configuration files for all mass points.

Example file: createXMLFiles.sh

#!/bin/bash

# This script is designed for use with the naming conventions described in the following page:
# https://ppes8.physics.gla.ac.uk/twiki/bin/view/ATLAS/HiggsAnalysisAtATLASUsingRooStats
# The #userInput hashtag is used to indicate places in the script where the user might modify
# things.

createXMLFiles()
# Arguments: originalMass newMass
{
   originalMass=$1
   newMass=$2
   # Set/specify the original XML configuration file names.
      originalChannelXMLFileName=ttH_m${originalMass}_channel.xml
      originalTopLevelXMLFileName=ttH_m${originalMass}_top-level.xml
   # Set/specify the new XML configuration file names.
      newChannelXMLFileName=ttH_m${newMass}_channel.xml
      newTopLevelXMLFileName=ttH_m${newMass}_top-level.xml
   # Duplicate the original XML configuration files while substituting the original mass
   # points with the new mass points in the new resulting file.
      # Here, the program sed is used to replace all occurrences of a specified pattern in
      # a specified file with another specified pattern.
      # The "s" corresponds to "substitute".
      # The "g" corresponds to "globally" (all instances in a line).
      sed "s/m$originalMass/m$newMass/g" $originalChannelXMLFileName > $newChannelXMLFileName
      sed "s/m$originalMass/m$newMass/g" $originalTopLevelXMLFileName > $newTopLevelXMLFileName
}

# Specify the original mass point.
   originalMass=110 #userInput

# Execute the function for creation of new XML configuration files for all required mass points. 
   createXMLFiles $originalMass 115 #userInput
   createXMLFiles $originalMass 120 #userInput
   createXMLFiles $originalMass 125 #userInput
   createXMLFiles $originalMass 130 #userInput
   createXMLFiles $originalMass 140 #userInput
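
The substitution performed by the script can be tried on a single line before it is run on real files. The following is a minimal sketch: the sample line is a hypothetical stand-in for one line of a channel XML configuration file, and the variable names line and result are illustrative only.

```shell
#!/bin/bash

# Hypothetical stand-in for one line of a channel XML configuration file.
line='<Sample Name="signal" HistoName="ttH_m110" NormalizeByTheory="False" >'

originalMass=110
newMass=115

# Apply the same sed expression as used in createXMLFiles.sh.
result=$(printf '%s\n' "$line" | sed "s/m$originalMass/m$newMass/g")

# Print the rewritten line (the histogram name now refers to m115).
printf '%s\n' "$result"
```

Note that the pattern matches every occurrence of m110 in a line, so a name that happens to contain that string for another reason (for example, a hypothetical sample called m1100) would also be altered.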
 

Links for HistFactory

HistFactory manual

Revision 352012-01-12 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
-- WilliamBreadenMadden - 2012-01-12
Line: 414 to 414
  myFrame->Draw()

Added:
>
>

Links for RooFit

User's Manual

Tutorials

 

RooStats

General description

Line: 486 to 492
 

Added:
>
>

Links for RooStats

Wiki

RooStats User's Guide

Tutorials

E-group for support with ATLAS-sensitive information: atlas-phys-stat-root@cern.ch

E-mail support for software issues, bugs etc.: roostats-development@cern.ch

 

ModelConfig

The ModelConfig RooStats class encapsulates the configuration of a model to define a particular hypothesis. It is now used extensively by the calculator tools. ModelConfig always contains a reference to an external workspace that manages all of the objects that are a part of the model (PDFs and parameter sets). So, in order to use ModelConfig, the user must specify a workspace pointer before creating the various objects of the model.

Line: 667 to 685
 
Changed:
<
<

Analysis!

>
>

Guidance in writing the XML configuration files

 
Changed:
<
<
ATLAS recommends the use of the profile likelihood as a test statistic.
>
>
Histograms
 
Changed:
<
<

Further information

>
>
Units
 
Changed:
<
<

ROOT links:

>
>
The units of the input histogram and the luminosity specified in the top-level XML configuration file should be compatible; for example, input histograms might be in pb while the luminosity might be in pb^{-1} or input histograms might be in fb while the luminosity might be in fb^{-1}.
 
Changed:
<
<
ROOT User's Guide
>
>
Naming convention
 
Changed:
<
<

RooFit links

>
>
The input histogram names are of no special significance, however, it is often preferable to have a good naming convention devised. One might consider the order in which one has information in the XML and aim to have histogram names appear in a similar order when listed alphabetically.
 
Changed:
<
<
User's Manual
>
>
Generic example:
 
Changed:
<
<
Tutorials
>
>
Type of histogram Histogram naming convention
Phenomenon histogram <phenomenon name>_m<mass point>
Phenomenon upward systematic histogram <phenomenon name>_m<mass point>_sys_<systematic name>_up
Phenomenon downward systematic histogram <phenomenon name>_m<mass point>_sys_<systematic name>_do
 
Changed:
<
<

RooStats links

>
>
The reason for having an "m" character preceding the mass point is that it allows for easy search and replace of the mass point value when automatically producing multiple XML configuration files corresponding to multiple mass points.
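
As an illustration of the convention, the following sketch assembles histogram names from their parts. The function histogram_name is hypothetical and is not part of ROOT or HistFactory; it merely encodes the pattern in the table above.

```shell
#!/bin/bash

histogram_name()
# Arguments: phenomenonName massPoint [systematicName direction]
# The direction is "up" or "do", following the convention in the table.
{
   local name="${1}_m${2}"
   # Append the systematic suffix only if a systematic name is given.
   if [ -n "${3:-}" ]; then
      name="${name}_sys_${3}_${4}"
   fi
   printf '%s\n' "$name"
}

histogram_name ttH 110                        # prints ttH_m110
histogram_name ttH 110 JES up                 # prints ttH_m110_sys_JES_up
histogram_name WW_Herwig_105987 110 Lumi do   # prints WW_Herwig_105987_m110_sys_Lumi_do
```

Because the mass point always follows an "m", names built this way remain compatible with a search and replace on the mass point value.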
 
Changed:
<
<
Wiki
>
>
Specific example:
 
Changed:
<
<
RooStats User's Guide
>
>
Type of histogram Histogram name
ttH histogram ttH_m110
ttH upward luminosity systematic histogram ttH_m110_sys_Lumi_up
ttH downward luminosity systematic histogram ttH_m110_sys_Lumi_do
ttH upward JES systematic histogram ttH_m110_sys_JES_up
ttH downward JES systematic histogram ttH_m110_sys_JES_do
WW Herwig 105987 upward luminosity systematic histogram WW_Herwig_105987_m110_sys_Lumi_up
WW Herwig 105987 downward luminosity systematic histogram WW_Herwig_105987_m110_sys_Lumi_do
WW Herwig 105987 upward JES systematic histogram WW_Herwig_105987_m110_sys_JES_up
WW Herwig 105987 downward JES systematic histogram WW_Herwig_105987_m110_sys_JES_do
 
Changed:
<
<
Tutorials
>
>
Samples
 
Changed:
<
<
HistFactory manual
>
>
The sample names are of no special significance, however, it is often preferable to have very short names (e.g., signal1) for the purposes of clarity when, for example, printing the contents of the workspace.
 
Changed:
<
<
E-group for support with ATLAS-sensitive information: atlas-phys-stat-root@cern.ch
>
>
The NormalizeByTheory attribute should be "True" (as opposed to "TRUE" or "true") for all non-data-driven backgrounds. If the Data tag is removed, expected data shall be used in calculations.
 
Changed:
<
<
E-mail support for software issues, bugs etc.: roostats-development@cern.ch
>
>
The signal(s) are specified as such through the use of the POI tag.

Example file: ttH_m110_channel.xml

<!DOCTYPE Channel SYSTEM 'HistFactorySchema.dtd'>

<!-- channel name and input file -->
<Channel Name="ttH_m110" InputFile="data/ttH_histograms.root" HistoName="">
 
Changed:
<
<

Other links

>
>
<!-- data -->
<Data HistoName="data" />

<!-- signal -->
<Sample Name="signal" HistoName="ttH_m110" NormalizeByTheory="False" >
<!-- systematics: -->
<HistoSys Name="Lumi" HistoNameHigh="ttH_m110_sys_Lumi_up" HistoNameLow="ttH_m110_sys_Lumi_do" />
<HistoSys Name="JES" HistoNameHigh="ttH_m110_sys_JES_up" HistoNameLow="ttH_m110_sys_JES_do" />

<!-- backgrounds -->

<!-- WW_Herwig_105987 -->
NormalizeByTheory="True" HistoName="WW_Herwig_105987_m110">

<!-- Wplusjets -->
NormalizeByTheory="False" HistoName="Wplusjets_m110">
<!-- systematics -->
<HistoSys Name="Lumi" HistoNameHigh="Wplusjets_m110_sys_Lumi_up" HistoNameLow="Wplusjets_m110_sys_Lumi_do" />
<HistoSys Name="JES" HistoNameHigh="Wplusjets_m110_sys_JES_up" HistoNameLow="Wplusjets_m110_sys_JES_do" />

Example file: ttH_m110_top-level.xml

<!DOCTYPE Combination SYSTEM 'HistFactorySchema.dtd'>

<!-- workspace output file prefix -->
<Combination OutputFilePrefix="workspaces/ttH_m110_workspace" Mode="comb" >

   <!-- channel XML file(s) -->
   <Input>config/ttH_m110_channel.xml</Input>

   <!-- measurement and bin range -->
   <Measurement Name="datastat" Lumi="1" LumiRelErr="0.037" BinLow="0" BinHigh="21" Mode="comb" ExportOnly="True">
      <POI>SigXsecOverSM</POI>
   </Measurement>

</Combination>

Links for HistFactory

HistFactory manual

HistFactory XML reference

XML example

  Exotics Working Group statistics tutorial XML reference

Exotics Working Group statistics tutorial workspace examples

Changed:
<
<
XML example
>
>

Analysis!

 
Changed:
<
<
HistFactory XML reference
>
>
ATLAS recommends the use of the profile likelihood as a test statistic.

Further information

Links for ROOT

ROOT User's Guide

 

Revision 342012-01-12 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- WilliamBreadenMadden - 2011-12-19
>
>
-- WilliamBreadenMadden - 2012-01-12
 
Line: 498 to 498
 

prepareHistFactory

Changed:
<
<
The ROOT release ships with a script called prepareHistFactory and a binary file called hist2workspace in the $ROOTSYS/bin directory. The prepareHistFactory script prepares a working area. Specifically, it creates the results/, data/ and config/ directories. It copies the HistFactorySchema.dtd and example XML files into the config/ directory. Also, it copies a ROOT file into the data/ directory for use with the examples.
>
>
The ROOT release ships with a script called prepareHistFactory and a binary file called hist2workspace in the $ROOTSYS/bin directory. The prepareHistFactory script prepares a working area. Specifically, it creates results/, data/ and config/ directories. It then copies the HistFactorySchema.dtd and example XML files from the ROOT tutorials directory into the config/ directory. It also copies a ROOT file into the data/ directory for use with the example XML configuration files.
 

HistFactorySchema.dtd

Changed:
<
<
HistFactorySchema.dtd: This file, located in $ROOTSYS/etc/, specifies the XML schema. It is typically placed in the config/ directory of a working area together with the top-level XML file and the individual channel XML files. The user should not modify this file. The HistFactorySchema.dtd is commented to specify exactly the meaning of the various options.
>
>
This file, located in $ROOTSYS/etc/, specifies the XML schema. It is typically placed in the config/ directory of a working area together with the top-level XML file and the individual channel XML files. The user should not modify this file. The HistFactorySchema.dtd is commented to specify exactly the meaning of the various options.
 

hist2workspace

Changed:
<
<
The hist2workspace executable is used in the following manner:
>
>
The hist2workspace executable is run using the top-level XML configuration file as an argument in the following manner:
 
Changed:
<
<
hist2workspace input.xml
>
>
hist2workspace top_level.xml
 

XML files

General description

Changed:
<
<
A minimum of two XML files are required for configuration. The "top-level" XML file defines the measurement and contains a list of channels that contribute to this measurement. "channel" XML files are used to describe each channel in detail. For each contributing channel, there is a separate XML file.
>
>
A minimum of two XML files are required for configuration. The "top-level" XML configuration file defines the measurement and contains a list of channels that contribute to this measurement. The "channel" XML configuration files are used to describe each channel in detail. For each contributing channel, there is a separate XML configuration file.
 

Conventions

Changed:
<
<
The nominal and variational histograms should all have the same normalization convention. There are a few conventions possible:
>
>
The nominal and variational histograms should all have the same normalisation convention. There are a few conventions possible:
 
  • Option 1:
Changed:
<
<
    • Lumi="XXX" in thee main XML's element, where XX is in fb^-1
    • Histograms are in fb / bin
    • Some samples have NormFactors that are all relative to prediction (eg. 1 is the nominal prediction)
>
>
    • Lumi="XXX" is in the main XML's element, where XXX is in fb^-1.
    • Histograms have units of fb/bin.
    • Some samples have NormFactors that are all relative to prediction (e.g. 1 is the nominal prediction).
 
  • Option 2:
Changed:
<
<
    • Lumi="1" in thee main XML's element
    • Histograms are normalized to unity
    • each sample has a NormFactor that is the expected numbers of events in data
>
>
    • Lumi="1" is in the main XML's element.
    • Histograms are normalized to unity.
    • Each sample has a NormFactor that is the expected numbers of events in data.
 
  • Option 3:
Changed:
<
<
    • Lumi="1" in thee main XML's element
    • Histograms are in numbers of events / bin expected in data
    • Some samples have NormFactors that are all relative to prediction (eg. 1 is the nominal prediction)
    • It's up to you. In the end, the expected number is the product of them: N=Lumi*BinContent*NormFactor(s) See the PDF user's guide for more precise equations.
>
>
    • Lumi="1" is in the main XML's element.
    • Histograms have units of (numbers of events)/bin expected in data.
    • Some samples have NormFactors that are all relative to prediction (e.g. 1 is the nominal prediction).
    • It's up to you. In the end, the expected number is a product: N=Lumi*BinContent*NormFactor(s). See the user's guide for more precise equations.
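
To make the product concrete, the following sketch evaluates N=Lumi*BinContent*NormFactor for a single bin under each of the three options. The function expected_events and all of the numbers are invented for illustration; awk is used only because the shell itself has no floating-point arithmetic.

```shell
#!/bin/bash

expected_events()
# Arguments: Lumi BinContent NormFactor
# Prints the product of the three arguments.
{
   awk -v l="$1" -v b="$2" -v n="$3" 'BEGIN { print l * b * n }'
}

# Option 1: Lumi in fb^-1, bin content in fb, NormFactor relative to prediction.
expected_events 5 24 1      # prints 120

# Option 2: Lumi of 1, histogram normalized to unity, NormFactor is the expected number of events.
expected_events 1 0.25 480  # prints 120

# Option 3: Lumi of 1, bin content in numbers of events, NormFactor relative to prediction.
expected_events 1 120 1     # prints 120
```

The three calls describe the same hypothetical bin, which is why they agree: the conventions differ only in which factor carries the normalisation.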
 

Top-Level XML file

General description

This file specifies a top level 'Combination' that is composed of:

Changed:
<
<
  • several 'Channels', which are described in separate XML files.
>
>
  • several 'Channels', which are described in separate XML configuration files.
 
  • several 'Measurements' (corresponding to a full fit of the model), each of which specifies
    • Name, a name for this measurement to be used in tables and files.
Changed:
<
<
    • Lumi, the integrated luminosity associated with the measurement in picobarns
>
>
    • Lumi, the integrated luminosity associated with the measurement in inverse picobarns (pb^-1).
 
    • LumiRelErr, the relative error of the luminosity measurement.
Changed:
<
<
    • which bins of the histogram should be used.
>
>
    • the histogram bins to be used:
 
      • BinLow specifies the lowest bin number used for the measurement (inclusive).
      • BinHigh specifies the highest bin number used for the measurement (exclusive).
Changed:
<
<
    • what the relative uncertainty on the luminosity is.
>
>
    • relative uncertainty on the luminosity.
 
    • what is(/are) the parameter(/s) of interest that will be measured.
      • Use POI to specify this.
    • which parameter(/s) should be fixed/floating (e.g., nuisance parameters)
Line: 561 to 561
 
    • whether the tool should export the model only and skip the default fit.
Changed:
<
<
      • ExportOnly: if "True", skip the fit (only export the model; don't perform the initial fit).
>
>
      • ExportOnly: if "True", skip the fit (export only the model; don't perform the initial fit).
  OutputFilePrefix is a prefix for the output ROOT file to be created.
Added:
>
>
 Mode represents the type of analysis. Use "comb".

ParamSetting allows for the specification of which parameters are fixed. If a parameter is included here, it is neither a nuisance parameter nor a POI, but a fixed parameter of the model.

Revision 332011-12-19 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- WilliamBreadenMadden - 2011-12-06
>
>
-- WilliamBreadenMadden - 2011-12-19
 
Line: 349 to 349
 void ModelInspector(const char* infile = "", const char* workspaceName = "combined", const char* modelConfigName = "ModelConfig", const char* dataName = "obsData")

Changed:
<
<
If the worspace(s) were made using hist2workspace, the names have a standard form (as shown above).
>
>
If the workspace(s) were made using hist2workspace, the names have a standard form (as shown above).
 
Using the Model Inspector

Revision 322011-12-06 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
-- WilliamBreadenMadden - 2011-12-06
Line: 545 to 545
 This file specifies a top level 'Combination' that is composed of:
  • several 'Channels', which are described in separate XML files.
  • several 'Measurements' (corresponding to a full fit of the model), each of which specifies
Changed:
<
<
    • a name for this measurement to be used in tables and files.
    • the luminosity associated with the measurement in picobarns.
>
>
    • Name, a name for this measurement to be used in tables and files.
    • Lumi, the integrated luminosity associated with the measurement in picobarns
    • LumiRelErr, the relative error of the luminosity measurement.
 
    • which bins of the histogram should be used.
Added:
>
>
      • BinLow specifies the lowest bin number used for the measurement (inclusive).
      • BinHigh specifies the highest bin number used for the measurement (exclusive).
 
    • what the relative uncertainty on the luminosity is.
    • what is(/are) the parameter(/s) of interest that will be measured.
Added:
>
>
      • Use POI to specify this.
 
    • which parameter(/s) should be fixed/floating (e.g., nuisance parameters)
    • which type of constraints are desired:
Added:
>
>
 
      • default: Gaussian
Changed:
<
<
>
>
 
    • whether the tool should export the model only and skip the default fit.
Added:
>
>
      • ExportOnly: if "True", skip the fit (only export the model; don't perform the initial fit).

OutputFilePrefix is a prefix for the output ROOT file to be created. Mode represents the type of analysis. Use "comb".

ParamSetting allows for the specification of which parameters are fixed. If a parameter is included here, it is neither a nuisance parameter nor a POI, but a fixed parameter of the model. Val allows for the specification of the specific value. Const: this has a specific value, not set here, but where the param is defined

 
Specific instructions
Line: 601 to 614
 
General description

These files specify for each channel

Added:
>
>
  • Name, a name for the channel.
  • InputFile, the input file in which the histogram can be found. If this is not specified, it must be specified for each sample and data.
  • HistoPath, the path (within the ROOT file) at which the histogram can be found.
  • HistoName (optional), the name of the histogram to be used, unless overridden for specific samples and data.
 
  • observed data (if absent, the tool will use the expectation, which is useful for expected sensitivity)
  • several 'Samples' (e.g., signal, bkg1, bkg2 etc.), each of which specifies
Changed:
<
<
    • a name.
>
>
    • Name, a name
 
    • whether the sample is normalized by theory (e.g., N = L * sigma) or whether the sample is data driven.
    • a nominal expectation histogram.
    • a named 'Normalization Factor' (which can be fixed or allowed to float in a fit).
Line: 685 to 702
  XML example
Added:
>
>
HistFactory XML reference
 

-- WilliamBreadenMadden - 2010-10-29 \ No newline at end of file

Revision 312011-12-06 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- WilliamBreadenMadden - 2011-12-02
>
>
-- WilliamBreadenMadden - 2011-12-06
 
Line: 518 to 518
  A minimum of two XML files are required for configuration. The "top-level" XML file defines the measurement and contains a list of channels that contribute to this measurement. "channel" XML files are used to describe each channel in detail. For each contributing channel, there is a separate XML file.
Added:
>
>

Conventions

The nominal and variational histograms should all have the same normalization convention. There are a few conventions possible:

  • Option 1:
    • Lumi="XXX" in thee main XML's element, where XX is in fb^-1
    • Histograms are in fb / bin
    • Some samples have NormFactors that are all relative to prediction (eg. 1 is the nominal prediction)

  • Option 2:
    • Lumi="1" in thee main XML's element
    • Histograms are normalized to unity
    • each sample has a NormFactor that is the expected numbers of events in data

  • Option 3:
    • Lumi="1" in thee main XML's element
    • Histograms are in numbers of events / bin expected in data
    • Some samples have NormFactors that are all relative to prediction (eg. 1 is the nominal prediction)
    • It's up to you. In the end, the expected number is the product of them: N=Lumi*BinContent*NormFactor(s) See the PDF user's guide for more precise equations.
 

Top-Level XML file

General description

Revision 302011-12-02 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
-- WilliamBreadenMadden - 2011-12-02
Line: 170 to 168
 Hey, RooFit! This is the thing to normalise over (i.e., guarantees Int[xmin, xmax] Gauss(x, m, s)dx == 1): gauss.getVal(x); What is the value if sigma is considered the observable? (i.e., guarantees Int[smin, smax] Gauss(x, m, s)ds == 1):
Deleted:
<
<
 

Datasets

General description

Changed:
<
<
In RooFit, data can be stored in an unbinned or binned manner. Unbinned data is stored using the RooDataSet class while binned data is stored using the RooDataHist class. The user can define how many bins there are in a variable. For the purposes of plotting using the RooPlot class, a RooDataSet is binned into a histogram.
>
>
A dataset is a collection of points in N-dimensional space. In RooFit, data can be stored in an unbinned or binned manner. Unbinned data is stored using the RooDataSet class while binned data is stored using the RooDataHist class. The user can define how many bins there are in a variable. For the purposes of plotting using the RooPlot class, a RooDataSet is binned into a histogram.
 
Changed:
<
<
In general, working in RooFit with binned and unbinned data is very similar, as both class RooDataSet (for unbinned data) and class RooDataHist (for binned data) inherit from a common base class, RooAbsData, which defines the interface for a generic abstract data sample. With few exceptions, all RooFit methods take abstract datasets as input arguments, allowing for the interchangeable use of binned and unbinned data.
>
>
In general, working in RooFit with binned and unbinned data is very similar, as both the RooDataSet (for unbinned data) and RooDataHist (for binned data) classes inherit from a common base class, RooAbsData, which defines the interface for a generic abstract data sample. With few exceptions, all RooFit methods take abstract datasets as input arguments, allowing for the interchangeable use of binned and unbinned data.
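The relationship between unbinned and binned data can be sketched without RooFit at all. The following is a minimal plain C++ illustration, assuming fixed-width bins; the helper name binValues is invented for this sketch and is not a RooFit function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fill a fixed-width histogram from a list of unbinned values.
// Illustrative helper only; this is not part of the RooFit API.
std::vector<int> binValues(const std::vector<double>& values,
                           double xMin, double xMax, std::size_t nBins) {
    assert(xMax > xMin && nBins > 0);
    std::vector<int> counts(nBins, 0);
    const double width = (xMax - xMin) / static_cast<double>(nBins);
    for (double v : values) {
        if (v < xMin || v >= xMax) continue;  // points outside the range are dropped
        std::size_t bin = static_cast<std::size_t>((v - xMin) / width);
        if (bin >= nBins) bin = nBins - 1;    // guard against rounding at the upper edge
        ++counts[bin];
    }
    return counts;
}
```

The same list of values can be binned with any choice of nBins, whereas the bin counts alone cannot be turned back into the individual values. This is why an unbinned dataset can be displayed in any preferred binning while a binned dataset is tied to the binning it was stored with.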
 

RooDataSet (unbinned data)

Line: 193 to 190
  myData->plotOn(myFrame); myFrame->Draw(); }
Deleted:
<
<
 

Changed:
<
<
Plotting unbinned data is similar to plotting binned data with the exception that one can show it in some preferred binning.
>
>
Plotting unbinned data is similar to plotting binned data with the exception that one can display it in some preferred binning.
 
Example code: plotting unbinned data (a RooDataSet) using a specified binning

Line: 203 to 199
 RooPlot* myFrame = x.frame() ; myData.plotOn(myFrame, Binning(25)) ; myFrame->Draw()
Deleted:
<
<
 

Importing data from ROOT trees (how to populate RooDataSets from TTrees)
Changed:
<
<
* Put stuff here, Will. *
>
>
* to be completed *
 

RooDataHist (binned data)

Line: 263 to 256
  The RooFit "workspace" provides the ability to store the full likelihood model, any desired priors and the minimal data necessary to reproduce the likelihood function in a ROOT file. Thus, the workspace is needed for combinations and has potential for digitally publishing results (PhyStats agreed to publish likelihood functions). The RooFit workspace can be used for this.
Changed:
<
<
One might create a Gaussian PDF and then import it into a workspace. All of the dependencies of the Gaussian shall be drawn in and owned by the workspace. There are no nightmarish ownership problems. Alternatively, one might create simply the Gaussian inside the workspace using the "Workspace Factory".
>
>
Consider a Gaussian PDF. One might create this Gaussian PDF and then import it into a workspace. All of the dependencies of the Gaussian would be drawn in and owned by the workspace (there are no nightmarish ownership problems). Alternatively, one might simply create the Gaussian inside the workspace using the "Workspace Factory".
 

Example code: using the Workspace Factory to create a Gaussian PDF

Line: 285 to 277
  myWorkspace.Print(); // Example printout:
Changed:
<
<
variables --------- (x,m,s)

p.d.f.s ------- RooGaussian::g[ x=x mean=m sigma=s ] = 0

datasets -------- RooDataSet::d(x)

>
>
// variables // --------- // (x,m,s) // // p.d.f.s // ------- // RooGaussian::g[ x=x mean=m sigma=s ] = 0 // // datasets // -------- // RooDataSet::d(x)
  // Import the variable saved as x. RooRealVar* myVariable = myWorkspace.var("x");
Line: 349 to 340
  // 0x15f7fe0/V- RooRealVar::x = 150 // 0x1487090/V- RooRealVar::mu = 150 // 0x1487bc0/V- RooRealVar::sigma = 5
Deleted:
<
<
 

Model Inspector

The Model Inspector is a GUI for examining the model contained in the RooFit workspace. The function and its parameters are as follows:

Changed:
<
<
>
>


 void ModelInspector(const char* infile = "", const char* workspaceName = "combined", const char* modelConfigName = "ModelConfig", const char* dataName = "obsData")
Added:
>
>

 
Changed:
<
<
If the workspace(/s) were made using hist2workspace, the names have a standard form (as shown above).
>
>
If the workspace(s) were made using hist2workspace, the names have a standard form (as shown above).
 
Using the Model Inspector
Line: 505 to 494
 

General description

Changed:
<
<
The HistFactory can be used to run an analysis without being a RooFit/RooStats expert. In a nutshell, ROOT files containing input histograms are set up and XML files are set up for those input ROOT files. The XML files specify details on the histograms and specify how RooFit should interpret the information in the files. An little executable contained in ROOT called hist2workspace is used to import the histograms into a RooFit workspace appropriately.
>
>
The HistFactory can be used to run an analysis without being a RooFit/RooStats expert. In a nutshell, ROOT files containing input histograms are set up and XML files are set up for those input ROOT files. The XML files specify details on the histograms and specify how RooFit should interpret the information in the files. A little executable contained in ROOT called hist2workspace is used to import the histograms into a RooFit workspace appropriately.
 

prepareHistFactory

Changed:
<
<
The ROOT release ships with a script called prepareHistFactory and a binary file called hist2workspace in the $ROOTSYS/bin directories. prepareHistFactory prepares a working area. It creates results/, data/ and config/ directories. It copies the HistFactorySchema.dtd and example XML files into the config/ directory. Also, it copies a ROOT file into the data/ directory for use with the examples.
>
>
The ROOT release ships with a script called prepareHistFactory and a binary file called hist2workspace in the $ROOTSYS/bin directory. The prepareHistFactory script prepares a working area. Specifically, it creates the results/, data/ and config/ directories. It copies the HistFactorySchema.dtd and example XML files into the config/ directory. Also, it copies a ROOT file into the data/ directory for use with the examples.
 

HistFactorySchema.dtd

Changed:
<
<
HistFactorySchema.dtd: This file is located in $ROOTSYS/etc/ specifies the XML schema. It is typically placed in the config/ directory of a working area together with the top-level XML file and the individual channel XML files. The user should not modify this file. The HistFactorySchema.dtd is commented to specify exactly the meaning of the various options.
>
>
HistFactorySchema.dtd: This file, located in $ROOTSYS/etc/, specifies the XML schema. It is typically placed in the config/ directory of a working area together with the top-level XML file and the individual channel XML files. The user should not modify this file. The HistFactorySchema.dtd is commented to specify exactly the meaning of the various options.
 

hist2workspace

Changed:
<
<
The hist2workspace executable is used in the following manner: hist2workspace input.xml
>
>
The hist2workspace executable is used in the following manner:

   hist2workspace input.xml
 

XML files

Revision 29 - 2011-12-02 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- WilliamBreadenMadden - 2011-10-04
>
>
-- WilliamBreadenMadden - 2011-12-02
 

Higgs analysis at ATLAS using RooStats

Deleted:
<
<
* THIS PAGE IS UNDER CONSTRUCTION *
 This page contains basic information on getting started with Higgs analysis at ATLAS using RooStats.

Deleted:
<
<

A note on code and formatting

Example code is generally indented and given highlighting. Explanatory code and text directly relating to it has yellow highlighting while example code that one might run directly in ROOT is given a classical green-text-on-black-background console style. Scripts are given grey highlighting. File contents are given the golden verbatim colouring. So, in a nutshell, code segments are in yellow, full ROOT scripts are in terminal, general scripts are in grey and file contents are in gold. In code examples given, user-created objects are generally prefaced with "my", for example, "myData", for the purposes of clarity.

 

What is RooStats?

Changed:
<
<
RooStats is a project to create statistical tools built on top of the RooFit library, which is a data modelling toolkit. It is distributed in ROOT. Specifically, it has been distributed in the ROOT release since version 5.22 (December 2008). The latest version of ROOT is recommended as the RooStats project is developing quickly.
>
>
RooStats is a project to create statistical tools built on top of the RooFit library, which is a data modelling toolkit. It is distributed in ROOT. Specifically, it has been distributed in the ROOT release since version 5.22 (December 2008). The latest version of ROOT is recommended as the RooStats project develops quickly.
 

Using the appropriate version of ROOT at Glasgow

There are instructions on how to use the different versions of ROOT at Glasgow here.

To check what version of ROOT is running, use the following command:

Changed:
<
<
root -v -b
>
>

root -v -b

 

Setting up RooStats

Line: 51 to 46
 # Next, the latest version of ROOT in the CERN Subversion repository is checked out. # Finally, ROOT is compiled.
Changed:
<
<
# Install the ROOT prerequisites.
>
>
# Install ROOT prerequisites.
  sudo apt-get install subversion sudo apt-get install make sudo apt-get install g++
Line: 62 to 57
  sudo apt-get install libxft-dev sudo apt-get install libxext-dev
Changed:
<
<
# Install the optional ROOT packages.
>
>
# Install optional ROOT packages.
  sudo apt-get install gfortran sudo apt-get install ncurses-dev sudo apt-get install libpcre3-dev
Line: 80 to 75
  sudo apt-get install libssl-dev sudo apt-get install libgsl0-dev
Changed:
<
<
# Check out latest ROOT trunk. svn co http://root.cern.ch/svn/root/trunk ~/root
>
>
# Check out the latest ROOT trunk. svn co http://root.cern.ch/svn/root/trunk /usr/local/root
 
Changed:
<
<
# The configuration for the build is set. cd ~/root # Run this to define the system architecture and to enable building of the libRooFit advanced fitting package:
>
>
# Configure the build. cd /usr/local/root # Configure for the system architecture and configure to build the libRooFit advanced fitting package as part of the compilation.
  ./configure linuxx8664gcc --enable-roofit # See other possible configurations using the following command: ./configure --help
Changed:
<
<
# Start compiling.
>
>
# Compile.
  make
Added:
>
>

 
Changed:
<
<
# Upon completion, ROOT is run by executing ~/root/bin/root.

# The following line could be added to the ~/.bashrc file: # export PATH=$PATH:/home/wbm/root/bin

>
>
On a MacBook Pro 7,1 running Ubuntu 11.04, the compilation takes about 1 hour. Following the compilation, the ROOT environment variables can be set up. In Ubuntu or Scientific Linux, the following lines could be added to the ~/.bashrc file:

   export ROOTSYS=/usr/local/root
   export PATH=$ROOTSYS/bin:$PATH
   export LD_LIBRARY_PATH=$ROOTSYS/lib/root:$LD_LIBRARY_PATH

 

Option 3: Build the RooStats branch.

Line: 111 to 107
  The RooFit library provides a toolkit for modelling the expected distribution of events in a physics analysis. Models can be used to perform unbinned maximum likelihood fits, produce plots and generate "toy Monte Carlo" samples for various studies.
Changed:
<
<
The core functionality of RooFit is to enable the modelling of 'event data' distributions, where each event is a discrete occurrence in time and has one or more measured observables associated with it. Experiments of this nature result in datasets obeying Poisson (or binomial) statistics. The natural modeling language for such distributions are probability density functions (PDFs), F(x;p), that describe the probability density of the distribution of observables x in terms of the function parameter p.
>
>
The core functionality of RooFit is to enable the modelling of 'event data' distributions, in which each event is a discrete occurrence in time and has one or more measured observables associated with it. Experiments of this nature result in datasets obeying Poisson (or binomial) statistics. The natural modelling language for such distributions is that of probability density functions (PDFs), F(x;p), which describe the probability density of the distribution of the observables x in terms of the function parameters p.
 
Changed:
<
<
In RooFit, every variable, data point, function and PDF is represented in a C++ object. So, for example, in constructing a RooFit model, the mathematical components of the model map to separate C++ objects. Objects are classified by the data or function type that they represent, not by their respective role in a particular setup. All objects are self-documenting. The name of an object is a unique identified for the object while the title of an object is a more elaborate description of the object.
>
>
In RooFit, every variable, data point, function and PDF is represented in a C++ object. So, for example, in constructing a RooFit model, the mathematical components of the model map to separate C++ objects. Objects are classified by the data or function type that they represent, not by their respective role in a particular setup. All objects are self-documenting. The name of an object is a unique identifier for the object while the title of an object is a more elaborate description of the object.
  Here are a few examples of mathematical concepts that correspond to various RooFit classes:

Revision 28 - 2011-10-04 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
-- WilliamBreadenMadden - 2011-10-04
Line: 357 to 357
 

Added:
>
>
Model Inspector

The Model Inspector is a GUI for examining the model contained in the RooFit workspace. The function and its parameters are as follows:

void ModelInspector(const char* infile = "", const char* workspaceName = "combined", const char* modelConfigName = "ModelConfig", const char* dataName = "obsData")

If the workspace(/s) were made using hist2workspace, the names have a standard form (as shown above).

Using the Model Inspector

// Load the macro.
root -L ModelInspector.C++
// Run the macro on the appropriate ROOT workspace file.
ModelInspector("results/my_combined_example_model.root")

The Model Inspector GUI should appear. The GUI consists of a number of plots, corresponding to the various channels in the model, and a few sliders, corresponding to the parameters of interest and the nuisance parameters in the model. The initial plots are based on the values of the parameters in the workspace. There is a little "Fit" button which fits the model to the data points (while also printing the standard terminal output detailing the fitting). After fitting, a yellow band is shown around the best fit model indicating the uncertainty from propagating the uncertainty of the fit through the model. On the plots, there is a red line shown (corresponding to the fit for each of the parameters at their nominal value pushed up by 1 sigma).

How do I get it?

You can get the ModelInspector.C from the Statistics Forum RooStats tools here.

Links

Video illustrating the usage of the Model Inspector

 

Accessing the RooFit workspace

Example code: accessing the workspace
Line: 403 to 430
 

Deleted:
<
<
Model Inspector

The Model Inspector is a GUI for examining the model contained in the RooFit workspace. The function and its parameters are as follows:

void ModelInspector(const char* infile = "", const char* workspaceName = "combined", const char* modelConfigName = "ModelConfig", const char* dataName = "obsData")

If the workspace(/s) were made using hist2workspace, the names have a standard form (as shown above).

Using the Model Inspector

// Load the macro.
root -L ModelInspector.C++
// Run the macro on the appropriate ROOT workspace file.
ModelInspector("results/my_combined_example_model.root")

The Model Inspector GUI should appear. The GUI consists of a number of plots, corresponding to the various channels in the model, and a few sliders, corresponding to the parameters of interest and the nuisance parameters in the model. The initial plots are based on the values of the parameters in the workspace. There is a little "Fit" button which fits the model to the data points (while also printing the standard terminal output detailing the fitting). After fitting, a yellow band is shown around the best fit model indicating the uncertainty from propagating the uncertainty of the fit through the model. On the plots, there is a red line shown (corresponding to the fit for each of the parameters at their nominal value pushed up by 1 sigma).

How do I get it?

You can get the ModelInspector.C from the Statistics Forum RooStats tools here.

Links

Video illustrating the usage of the Model Inspector

 

RooStats

General description

Revision 27 - 2011-10-04 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- WilliamBreadenMadden - 2011-09-13
>
>
-- WilliamBreadenMadden - 2011-10-04
 
Line: 403 to 403
 

Added:
>
>
Model Inspector

The Model Inspector is a GUI for examining the model contained in the RooFit workspace. The function and its parameters are as follows:

void ModelInspector(const char* infile = "", const char* workspaceName = "combined", const char* modelConfigName = "ModelConfig", const char* dataName = "obsData")

If the workspace(/s) were made using hist2workspace, the names have a standard form (as shown above).

Using the Model Inspector

// Load the macro.
root -L ModelInspector.C++
// Run the macro on the appropriate ROOT workspace file.
ModelInspector("results/my_combined_example_model.root")

The Model Inspector GUI should appear. The GUI consists of a number of plots, corresponding to the various channels in the model, and a few sliders, corresponding to the parameters of interest and the nuisance parameters in the model. The initial plots are based on the values of the parameters in the workspace. There is a little "Fit" button which fits the model to the data points (while also printing the standard terminal output detailing the fitting). After fitting, a yellow band is shown around the best fit model indicating the uncertainty from propagating the uncertainty of the fit through the model. On the plots, there is a red line shown (corresponding to the fit for each of the parameters at their nominal value pushed up by 1 sigma).

How do I get it?

You can get the ModelInspector.C from the Statistics Forum RooStats tools here.

Links

Video illustrating the usage of the Model Inspector

 

RooStats

General description

Revision 26 - 2011-09-13 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- WilliamBreadenMadden - 2011-08-28
>
>
-- WilliamBreadenMadden - 2011-09-13
 
Line: 26 to 26
  To check what version of ROOT is running, use the following command:
Changed:
<
<
root -v -b
>
>
root -v -b
 

Setting up RooStats

Line: 42 to 40
  Follow the appropriate instructions here to build the ROOT trunk.
Deleted:
<
<
 

Shell script: building ROOT with RooFit and RooStats

Changed:
<
<

>
>

 #!/bin/bash

# This script builds the latest version of ROOT with RooFit and RooStats in Ubuntu.

Line: 100 to 97
 # The following line could be added to the ~/.bashrc file: # export PATH=$PATH:/home/wbm/root/bin
Changed:
<
<

>
>

 

Option 3: Build the RooStats branch.

Line: 130 to 127
  Composite functions correspond to composite objects. The RooArgList class is dependent on argument order while the RooArgSet class is not.
Deleted:
<
<
 

Example code: defining a RooFit variable

Changed:
<
<

>
>

 General form for defining a RooFit variable: RooRealVar x(<object name>, <object title>, <value>, <minimum value>, <maximum value>) Specific example for defining a RooFit variable x with the value 5: RooRealVar x("x", "x observable", 5, -10, 10)
Changed:
<
<

>
>

 

RooPlot

Line: 154 to 150
  Bayes' theorem holds that the probability of b given a is related to the probability of a given b. Normally, one might say that the mean of a Gaussian is 1 and that this then gives a distribution for x. However, if one had a dataset for x, one might want to know the posterior for the mean. RooFit's ability to switch which variable the probability density function is a density in is very useful for Bayesian analyses. Because RooFit allows this composition, the thing that acts as x in the Gaussian could actually be a function of x, say, x^2. A Jacobian factor is picked up in going from x to x^2, so the original normalisation no longer holds. Sometimes RooFit can figure out what the Jacobian factor should be and sometimes it resorts to numerical integration.
Deleted:
<
<
 

Example code: create a Gaussian PDF using RooFit and plot it using the RooPlot class

Changed:
<
<

>
>

 { // Build a Gaussian PDF. RooRealVar x("x", "x", -10, 10);
Line: 168 to 163
  // Plot the PDF. RooPlot* xframe = x.frame(); gauss.plotOn(xframe);
Changed:
<
<
xframe->Draw();
>
>
xframe->Draw();
 }
Changed:
<
<

>
>

 
Deleted:
<
<
 

Example code: telling a RooFit PDF what to normalise over

Changed:
<
<

>
>

 Not normalised (i.e., this is not a PDF): gauss.getVal(); Hey, RooFit! This is the thing to normalise over (i.e., guarantees Int[xmin, xmax] Gauss(x, m, s)dx == 1): gauss.getVal(x); What is the value if sigma is considered the observable? (i.e., guarantees Int[smin, smax] Gauss(x, m, s)ds == 1):
Changed:
<
<

>
>

 

Datasets

Line: 194 to 188
 

RooDataSet (unbinned data)

Deleted:
<
<
 
Example code: generating toy Monte Carlo, storing it as unbinned data and then plotting it
Changed:
<
<

>
>

 { // Create a RooDataSet and fill it with generated toy Monte Carlo data: RooDataSet* myData = gauss.generate(x, 100);
Line: 206 to 199
  myFrame.Draw(); }
Changed:
<
<

>
>

  Plotting unbinned data is similar to plotting binned data with the exception that one can show it in some preferred binning.
Deleted:
<
<
 
Example code: plotting unbinned data (a RooDataSet) using a specified binning
Changed:
<
<

>
>

 RooPlot* myFrame = x.frame() ; myData.plotOn(myFrame, Binning(25)) ;
Changed:
<
<
myFrame->Draw()
>
>
myFrame->Draw()
 
Changed:
<
<

>
>

 
Importing data from ROOT trees (how to populate RooDataSets from TTrees)
Line: 231 to 223
  In displaying the data, RooFit, by default, shows the 68% confidence interval for Poisson statistics.
Deleted:
<
<
 
Example code: import a ROOT histogram into a RooDataHist (a RooFit binned dataset)
Changed:
<
<

>
>

 { // Access the file. TFile* myFile = new TFile("myFile.root"); // Load the histogram.
Changed:
<
<
TH1* myHistogram = (TH1*) myFile->Get("myHistogram");
>
>
TH1* myHistogram = (TH1*) myFile->Get("myHistogram");
  // Draw the loaded histogram. myHistogram.Draw();
Line: 254 to 245
  myFrame.Draw() }
Changed:
<
<

>
>

 

Fitting

Fitting a model to data

Changed:
<
<
Fitting a model to data can be done in many ways. The most common methods are the χ2 fit and the -log(L) fit. The default fitting method in ROOT is the χ2 method, while the default method in RooFit is the -log(L) method. The -log(L) method is often preferred because it is more robust for low statistics fits and because it can also be performed on unbinned data.
>
>
Fitting a model to data can be done in many ways. The most common methods are the χ2 fit and the -log(L) fit. The default fitting method in ROOT is the χ2 method, while the default method in RooFit is the -log(L) method. The -log(L) method is often preferred because it is more robust for low statistics fits and because it can also be performed on unbinned data.
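To make the -log(L) method concrete, here is a minimal plain C++ sketch of an unbinned negative log-likelihood for a Gaussian model. It is an illustration under simplifying assumptions (fixed sigma, no range normalisation) and is not how RooFit implements fitTo; the function names are invented for this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Negative log-likelihood of unbinned data under a Gaussian(mean, sigma).
// Every data point contributes individually, so no binning is needed.
double gaussianNll(const std::vector<double>& data, double mean, double sigma) {
    assert(sigma > 0.0);
    const double pi = 3.14159265358979323846;
    double nll = 0.0;
    for (double x : data) {
        const double z = (x - mean) / sigma;
        nll += 0.5 * z * z + std::log(sigma * std::sqrt(2.0 * pi));
    }
    return nll;
}

// For fixed sigma, the mean that minimises the Gaussian -log(L) is the sample mean.
double sampleMean(const std::vector<double>& data) {
    assert(!data.empty());
    double sum = 0.0;
    for (double x : data) sum += x;
    return sum / static_cast<double>(data.size());
}
```

For the toy values {1, 2, 3} with sigma fixed to 1, gaussianNll is smaller at mean = 2 (the sample mean) than at mean = 1.5 or mean = 2.5. A minimiser does the same comparison numerically over all floating parameters.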
 

Fitting a PDF to unbinned data

Deleted:
<
<
 
Example code: fit a Gaussian PDF to data
Changed:
<
<

>
>

 // Fit gauss to unbinned data gauss.fitTo(*myData);
Changed:
<
<

>
>

 

The RooFit workspace

Line: 280 to 270
  One might create a Gaussian PDF and then import it into a workspace. All of the dependencies of the Gaussian shall be drawn in and owned by the workspace. There are no nightmarish ownership problems. Alternatively, one might create simply the Gaussian inside the workspace using the "Workspace Factory".
Deleted:
<
<
 

Example code: using the Workspace Factory to create a Gaussian PDF

Changed:
<
<

>
>

 // Create a Gaussian PDF using the Workspace Factory (this is essentially the shorthand for creating a Gaussian). RooWorkspace* myWorkspace = new RooWorkspace("myWorkspace");
Changed:
<
<
myWorkspace->factory("Gaussian::g(x[-5, 5], mu[0], sigma[1])");
>
>
myWorkspace->factory("Gaussian::g(x[-5, 5], mu[0], sigma[1])");
 
Changed:
<
<

>
>

 

What's in the RooFit workspace?

Deleted:
<
<
 

Example code: What's in the workspace?

Changed:
<
<

>
>

 // Open the appropriate ROOT file. root -l myFile.root // Import the workspace.
Changed:
<
<
myWorkspace = (RooWorkspace*) _file0->Get("myWorkspace");
>
>
myWorkspace = (RooWorkspace*) _file0->Get("myWorkspace");
 // Print the workspace contents. myWorkspace.Print(); // Example printout:
Line: 323 to 311
 // Import the ModelConfig saved as m. ModelConfig* myModelConfig = (ModelConfig*) myWorkspace.obj("m");
Changed:
<
<

>
>

 

Visual representations of the model/PDF contents

Line: 331 to 319
  Graphviz consists of a graph description language called the DOT language and a set of tools that can generate and/or process DOT files.
Deleted:
<
<
 
Example code: examining PDFs and creating graphical representations of them
Changed:
<
<

>
>

 // Create variables and a PDF using those variables. RooRealVar mu("mu", "mu", 150); RooRealVar sigma("sigma", "sigma", 5, 0, 20);
Line: 368 to 355
  // 0x1487090/V- RooRealVar::mu = 150 // 0x1487bc0/V- RooRealVar::sigma = 5
Changed:
<
<

>
>

 

Accessing the RooFit workspace

Deleted:
<
<
 
Example code: accessing the workspace
Changed:
<
<

>
>

 // Open the appropriate ROOT file. root -l BR5_MSSM_signal90_combined_datastat_model.root // Alternatively, you could open the file in a manner such as the following: myFileName = "BR5_MSSM_signal90_combined_datastat_model.root" TFile *myFile = TFile::Open(myFileName); // Import the workspace.
Changed:
<
<
RooWorkspace* myWorkspace = (RooWorkspace*) _file0->Get("combined");
>
>
RooWorkspace* myWorkspace = (RooWorkspace*) _file0->Get("combined");
 // Print the workspace contents.
Changed:
<
<
myWorkspace->Print();
>
>
myWorkspace->Print();
 // Import the PDF.
Changed:
<
<
RooAbsPdf* myPDF = myWorkspace->pdf("model_BR5_MSSM_signal90");
>
>
RooAbsPdf* myPDF = myWorkspace->pdf("model_BR5_MSSM_signal90");
 // Import the variable representing the observable.
Changed:
<
<
RooRealVar* myObservable = myWorkspace->var("obs");
>
>
RooRealVar* myObservable = myWorkspace->var("obs");
 // Create a RooPlot frame using the imported variable. RooPlot* myFrame = myObservable->frame(); // Plot the PDF on the created RooPlot frame. myPDF->plotOn(myFrame); // Draw the RooPlot.
Changed:
<
<
myFrame->Draw();
>
>
myFrame->Draw();
 
Changed:
<
<

>
>

 
Deleted:
<
<
 
Example code: accessing both data and PDF from a workspace stored in a file
Changed:
<
<

>
>

 // Note that the following code is independent of the actual PDF in the file. So, for example, a full Higgs combination could work with identical code.

// Open a file and import the workspace. TFile myFile("myResults.root") ; RooWorkspace* myWorkspace = (RooWorkspace*) myFile.Get("myWorkspace") ; // Plot the data and PDF

Changed:
<
<
RooPlot* xframe = myWorkspace->var("x")->frame() ; myWorkspace->data("d")->plotOn(xframe) ; myWorkspace->pdf("g")->plotOn(xframe) ;
>
>
RooPlot* xframe = myWorkspace->var("x")->frame() ; myWorkspace->data("d")->plotOn(xframe) ; myWorkspace->pdf("g")->plotOn(xframe) ;
 // Construct a likelihood and profile likelihood
Changed:
<
<
RooNLLVar nll("nll","nll",*myWorkspace->pdf("g"),*myWorkspace->data("d")) ; RooProfileLL pll("pll","pll", nll,*myWorkspace->var("m")) ; RooPlot* myFrame = myWorkspace->var("m")->frame(-1,1) ;
>
>
RooNLLVar nll("nll","nll",*myWorkspace->pdf("g"),*myWorkspace->data("d")) ; RooProfileLL pll("pll","pll", nll,*myWorkspace->var("m")) ; RooPlot* myFrame = myWorkspace->var("m")->frame(-1,1) ;
  pll.plotOn(myFrame) ;
Changed:
<
<
myFrame->Draw()
>
>
myFrame->Draw()
 
Changed:
<
<

>
>

 

RooStats

Line: 426 to 411
  The main goal is to standardise the interface for major statistical procedures so that they can work on an arbitrary RooFit model and dataset and handle any parameters of interest and nuisance parameters. Another goal is to implement most accepted techniques from frequentist, Bayesian and likelihood based approaches. A further goal is to provide utilities to perform combined measurements.
Deleted:
<
<
 

Example code: create a simple model using the RooFit Workspace Factory. Specify parts of the model using ModelConfig. Create a simple dataset. Complete a confidence interval test using the ProfileLikelihoodCalculator of RooStats

Changed:
<
<

>
>

 { // In this script, a simple model is created using the Workspace Factory in RooFit. // ModelConfig is used to specify the parts of the model necessary for the statistical tools of RooStats. // A 95% confidence interval test is run using the ProfileLikelihoodCalculator of RooStats.

// Define a RooFit random seed in order to produce reproducible results.

Changed:
<
<
RooRandom::randomGenerator()->SetSeed(271);
>
>
RooRandom::randomGenerator()->SetSeed(271);
  // Make a simple model using the Workspace Factory. // Create a new workspace. RooWorkspace* myWorkspace = new RooWorkspace(); // Create the PDF G(x|mu,1) and the variables x, mu and sigma in one command using the factory syntax.
Changed:
<
<
myWorkspace->factory("Gaussian::normal(x[-10,10], mu[-1,1], sigma[1])");
>
>
myWorkspace->factory("Gaussian::normal(x[-10,10], mu[-1,1], sigma[1])");
  // Define parameter sets for observables and parameters of interest.
Changed:
<
<
myWorkspace->defineSet("poi","mu"); myWorkspace->defineSet("obs","x");
>
>
myWorkspace->defineSet("poi","mu"); myWorkspace->defineSet("obs","x");
  // Print the workspace contents.
Changed:
<
<
myWorkspace->Print() ;
>
>
myWorkspace->Print() ;
  // Specify for the statistical tools the components of the defined model. // Create a new ModelConfig. ModelConfig* myModelConfig = new ModelConfig("my G(x|mu,1)"); // Specify the workspace.
Changed:
<
<
myModelConfig->SetWorkspace(*myWorkspace);
>
>
myModelConfig->SetWorkspace(*myWorkspace);
  // Specify the PDF.
Changed:
<
<
myModelConfig->SetPdf(*myWorkspace->pdf("normal"));
>
>
myModelConfig->SetPdf(*myWorkspace->pdf("normal"));
  // Specify the parameters of interest.
Changed:
<
<
myModelConfig->SetParametersOfInterest(*myWorkspace->set("poi"));
>
>
myModelConfig->SetParametersOfInterest(*myWorkspace->set("poi"));
  // Specify the observables.
Changed:
<
<
myModelConfig->SetObservables(*myWorkspace->set("obs"));
>
>
myModelConfig->SetObservables(*myWorkspace->set("obs"));
  // Create a toy dataset. // Create a toy dataset of 100 measurements of the observables (x).
Changed:
<
<
RooDataSet* myData = myWorkspace->pdf("normal")->generate(*myWorkspace->set("obs"), 100); //myData->print();
>
>
RooDataSet* myData = myWorkspace->pdf("normal")->generate(*myWorkspace->set("obs"), 100); //myData->print();
  // Use the ProfileLikelihoodCalculator to obtain a 95% confidence interval. // Specify the confidence level required. double confidenceLevel = 0.95;
Line: 473 to 457
  LikelihoodInterval* myProfileLikelihoodInterval = myProfileLikelihoodCalculator.GetInterval(); // Use this interval result. In this case, it makes sense to say what the lower and upper limits are. // Define the object variables for the purposes of the confidence interval.
Changed:
<
<
RooRealVar* x = myWorkspace->var("x"); RooRealVar* mu = myWorkspace->var("mu"); cout << "The profile likelihood calculator interval is [ "<< myProfileLikelihoodInterval->LowerLimit(*mu) << ", " << myProfileLikelihoodInterval->UpperLimit(*mu) << "] " << endl;
>
>
RooRealVar* x = myWorkspace->var("x"); RooRealVar* mu = myWorkspace->var("mu"); cout << "The profile likelihood calculator interval is [ "<< myProfileLikelihoodInterval->LowerLimit(*mu) << ", " << myProfileLikelihoodInterval->UpperLimit(*mu) << "] " << endl;
  // Set mu equal to zero.
Changed:
<
<
mu->setVal(0);
>
>
mu->setVal(0);
  // Is mu in the interval?
Changed:
<
<
cout << "Is mu = 0 in the interval?" << endl; if (myProfileLikelihoodInterval->IsInInterval(*mu) == 1){ cout << "Yes" << endl;
>
>
cout << "Is mu = 0 in the interval?" << endl; if (myProfileLikelihoodInterval->IsInInterval(*mu) == 1){ cout << "Yes" << endl;
  } else{
Changed:
<
<
cout << "No" << endl;
>
>
cout << "No" << endl;
  } }
Changed:
<
<

>
>

 

ModelConfig

Changed:
<
<
The ModelConfig RooStats class encapsulates the configuration of a model to define a particular hypothesis. It is now used extensively by the calculator tools. ModelConfig always contains a reference to an external workspace that manages all of the objects that are a part of the model (PDFs and parameter sets). So, in order to use ModelConfig, the user must specify a workspace pointer before creating the various objects of the model.
>
>
The ModelConfig RooStats class encapsulates the configuration of a model to define a particular hypothesis. It is now used extensively by the calculator tools. ModelConfig always contains a reference to an external workspace that manages all of the objects that are a part of the model (PDFs and parameter sets). So, in order to use ModelConfig, the user must specify a workspace pointer before creating the various objects of the model.
 

HistFactory

Line: 512 to 495
 

hist2workspace

Changed:
<
<
The hist2workspace executable is used in the following manner: hist2workspace input.xml
>
>
The hist2workspace executable is used in the following manner: hist2workspace input.xml
 

XML files

Line: 543 to 525
  There are several parts in the top-level XML file. Initially, the XML schema is specified. The output is specified next. Specifically, the prefix for any output files (ROOT files containing workspaces) is given. Now, the channel XML files are given for each measurement (e.g., signal, background, systematics information) using the Input tag. Next, the details for each measurement are defined using the Measurement tag. Which bins to use are specified inclusively using the Measurement tag attributes BinLow and BinHigh. Within the Measurement tag, the POI tag is used to specify the parameter of interest for the measurement (SigXsecOverSM in the example file).
Deleted:
<
<
 
Example file: $ROOTSYS/tutorials/histfactory/example.xml
Changed:
<
<


>
>

 
<!DOCTYPE Combination  SYSTEM 'HistFactorySchema.dtd'>

<Combination OutputFilePrefix="./results/example" Mode="comb" >
Line: 578 to 559
 
Changed:
<
<
>
>
 

Channel XML files

Line: 610 to 591
 HistoNameHigh="myShapeSystematic_1_high" HistoNameLow="myShapeSystematic_1_low" />
Deleted:
<
<
 
Example file: $ROOTSYS/tutorials/histfactory/example_channel.xml
Changed:
<
<


>
>

 
<!DOCTYPE Channel  SYSTEM 'HistFactorySchema.dtd'>

  <Channel Name="channel1" InputFile="./data/example.root" HistoName="" >
Line: 630 to 610
 
Changed:
<
<
>
>
 

Analysis!

Line: 658 to 638
  HistFactory manual
Changed:
<
<
E-group for support with ATLAS-sensitive information: atlas-phys-stat-root@cern.ch
>
>
E-group for support with ATLAS-sensitive information: atlas-phys-stat-root@cern.ch
 
Changed:
<
<
E-mail support for software issues, bugs etc.: roostats-development@cern.ch
>
>
E-mail support for software issues, bugs etc.: roostats-development@cern.ch

Other links

Exotics Working Group statistics tutorial XML reference

Exotics Working Group statistics tutorial workspace examples

XML example

 

Revision 252011-08-28 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- WilliamBreadenMadden - 2011-08-19
>
>
-- WilliamBreadenMadden - 2011-08-28
 
Line: 632 to 632
 
Added:
>
>

Analysis!

ATLAS recommends the use of the profile likelihood as a test statistic.
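
For orientation, the profile likelihood ratio is usually written as lambda(mu) = L(mu, theta-hat-hat(mu)) / L(mu-hat, theta-hat), where the nuisance parameters theta are re-fitted for each tested value of the parameter of interest mu. The following is a minimal standalone C++ sketch (no ROOT required) for a toy case chosen here purely for illustration: a Gaussian sample with the mean as the parameter of interest and the width as the only nuisance parameter. The function names are hypothetical and are not part of RooStats.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sum of squared deviations of the data from a hypothesised mean mu.
double sumSquares(const std::vector<double>& data, double mu) {
    double s = 0.0;
    for (double x : data) s += (x - mu) * (x - mu);
    return s;
}

// -2 ln lambda(mu) for a Gaussian sample with unknown width.
// The width is the nuisance parameter: it is re-fitted ("profiled") for each mu.
// Assumes at least two data points that are not all identical.
double profileTestStatistic(const std::vector<double>& data, double mu) {
    assert(data.size() >= 2);
    double mean = 0.0;
    for (double x : data) mean += x;
    mean /= data.size();
    const double n = static_cast<double>(data.size());
    // Conditional fit: sigma^2(mu) = S(mu)/n. Global fit: sigma^2 = S(mean)/n.
    // The constant terms cancel in the ratio, leaving n * ln(S(mu)/S(mean)).
    return n * std::log(sumSquares(data, mu) / sumSquares(data, mean));
}
```

The statistic is zero at the best-fit mean and grows as the tested mean moves away from it. In a real analysis the ProfileLikelihoodCalculator performs the equivalent fits numerically on the full model.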

 

Further information

ROOT links:

Revision 242011-08-19 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- WilliamBreadenMadden - 2011-08-18
>
>
-- WilliamBreadenMadden - 2011-08-19
 
Line: 24 to 24
  There are instructions on how to use the different versions of ROOT at Glasgow here.
Added:
>
>
To check what version of ROOT is running, use the following command:

root -v -b

 

Setting up RooStats

There are three main options available for acquiring ROOT with RooStats included.

Line: 509 to 515
 The hist2workspace executable is used in the following manner: hist2workspace input.xml
Changed:
<
<

Top-Level XML file

>
>

XML files

 

General description

Added:
>
>
A minimum of two XML files are required for configuration. The "top-level" XML file defines the measurement and contains a list of channels that contribute to this measurement. "channel" XML files are used to describe each channel in detail. For each contributing channel, there is a separate XML file.

Top-Level XML file

General description
 This file specifies a top level 'Combination' that is composed of:
  • several 'Channels', which are described in separate XML files.
  • several 'Measurements' (corresponding to a full fit of the model), each of which specifies
Line: 527 to 539
 
    • whether the tool should export the model only and skip the default fit.
Changed:
<
<

Specific instructions

>
>
Specific instructions
  There are several parts in the top-level XML file. Initially, the XML schema is specified. The output is specified next. Specifically, the prefix for any output files (ROOT files containing workspaces) is given. Now, the channel XML files are given for each measurement (e.g., signal, background, systematics information) using the Input tag. Next, the details for each measurement are defined using the Measurement tag. Which bins to use are specified inclusively using the Measurement tag attributes BinLow and BinHigh. Within the Measurement tag, the POI tag is used to specify the parameter of interest for the measurement (SigXsecOverSM in the example file).

Changed:
<
<

Example file: $ROOTSYS/tutorials/histfactory/example.xml

>
>
Example file: $ROOTSYS/tutorials/histfactory/example.xml
 

<!DOCTYPE Combination  SYSTEM 'HistFactorySchema.dtd'>
Line: 568 to 580
 
Changed:
<
<

Channel XML files

>
>

Channel XML files

 
Changed:
<
<

General description

>
>
General description
  These files specify for each channel
  • observed data (if absent, the tool will use the expectation, which is useful for expected sensitivity)
Line: 586 to 598
 
      • a name (which can be shared with the OverallSyst if correlated).
      • +/- 1 sigma variational histograms.
Changed:
<
<

Specific instructions

>
>
Specific instructions

First, the channel file specifies the XML schema. Then, the channel is defined and named. The location of the data histogram for the channel is defined. For each background, a Sample is defined. It is specified whether it is normalised to luminosity (i.e., the histograms should be per inverse picobarn and will be scaled). To enable normalisation to luminosity, set the tag attribute NormalizeByTheory to "True". For external normalisation, such as a data-driven background measurement that is already fixed to the luminosity of the dataset, set the tag attribute NormalizeByTheory to "False". For the normalisation factor (e.g., "SigXsecOverSM"), the tag attribute "Name" should match the POI specified in the top-level XML configuration file.

Systematic uncertainties
 
Changed:
<
<
* text *
>
>
For an overall relative rate systematic, the "OverallSys" tag is used with its appropriate tag attributes. For a shape systematic (a systematic that affects the shape of a histogram), the "HistoSys" tag is used with its appropriate tag attributes. Specifically, for the HistoSys tag attributes "HistoNameHigh" and "HistoNameLow", the respective histograms for the upper and lower shape systematic uncertainties are specified in a manner such as the following:

<HistoSys Name="myShapeSystematic_1" HistoNameHigh="myShapeSystematic_1_high" HistoNameLow="myShapeSystematic_1_low" />
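
To illustrate what the high and low histograms of a shape systematic are for, the following is a standalone C++ sketch of a piecewise-linear interpolation between the nominal bin contents and the +/- 1 sigma variations, steered by a nuisance parameter alpha. That the interpolation is piecewise linear is an assumption made here for illustration and is not taken from the HistFactory documentation; the function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: interpolate bin contents between the nominal histogram
// and its +1 sigma ("high") and -1 sigma ("low") variations.
// alpha = 0 returns the nominal shape, alpha = +1 the high shape,
// alpha = -1 the low shape.
std::vector<double> interpolateShape(const std::vector<double>& nominal,
                                     const std::vector<double>& high,
                                     const std::vector<double>& low,
                                     double alpha) {
    assert(nominal.size() == high.size() && nominal.size() == low.size());
    std::vector<double> result(nominal.size());
    for (std::size_t i = 0; i < nominal.size(); ++i) {
        // Use the upward variation for positive alpha, the downward one otherwise.
        const double delta = (alpha >= 0.0) ? (high[i] - nominal[i])
                                            : (nominal[i] - low[i]);
        result[i] = nominal[i] + alpha * delta;
    }
    return result;
}
```

In the fit, alpha would be a constrained nuisance parameter, so the data can pull the shape between the low and high templates.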
 
Changed:
<
<

Example file: $ROOTSYS/tutorials/histfactory/example_channel.xml

>
>
Example file: $ROOTSYS/tutorials/histfactory/example_channel.xml
 

<!DOCTYPE Channel  SYSTEM 'HistFactorySchema.dtd'>

Revision 232011-08-18 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- WilliamBreadenMadden - 2011-08-16
>
>
-- WilliamBreadenMadden - 2011-08-18
 
Line: 178 to 178
 

Added:
>
>

Datasets

General description

In RooFit, data can be stored in an unbinned or binned manner. Unbinned data is stored using the RooDataSet class while binned data is stored using the RooDataHist class. The user can define how many bins there are in a variable. For the purposes of plotting using the RooPlot class, a RooDataSet is binned into a histogram.

In general, working in RooFit with binned and unbinned data is very similar, as both class RooDataSet (for unbinned data) and class RooDataHist (for binned data) inherit from a common base class, RooAbsData, which defines the interface for a generic abstract data sample. With few exceptions, all RooFit methods take abstract datasets as input arguments, allowing for the interchangeable use of binned and unbinned data.

RooDataSet (unbinned data)

Example code: generating toy Monte Carlo, storing it as unbinned data and then plotting it

{
// Create a RooDataSet and fill it with generated toy Monte Carlo data:
RooDataSet* myData = gauss.generate(x, 100);
// Plot the dataset. myData and myFrame are pointers, so use "->".
RooPlot* myFrame = x.frame();
myData->plotOn(myFrame);
myFrame->Draw();
}

Plotting unbinned data is similar to plotting binned data with the exception that one can show it in some preferred binning.

Example code: plotting unbinned data (a RooDataSet) using a specified binning

RooPlot* myFrame = x.frame(); myData->plotOn(myFrame, Binning(25)); myFrame->Draw();

Importing data from ROOT trees (how to populate RooDataSets from TTrees)

* Put stuff here, Will. *

RooDataHist (binned data)

Importing data from ROOT TH histogram objects (take a histogram and map it to a binned data set) (how to populate RooDataHists from histograms)

In RooFit, binned data is represented by the RooDataHist class. The contents of a ROOT histogram can be imported into a RooDataHist object. In importing a ROOT histogram, the binning of the original histogram is imported as well. A RooDataHist associates the histogram with a RooFit variable object of type RooRealVar. In this way it is always known what kind of data is stored in the histogram.

In displaying the data, RooFit, by default, shows the 68% confidence interval for Poisson statistics.
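
As a rough illustration of what a 68% interval for Poisson statistics means, the following standalone C++ sketch computes a central interval on the Poisson mean for an observed bin count by bisection. Whether the default RooFit error bars use exactly this construction is an assumption; the function names are hypothetical and not part of RooFit.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// P(N <= k) for a Poisson distribution with mean mu (k >= 0).
double poissonCdf(int k, double mu) {
    double term = std::exp(-mu);  // i = 0 term
    double sum = term;
    for (int i = 1; i <= k; ++i) {
        term *= mu / i;
        sum += term;
    }
    return sum;
}

// Central interval on the Poisson mean for an observed count n.
// Each tail carries half of (1 - confidenceLevel).
std::pair<double, double> poissonInterval(int n, double confidenceLevel = 0.6827) {
    assert(n >= 0 && confidenceLevel > 0.0 && confidenceLevel < 1.0);
    const double tail = 0.5 * (1.0 - confidenceLevel);

    // Lower limit: solve P(N >= n | mu) = tail (taken as 0 when n = 0).
    double lower = 0.0;
    if (n > 0) {
        double lo = 0.0, hi = n;
        for (int i = 0; i < 200; ++i) {
            const double mid = 0.5 * (lo + hi);
            if (1.0 - poissonCdf(n - 1, mid) < tail) lo = mid; else hi = mid;
        }
        lower = 0.5 * (lo + hi);
    }

    // Upper limit: solve P(N <= n | mu) = tail.
    double lo = n, hi = n + 10.0 * std::sqrt(n + 1.0) + 10.0;
    for (int i = 0; i < 200; ++i) {
        const double mid = 0.5 * (lo + hi);
        if (poissonCdf(n, mid) > tail) lo = mid; else hi = mid;
    }
    return {lower, 0.5 * (lo + hi)};
}
```

The resulting interval is asymmetric around the observed count, which is why error bars on low-occupancy bins look lopsided.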

Example code: import a ROOT histogram into a RooDataHist (a RooFit binned dataset)

{
// Access the file.
TFile* myFile = new TFile("myFile.root");
// Load the histogram.
TH1* myHistogram = (TH1*) myFile->Get("myHistogram");
// Draw the loaded histogram.
myHistogram->Draw();
// Declare an observable x.
RooRealVar x("x", "x", -1, 2);
// Create a binned dataset that imports the contents of TH1 and associates its contents to observable 'x'.
RooDataHist myData("myData", "myData", RooArgList(x), myHistogram);
// Plot the imported dataset.
RooPlot* myFrame = x.frame();
myData.plotOn(myFrame);
myFrame->Draw();
}

Fitting

Fitting a model to data

Fitting a model to data can be done in many ways. The most common methods are the χ2 fit and the -log(L) fit. The default fitting method in ROOT is the χ2 method, while the default method in RooFit is the -log(L) method. The -log(L) method is often preferred because it is more robust for low statistics fits and because it can also be performed on unbinned data.
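
To make the -log(L) method concrete, the following standalone C++ sketch evaluates the negative log-likelihood of an unbinned sample under a Gaussian and finds the best mean with a crude scan. It is only an illustration of the quantity being minimised; RooFit uses a proper minimiser, and the function names here are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Negative log-likelihood of an unbinned sample under a Gaussian(mean, sigma).
double gaussianNll(const std::vector<double>& data, double mean, double sigma) {
    assert(sigma > 0.0);
    const double pi = std::acos(-1.0);
    double nll = 0.0;
    for (double x : data) {
        const double z = (x - mean) / sigma;
        // -ln of the Gaussian density, summed over events.
        nll += 0.5 * z * z + std::log(sigma * std::sqrt(2.0 * pi));
    }
    return nll;
}

// Crude one-dimensional scan for the mean that minimises the NLL at fixed sigma.
double scanBestMean(const std::vector<double>& data, double sigma,
                    double low, double high, int steps) {
    assert(steps > 0 && high > low);
    double bestMean = low;
    double bestNll = gaussianNll(data, low, sigma);
    for (int i = 1; i <= steps; ++i) {
        const double mean = low + (high - low) * i / steps;
        const double nll = gaussianNll(data, mean, sigma);
        if (nll < bestNll) {
            bestNll = nll;
            bestMean = mean;
        }
    }
    return bestMean;
}
```

Because every event enters the sum individually, no binning is needed, which is the property that makes this method usable on unbinned data.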

Fitting a PDF to unbinned data

Example code: fit a Gaussian PDF to data

// Fit gauss to unbinned data gauss.fitTo(*myData);

 

The RooFit workspace

General description

Line: 326 to 414
 

RooStats

Changed:
<
<

General description
>
>

General description

  RooStats provides tools for high-level statistics questions in ROOT. It builds on RooFit, which provides basic building blocks for statistical questions.

The main goal is to standardise the interface for major statistical procedures so that they can work on an arbitrary RooFit model and dataset and handle any parameters of interest and nuisance parameters. Another goal is to implement most accepted techniques from frequentist, Bayesian and likelihood based approaches. A further goal is to provide utilities to perform combined measurements.

Deleted:
<
<

Datasets

In RooFit, data can be stored in an unbinned or binned manner. Unbinned data is stored using the RooDataSet class while binned data is stored using the RooDataHist class. The user can define how many bins there are in a variable. For the purposes of plotting using the RooPlot class, a RooDataSet is binned into a histogram.

Example code: generating toy Monte Carlo, storing it as unbinned data and then plotting it

} // Create a RooDataSet and fill it with generated toy Monte Carlo data: RooDataSet* myData = gauss.generate(x, 100); // Plot the dataset. RooPlot* myFrame = x.frame(); myData.plotOn()(myFrame); myFrame.Draw(); }

Importing data (how to populate datasets from histograms and TTrees)

Importing data from ROOT trees

* Put stuff here, Will. *

Importing data from ROOT TH histogram objects (take a histogram and map it to a binned data set)

Example code: import a ROOT histogram into a RooDataHist (a RooFit binned dataset)

{ // Access the file. TFile* myFile = new TFile("myFile.root"); // Load the histogram. TH1* myHistogram = myFile.Get("myHistogram"); // Draw the loaded histogram. myHistogram.Draw(); // Declare an observable x. RooRealVar x("x", "x", -1, 2); // Create a binned dataset that imports the contents of TH1 and associates its contents to observable 'x'. RooDataHist myData("myData", "myData", RooArgList(x), myHistogram); // Plot the imported dataset. RooPlot* myFrame = x.frame(); myData.plotOn(myFrame) myFrame.Draw() }

Fitting

Fitting a PDF to unbinned data

Example code: fit a Gaussian PDF to data

// Fit gauss to unbinned data gauss.fitTo(*myData);

 

Example code: create a simple model using the RooFit Workspace Factory. Specify parts of the model using ModelConfig. Create a simple dataset. Complete a confidence interval test using the ProfileLikelihoodCalculator of RooStats

Revision 222011-08-16 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- WilliamBreadenMadden - 2011-08-15
>
>
-- WilliamBreadenMadden - 2011-08-16
 
Line: 467 to 467
 

HistFactory

Added:
>
>

General description

HistFactory can be used to run an analysis without being a RooFit/RooStats expert. In a nutshell, ROOT files containing input histograms are set up and XML files are set up for those input ROOT files. The XML files specify details on the histograms and specify how RooFit should interpret the information in the files. A small executable contained in ROOT called hist2workspace is used to import the histograms into a RooFit workspace appropriately.

 

prepareHistFactory

The ROOT release ships with a script called prepareHistFactory and a binary file called hist2workspace in the $ROOTSYS/bin directory. prepareHistFactory prepares a working area. It creates results/, data/ and config/ directories. It copies the HistFactorySchema.dtd and example XML files into the config/ directory. Also, it copies a ROOT file into the data/ directory for use with the examples.

HistFactorySchema.dtd

Changed:
<
<
HistFactorySchema.dtd: This file is located in $ROOTSYS/etc/ specifies the XML schema. It is typically placed in the config/ direc-tory of a working area together with the top-level XML file and the individual channel XML files. The user should not modify this file. The HistFactorySchema.dtd is commented to specify exactly the meaning of the various options.
>
>
HistFactorySchema.dtd: This file, located in $ROOTSYS/etc/, specifies the XML schema. It is typically placed in the config/ directory of a working area together with the top-level XML file and the individual channel XML files. The user should not modify this file. HistFactorySchema.dtd is commented to specify exactly the meaning of the various options.
 

hist2workspace

Revision 212011-08-15 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
-- WilliamBreadenMadden - 2011-08-15
Line: 226 to 226
  RooAbsPdf* myPDF = myWorkspace.pdf("g"); // Import the data saved as d. RooAbsData* myData = myWorkspace.data("d");
Added:
>
>
// Import the ModelConfig saved as m. ModelConfig* myModelConfig = (ModelConfig*) myWorkspace.obj("m");
 

Revision 202011-08-15 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- WilliamBreadenMadden - 2011-08-11
>
>
-- WilliamBreadenMadden - 2011-08-15
 
Line: 122 to 122
 
list of space points RooAbsData
integral RooRealIntegral
Changed:
<
<
Composite functions correspond to composite objects.
>
>
Composite functions correspond to composite objects. The RooArgList class preserves the order of its arguments while the RooArgSet class does not depend on argument order.
 

Example code: defining a RooFit variable

Revision 192011-08-15 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
-- WilliamBreadenMadden - 2011-08-11
Line: 255 to 255
  // RooGaussian::G[ x=x mean=mu sigma=sigma ] = 1 // sigma.Print() // RooRealVar::sigma = 5 L(0 - 20)
Added:
>
>
// Print the PDF contents in a detailed manner. myGaussianPDF.Print("verbose");
 // Print the PDF contents to stdout. myGaussianPDF.Print("t"); // Example output:

Revision 182011-08-11 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
-- WilliamBreadenMadden - 2011-08-11
Line: 496 to 496
 

Specific instructions

Changed:
<
<
Specify which bins to use inclusively using the Measurement tag attributes BinLow and BinHigh.
>
>
There are several parts in the top-level XML file. Initially, the XML schema is specified. The output is specified next. Specifically, the prefix for any output files (ROOT files containing workspaces) is given. Now, the channel XML files are given for each measurement (e.g., signal, background, systematics information) using the Input tag. Next, the details for each measurement are defined using the Measurement tag. Which bins to use are specified inclusively using the Measurement tag attributes BinLow and BinHigh. Within the Measurement tag, the POI tag is used to specify the parameter of interest for the measurement (SigXsecOverSM in the example file).
 

Example file: $ROOTSYS/tutorials/histfactory/example.xml

Line: 553 to 553
 
      • a name (which can be shared with the OverallSyst if correlated).
      • +/- 1 sigma variational histograms.
Added:
>
>

Specific instructions

* text *

 

Example file: $ROOTSYS/tutorials/histfactory/example_channel.xml



Revision 172011-08-11 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
-- WilliamBreadenMadden - 2011-08-11
Line: 6 to 6
 

Higgs analysis at ATLAS using RooStats

Added:
>
>
* THIS PAGE IS UNDER CONSTRUCTION *
 This page contains basic information on getting started with Higgs analysis at ATLAS using RooStats.

A note on code and formatting

Changed:
<
<
Example code is generally indented and given highlighting. Explanatory code and text directly relating to it has yellow highlighting while example code that one might run directly in ROOT is given a green-text-on-black-background console style. Scripts are given grey highlighting. So, in a nutshell, code segments are in yellow, while full scripts are in terminal. In code examples given, user-created objects are generally prefaced with "my", for example, "myData" for the purposes of clarity.
>
>
Example code is generally indented and given highlighting. Explanatory code and text directly relating to it has yellow highlighting while example code that one might run directly in ROOT is given a classical green-text-on-black-background console style. Scripts are given grey highlighting. File contents are given the golden verbatim colouring. So, in a nutshell, code segments are in yellow, full ROOT scripts are in terminal, general scripts are in grey and file contents are in gold. In code examples given, user-created objects are generally prefaced with "my", for example, "myData", for the purposes of clarity.
 

What is RooStats?

Line: 22 to 24
  There are instructions on how to use the different versions of ROOT at Glasgow here.
Changed:
<
<

Personally acquiring RooStats

>
>

Setting up RooStats

  There are three main options available for acquiring ROOT with RooStats included.
Line: 32 to 34
 

Option 2: Build the ROOT trunk from source.

Changed:
<
<
ftp://root.cern.ch/root/root_v5.27.06.source.tar.gz

Follow the appropriate instructions here to build it.

>
>
Follow the appropriate instructions here to build the ROOT trunk.
 

Shell script: building ROOT with RooFit and RooStats

Line: 104 to 104
 

RooFit

Added:
>
>

General description

 The RooFit library provides a toolkit for modelling the expected distribution of events in a physics analysis. Models can be used to perform unbinned maximum likelihood fits, produce plots and generate "toy Monte Carlo" samples for various studies.

Changed:
<
<
The core functionality of RooFit is to enable the modelling of 'event data' distributions, where each event is a discrete occurrence in time, and has one or more measured observables associated with it. Experiments of this nature result in datasets obeying Poisson (or binomial) statistics. The natural modeling language for such distributions are probability density functions F(x;p) that describe the probability density the distribution of observables x in terms of function in parameter p.
>
>
The core functionality of RooFit is to enable the modelling of 'event data' distributions, where each event is a discrete occurrence in time and has one or more measured observables associated with it. Experiments of this nature result in datasets obeying Poisson (or binomial) statistics. The natural modelling language for such distributions is that of probability density functions (PDFs), F(x;p), which describe the probability density of the observables x in terms of the parameters p.
  In RooFit, every variable, data point, function and PDF is represented in a C++ object. So, for example, in constructing a RooFit model, the mathematical components of the model map to separate C++ objects. Objects are classified by the data or function type that they represent, not by their respective role in a particular setup. All objects are self-documenting. The name of an object is a unique identifier for the object while the title of an object is a more elaborate description of the object.
Line: 120 to 122
 
list of space points RooAbsData
integral RooRealIntegral
Added:
>
>
Composite functions correspond to composite objects.
 

Example code: defining a RooFit variable

Line: 130 to 134
 

Added:
>
>

RooPlot

 A RooPlot is essentially an empty frame that is capable of holding anything plotted versus its variable.
Changed:
<
<

PDFs

>
>

PDFs

 
Changed:
<
<
One of the things that makes the RooFit PDFs nice and flexible (but perhaps counterintuitive) is that they have no idea what is considered the observable. For example, a Gaussian has x, the mean and sigma while the PDF is always normalised to unity. What is one doing performing the integration on in order to get 1? For the Gaussian, it is x, by convention; the mean and sigma are parameters of teh model. RooFit doesn't know that x is not special; x, the mean and sigma are all on equal footing. You can tell it that x is the variable to normalise over.
>
>
One of the things that makes the RooFit PDFs nice and flexible (but perhaps counterintuitive) is that they have no idea what is considered the observable. For example, a Gaussian has x, the mean and sigma while the PDF is always normalised to unity. What is one integrating in order to get 1? For the Gaussian, it is x, by convention; the mean and sigma are parameters of the model. RooFit doesn't know that x is special; x, the mean and sigma are all on equal footing. You can tell RooFit that x is the variable to normalise over.
  So, RooGaussian has no intrinsic notion of distinction between observables and parameters. The choice of observables (for unit normalisation) is always passed to gauss.getVal().

What is the value of this? In a nutshell, it allows one to do Bayesian stuff very easily.

Changed:
<
<
Bayes' theorem holds that the probability of b given a is related to the probability of a given b. Normally, one might say that the mean of a Gaussian is 1 and that then gives a distribution for x. However, if one had a dataset for x, one might want to know the posterior for the mean. The RooFit ability to switch who it is a probability density function of is very useful for Bayesian stuff. Because RooFit allows this composition, the thing that acts as x in the Gaussian could actually be a function of, say, x^2. You picked up a Jacobian factor in going from x to x^2, so the normalisation no longer makes sense. Sometimes RooFit can figure out what the Jacobian factor should be and sometimes it resorts to numerical integration.
>
>
Bayes' theorem relates the probability of b given a to the probability of a given b. Normally, one might say that the mean of a Gaussian is 1, which then gives a distribution for x. However, given a dataset for x, one might want to know the posterior for the mean. RooFit's ability to change which variable a function is treated as a probability density of is very useful for Bayesian calculations. Because RooFit allows this composition, the thing that acts as x in the Gaussian could actually be a function of x, say, x^2. A Jacobian factor is picked up in going from x to x^2, so the original normalisation no longer holds. Sometimes RooFit can figure out what the Jacobian factor should be and sometimes it resorts to numerical integration.
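
The idea that a Gaussian shape only becomes a PDF once a normalisation variable is chosen can be sketched in standalone C++ as follows. This is an illustration of the concept only, not of how RooFit is implemented; the function names are hypothetical, and the integration is a simple trapezoidal rule over a finite range.

```cpp
#include <cassert>
#include <cmath>

// Unnormalised Gaussian shape: x, mean and sigma are on an equal footing.
double gaussShape(double x, double mean, double sigma) {
    const double z = (x - mean) / sigma;
    return std::exp(-0.5 * z * z);
}

// Normalise over x on [xMin, xMax] (trapezoidal rule), treating mean and sigma
// as parameters. The result is a probability density in x.
double normalisedOverX(double x, double mean, double sigma,
                       double xMin, double xMax, int steps = 10000) {
    assert(steps > 0 && xMax > xMin);
    const double h = (xMax - xMin) / steps;
    double integral = 0.5 * (gaussShape(xMin, mean, sigma) + gaussShape(xMax, mean, sigma));
    for (int i = 1; i < steps; ++i) {
        integral += gaussShape(xMin + i * h, mean, sigma);
    }
    integral *= h;
    return gaussShape(x, mean, sigma) / integral;
}
```

The same shape could equally be integrated over the mean at fixed x, which is the switch of roles that a posterior for the mean requires.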
 
Changed:
<
<

Example code: create a Gaussian PDF using RooStats and plot it using the RooPlot class

>
>

Example code: create a Gaussian PDF using RooStats and plot it using the RooPlot class

 

{ // Build a Gaussian PDF.

Line: 162 to 168
 

Changed:
<
<

Example code: telling a RooFit PDF what to normalise over

>
>

Example code: telling a RooFit PDF what to normalise over

 

Not normalised (i.e., this is not a PDF): gauss.getVal();

Line: 172 to 178
 

Changed:
<
<
Composite functions correspond to composite objects.
>
>

The RooFit workspace

 
Changed:
<
<

The RooFit workspace

>
>

General description

  The RooFit "workspace" provides the ability to store the full likelihood model, any desired priors and the minimal data necessary to reproduce the likelihood function in a ROOT file. Thus, the workspace is needed for combinations and has potential for digitally publishing results (PhyStats agreed to publish likelihood functions). The RooFit workspace can be used for this.
Changed:
<
<
One might create a Gaussian PDF and then import it into a workspace. All of the dependencies of the Gaussian shall be drawn in and owned by the workspace. There are no nightmarish ownership problems. Alternatively, one might simply create the Gaussian inside the workspace using the Workspace Factory.
>
>
One might create a Gaussian PDF and then import it into a workspace. All of the dependencies of the Gaussian are drawn in and owned by the workspace. There are no nightmarish ownership problems. Alternatively, one might simply create the Gaussian inside the workspace using the "Workspace Factory".
 
Changed:
<
<

Example code: using the Workspace Factory to create a Gaussian PDF

>
>

Example code: using the Workspace Factory to create a Gaussian PDF

 

// Create a Gaussian PDF using the Workspace Factory (this is essentially the shorthand for creating a Gaussian). RooWorkspace* myWorkspace = new RooWorkspace("myWorkspace");

Line: 192 to 198
 

What's in the RooFit workspace?

Changed:
<
<

Example code: What's in the workspace?

>
>

Example code: What's in the workspace?

 

// Open the appropriate ROOT file. root -l myFile.root

Line: 223 to 229
 

Changed:
<
<

Visual representations of the model/PDF contents

>
>

Visual representations of the model/PDF contents

 
Changed:
<
<

Graphviz

>
>
Graphviz
  Graphviz consists of a graph description language called the DOT language and a set of tools that can generate and/or process DOT files.

Changed:
<
<

Example code: examining PDFs and creating graphical representations of them

>
>
Example code: examining PDFs and creating graphical representations of them
 

// Create variables and a PDF using those variables. RooRealVar mu("mu", "mu", 150);

Line: 266 to 272
 

Changed:
<
<

Accessing the RooFit workspace

>
>

Accessing the RooFit workspace

 
Changed:
<
<

Example code: accessing the workspace

>
>
Example code: accessing the workspace
 

// Open the appropriate ROOT file. root -l BR5_MSSM_signal90_combined_datastat_model.root

Line: 294 to 300
 

Changed:
<
<

Example code: Using both data and PDF from file

>
>
Example code: accessing both data and PDF from a workspace stored in a file
 

// Note that the following code is independent of actual PDF in the file. So, for example, a full Higgs combination could work with identical code.

Line: 316 to 322
 

RooStats

Added:
>
>

General description
 RooStats provides tools for high-level statistics questions in ROOT. It builds on RooFit, which provides basic building blocks for statistical questions.

The main goal is to standardise the interface for major statistical procedures so that they can work on an arbitrary RooFit model and dataset and handle any parameters of interest and nuisance parameters. Another goal is to implement most accepted techniques from frequentist, Bayesian and likelihood based approaches. A further goal is to provide utilities to perform combined measurements.

Datasets

Changed:
<
<
In RooFit, data can be stored in an unbinned or binned manner. Unbinned data is stored using the RooDataSet class while binned data is stored using the RooDataHist class. The user can defined how many bins there are in a variable. For the purposes of plotting using the RooPlot class, a RooDataSet is binned into a histogram.
>
>
In RooFit, data can be stored in an unbinned or binned manner. Unbinned data is stored using the RooDataSet class while binned data is stored using the RooDataHist class. The user can define how many bins there are in a variable. For the purposes of plotting using the RooPlot class, a RooDataSet is binned into a histogram.
 
Changed:
<
<

Example code: generating toy Monte Carlo, storing it as unbinned data and then plotting it

>
>

Example code: generating toy Monte Carlo, storing it as unbinned data and then plotting it

 

} // Create a RooDataSet and fill it with generated toy Monte Carlo data:

Line: 338 to 346
 

Changed:
<
<

Importing data (how to populate datasets from histograms and TTrees

>
>

Importing data (how to populate datasets from histograms and TTrees)

 

Importing data from ROOT trees

Added:
>
>
* Put stuff here, Will. *
 

Importing data from ROOT TH histogram objects (take a histogram and map it to a binned data set)

Changed:
<
<

Fitting a PDF to unbinned data

>
>
Example code: import a ROOT histogram into a RooDataHist (a RooFit binned dataset)

{ // Access the file. TFile* myFile = new TFile("myFile.root"); // Load the histogram. TH1* myHistogram = myFile.Get("myHistogram"); // Draw the loaded histogram. myHistogram.Draw(); // Declare an observable x. RooRealVar x("x", "x", -1, 2); // Create a binned dataset that imports the contents of TH1 and associates its contents to observable 'x'. RooDataHist myData("myData", "myData", RooArgList(x), myHistogram); // Plot the imported dataset. RooPlot* myFrame = x.frame(); myData.plotOn(myFrame) myFrame.Draw() }

Fitting

Fitting a PDF to unbinned data

 
Changed:
<
<

Example code: generating toy Monte Carlo, storing it as unbinned data and then plotting it

>
>
Example code: fit a Gaussian PDF to data
 

// Fit gauss to unbinned data gauss.fitTo(*myData);

Line: 355 to 392
 

Changed:
<
<

Example code: create a simple model using the RooFit Workspace Factory. Specify parts of the model using ModelConfig. Create a simple dataset. Complete a confidence interval test using the ProfileLikelihoodCalculator of RooStats

>
>

Example code: create a simple model using the RooFit Workspace Factory. Specify parts of the model using ModelConfig. Create a simple dataset. Complete a confidence interval test using the ProfileLikelihoodCalculator of RooStats

 

{ // In this script, a simple model is created using the Workspace Factory in RooFit.

Line: 441 to 478
 

Top-Level XML file

Changed:
<
<
Example: $ROOTSYS/tutorials/histfactory/example.xml
>
>

General description

  This file specifies a top level 'Combination' that is composed of:
  • several 'Channels', which are described in separate XML files.
Line: 457 to 494
 
    • whether the tool should export the model only and skip the default fit.
Added:
>
>

Specific instructions

Specify which bins to use inclusively using the Measurement tag attributes BinLow and BinHigh.

Example file: $ROOTSYS/tutorials/histfactory/example.xml


<!DOCTYPE Combination  SYSTEM 'HistFactorySchema.dtd'>

<Combination OutputFilePrefix="./results/example" Mode="comb" >

  <Input>./config/example_channel.xml</Input>

  <Measurement Name="GaussExample" Lumi="1." LumiRelErr="0.1" BinLow="0" BinHigh="2" Mode="comb" >
    <POI>SigXsecOverSM</POI>
    <ParamSetting Const="True">Lumi alpha_syst1</ParamSetting>
    <!-- don't need <ConstraintTerm> default is Gaussian-->
  </Measurement>

  <Measurement Name="GammaExample" Lumi="1." LumiRelErr="0.1" BinLow="0" BinHigh="2" Mode="comb" >
    <POI>SigXsecOverSM</POI>
    <ParamSetting Const="True">Lumi alpha_syst1</ParamSetting>
    <ConstraintTerm Type="Gamma" RelativeUncertainty=".3">syst2</ConstraintTerm>
  </Measurement>

  <Measurement Name="LogNormExample" Lumi="1." LumiRelErr="0.1" BinLow="0" BinHigh="2" Mode="comb" >
    <POI>SigXsecOverSM</POI>
    <ParamSetting Const="True">Lumi alpha_syst1</ParamSetting>
    <ConstraintTerm Type="LogNormal" RelativeUncertainty=".3">syst2</ConstraintTerm>
  </Measurement>

  <Measurement Name="ConstExample" Lumi="1." LumiRelErr="0.1" BinLow="0" BinHigh="2" Mode="comb" ExportOnly="True">
    <POI>SigXsecOverSM</POI>
    <ParamSetting Const="True">Lumi alpha_syst1</ParamSetting>
  </Measurement>


</Combination>
 

Channel XML files

Changed:
<
<
Example: $ROOTSYS/tutorials/histfactory/example_channel.xml
>
>

General description

 
Changed:
<
<
This file specifies for each channel
>
>
These files specify for each channel
 
  • observed data (if absent, the tool will use the expectation, which is useful for expected sensitivity)
  • several 'Samples' (e.g., signal, bkg1, bkg2 etc.), each of which specifies
    • a name.
Line: 475 to 553
 
      • a name (which can be shared with the OverallSyst if correlated).
      • +/- 1 sigma variational histograms.
Added:
>
>

Example file: $ROOTSYS/tutorials/histfactory/example_channel.xml


<!DOCTYPE Channel  SYSTEM 'HistFactorySchema.dtd'>

  <Channel Name="channel1" InputFile="./data/example.root" HistoName="" >
    <Data HistoName="data" HistoPath="" />
    <Sample Name="signal" HistoPath="" HistoName="signal">
      <OverallSys Name="syst1" High="1.05" Low="0.95"/>
      <NormFactor Name="SigXsecOverSM" Val="1" Low="0." High="3." Const="True" />
    </Sample>
    <Sample Name="background1" HistoPath="" NormalizeByTheory="True" HistoName="background1">
      <OverallSys Name="syst2" Low="0.95" High="1.05"/>
    </Sample>
    <Sample Name="background2" HistoPath="" NormalizeByTheory="True" HistoName="background2">
      <OverallSys Name="syst3" Low="0.95" High="1.05"/>
      <!-- <HistoSys Name="syst4" HistoPathHigh="" HistoPathLow="histForSyst4"/>-->
    </Sample>
  </Channel>
 

Further information

Added:
>
>

ROOT links:

ROOT User's Guide

 

RooFit links

User's Manual

Line: 494 to 598
 HistFactory manual

E-group for support with ATLAS-sensitive information:

Changed:
<
<
<atlas-phys-stat-root@cern.ch>
>
>
atlas-phys-stat-root@cern.ch
  E-mail support for software issues, bugs etc.:
Changed:
<
<
<roostats-development@cern.ch>

ROOT links:

ROOT User's Guide

>
>
roostats-development@cern.ch
 

Revision 162011-08-11 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- WilliamBreadenMadden - 2011-08-10
>
>
-- WilliamBreadenMadden - 2011-08-11
 
Line: 419 to 419
 

Added:
>
>

ModelConfig

The ModelConfig RooStats class encapsulates the configuration of a model to define a particular hypothesis. It is now used extensively by the calculator tools. ModelConfig always contains a reference to an external workspace that manages all of the objects that are a part of the model (PDFs and parameter sets). So, in order to use ModelConfig, the user must specify a workspace pointer before creating the various objects of the model.

HistFactory

prepareHistFactory

The ROOT release ships with a script called prepareHistFactory and a binary file called hist2workspace in the $ROOTSYS/bin directories. prepareHistFactory prepares a working area. It creates results/, data/ and config/ directories. It copies the HistFactorySchema.dtd and example XML files into the config/ directory. Also, it copies a ROOT file into the data/ directory for use with the examples.

HistFactorySchema.dtd

HistFactorySchema.dtd: This file, located in $ROOTSYS/etc/, specifies the XML schema. It is typically placed in the config/ directory of a working area together with the top-level XML file and the individual channel XML files. The user should not modify this file. HistFactorySchema.dtd is commented to specify exactly the meaning of the various options.

hist2workspace

The hist2workspace executable is used in the following manner: hist2workspace input.xml

Top-Level XML file

Example: $ROOTSYS/tutorials/histfactory/example.xml

This file specifies a top level 'Combination' that is composed of:

  • several 'Channels', which are described in separate XML files.
  • several 'Measurements' (corresponding to a full fit of the model), each of which specifies
    • a name for this measurement to be used in tables and files.
    • the luminosity associated with the measurement in picobarns.
    • which bins of the histogram should be used.
    • what the relative uncertainty on the luminosity is.
    • what is(/are) the parameter(/s) of interest that will be measured.
    • which parameter(/s) should be fixed/floating (e.g., nuisance parameters)
    • which type of constraints are desired:
      • default: Gaussian
      • supported: Gamma, LogNormal, Uniform
    • whether the tool should export the model only and skip the default fit.

Channel XML files

Example: $ROOTSYS/tutorials/histfactory/example_channel.xml

This file specifies for each channel

  • observed data (if absent, the tool will use the expectation, which is useful for expected sensitivity)
  • several 'Samples' (e.g., signal, bkg1, bkg2 etc.), each of which specifies
    • a name.
    • whether the sample is normalized by theory (e.g., N = L * sigma) or whether the sample is data driven.
    • a nominal expectation histogram.
    • a named 'Normalization Factor' (which can be fixed or allowed to float in a fit).
    • several 'Overall Systematics' in normalization with
      • a name.
      • +/- 1 sigma variations (e.g., 1.05 and 0.95 for a 5% uncertainty).
    • several 'Histogram Systematics' in shape with
      • a name (which can be shared with the OverallSyst if correlated).
      • +/- 1 sigma variational histograms.
 

Further information

Changed:
<
<

RooFit links:

>
>

RooFit links

  User's Manual

Tutorials

Changed:
<
<

RooStats links:

>
>

RooStats links

  Wiki

Revision 152011-08-10 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- WilliamBreadenMadden - 2011-08-03
>
>
-- WilliamBreadenMadden - 2011-08-10
 
Line: 189 to 189
 

Added:
>
>

What's in the RooFit workspace?

 

Example code: What's in the workspace?

Line: 221 to 223
 

Added:
>
>

Visual representations of the model/PDF contents

Graphviz

Graphviz consists of a graph description language called the DOT language and a set of tools that can generate and/or process DOT files.

Example code: examining PDFs and creating graphical representations of them

// Create variables and a PDF using those variables (the observable x is assumed to have been defined already).
RooRealVar mu("mu", "mu", 150);
RooRealVar sigma("sigma", "sigma", 5, 0, 20);
RooGaussian myGaussianPDF("myGaussianPDF", "Gaussian PDF", x, mu, sigma);
// Create a Graphviz DOT file with a representation of the object tree.
myGaussianPDF.graphVizTree("myGaussianPDFTree.dot");
// The DOT file produced can be converted to a graphical representation.
// Convert the DOT file to a 'top-to-bottom graph' using UNIX commands:
// dot -Tgif -o myGaussianPDF_top-to-bottom_graph.gif myGaussianPDFTree.dot
// Convert the DOT file to a 'spring-model graph' using UNIX commands:
// fdp -Tgif -o myGaussianPDF_spring-model_graph.gif myGaussianPDFTree.dot
// Print the PDF contents.
myGaussianPDF.Print();
// Example output:
// RooGaussian::G[ x=x mean=mu sigma=sigma ] = 1
// sigma.Print()
// RooRealVar::sigma = 5 L(0 - 20)
// Print the PDF contents as a tree to stdout.
myGaussianPDF.Print("t");
// Example output:
// 0x166eab0 RooGaussian::G = 1 [Auto]
// 0x15f7fe0/V- RooRealVar::x = 150
// 0x1487090/V- RooRealVar::mu = 150
// 0x1487bc0/V- RooRealVar::sigma = 5
// Print the PDF contents to a file.
myGaussianPDF.printCompactTree("", "myGaussianPDFTree.txt");
// Example output file contents:
// 0x166eab0 RooGaussian::G = 1 [Auto]
// 0x15f7fe0/V- RooRealVar::x = 150
// 0x1487090/V- RooRealVar::mu = 150
// 0x1487bc0/V- RooRealVar::sigma = 5

Accessing the RooFit workspace

 

Example code: accessing the workspace

// Open the appropriate ROOT file. root -l BR5_MSSM_signal90_combined_datastat_model.root

Added:
>
>
// Alternatively, you could open the file in a manner such as the following:
TString myFileName = "BR5_MSSM_signal90_combined_datastat_model.root";
TFile* myFile = TFile::Open(myFileName);
 // Import the workspace.
Changed:
<
<
myWorkspace = (RooWorkspace*) _file0->Get("combined");
>
>
RooWorkspace* myWorkspace = (RooWorkspace*) _file0->Get("combined");
 // Print the workspace contents. myWorkspace->Print(); // Import the PDF.
Line: 243 to 293
 

Added:
>
>

Example code: Using both data and PDF from file

// Note that the following code is independent of the actual PDF in the file. So, for example, a full Higgs combination could work with identical code.
// Open a file and import the workspace.
TFile myFile("myResults.root");
RooWorkspace* myWorkspace = (RooWorkspace*) myFile.Get("myWorkspace");
// Plot the data and PDF.
RooPlot* xframe = myWorkspace->var("x")->frame();
myWorkspace->data("d")->plotOn(xframe);
myWorkspace->pdf("g")->plotOn(xframe);
// Construct a likelihood and profile likelihood.
RooNLLVar nll("nll", "nll", *myWorkspace->pdf("g"), *myWorkspace->data("d"));
RooProfileLL pll("pll", "pll", nll, *myWorkspace->var("m"));
RooPlot* myFrame = myWorkspace->var("m")->frame(-1, 1);
pll.plotOn(myFrame);
myFrame->Draw();

 

RooStats

RooStats provides tools for high-level statistics questions in ROOT. It builds on RooFit, which provides basic building blocks for statistical questions.

Revision 142011-08-05 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
-- WilliamBreadenMadden - 2011-08-03
Line: 174 to 174
  Composite functions correspond to composite objects.
Changed:
<
<

The RooFit workspace

>
>

The RooFit workspace

  The RooFit "workspace" provides the ability to store the full likelihood model, any desired priors and the minimal data necessary to reproduce the likelihood function in a ROOT file. Thus, the workspace is needed for combinations and has potential for digitally publishing results (PhyStats agreed to publish likelihood functions). The RooFit workspace can be used for this.
Line: 197 to 198
 myWorkspace = (RooWorkspace*) _file0->Get("myWorkspace"); // Print the workspace contents. myWorkspace->Print();
Added:
>
>
// Example printout:

variables --------- (x,m,s)

p.d.f.s ------- RooGaussian::g[ x=x mean=m sigma=s ] = 0

datasets -------- RooDataSet::d(x)

// Import the variable saved as x.
RooRealVar* myVariable = myWorkspace->var("x");
// Import the PDF saved as g.
RooAbsPdf* myPDF = myWorkspace->pdf("g");
// Import the data saved as d.
RooAbsData* myData = myWorkspace->data("d");

 

Line: 210 to 232
 myWorkspace->Print(); // Import the PDF. RooAbsPdf* myPDF = myWorkspace->pdf("model_BR5_MSSM_signal90");
Changed:
<
<
// Import the variable for the observable.
>
>
// Import the variable representing the observable.
 RooRealVar* myObservable = myWorkspace->var("obs"); // Create a RooPlot frame using the imported variable. RooPlot* myFrame = myObservable->frame();
Line: 218 to 240
 myPDF->plotOn(myFrame); // Draw the RooPlot. myFrame->Draw();
Deleted:
<
<

 
Added:
>
>

 

RooStats

Revision 132011-08-05 - MichaelWright

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
-- WilliamBreadenMadden - 2011-08-03

Revision 122011-08-05 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
-- WilliamBreadenMadden - 2011-08-03
Line: 186 to 186
 // Create a Gaussian PDF using the Workspace Factory (this is essentially the shorthand for creating a Gaussian). RooWorkspace* myWorkspace = new RooWorkspace("myWorkspace"); myWorkspace->factory("Gaussian::g(x[-5, 5], mu[0], sigma[1])");
Added:
>
>

Example code: What's in the workspace?

// Open the appropriate ROOT file.
root -l myFile.root
// Import the workspace.
myWorkspace = (RooWorkspace*) _file0->Get("myWorkspace");
// Print the workspace contents.
myWorkspace->Print();

 
Added:
>
>

Example code: accessing the workspace

// Open the appropriate ROOT file.
root -l BR5_MSSM_signal90_combined_datastat_model.root
// Import the workspace.
myWorkspace = (RooWorkspace*) _file0->Get("combined");
// Print the workspace contents.
myWorkspace->Print();
// Import the PDF.
RooAbsPdf* myPDF = myWorkspace->pdf("model_BR5_MSSM_signal90");
// Import the variable for the observable.
RooRealVar* myObservable = myWorkspace->var("obs");
// Create a RooPlot frame using the imported variable.
RooPlot* myFrame = myObservable->frame();
// Plot the PDF on the created RooPlot frame.
myPDF->plotOn(myFrame);
// Draw the RooPlot.
myFrame->Draw();

 

Added:
>
>
 

RooStats

RooStats provides tools for high-level statistics questions in ROOT. It builds on RooFit, which provides basic building blocks for statistical questions.

Revision 112011-08-05 - MichaelWright

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
-- WilliamBreadenMadden - 2011-08-03

Revision 102011-08-03 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- WilliamBreadenMadden - 2011-07-11
>
>
-- WilliamBreadenMadden - 2011-08-03
 
Line: 306 to 306
  Wiki
Changed:
<
<
RooStats User's Guide (draft)
>
>
RooStats User's Guide
  Tutorials
Added:
>
>
HistFactory manual

E-group for support with ATLAS-sensitive information: <atlas-phys-stat-root@cern.ch>

E-mail support for software issues, bugs etc.: <roostats-development@cern.ch>

 

ROOT links:

ROOT User's Guide

Revision 92011-07-12 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
-- WilliamBreadenMadden - 2011-07-11
Line: 134 to 134
 

PDFs

Changed:
<
<
One of the things that makes RooFit PDFs flexible (if perhaps counterintuitive) is that they have no built-in notion of which variable is the observable. For example, a Gaussian has x, the mean and sigma, while a PDF is always normalised to unity. Over which variable is the integration performed in order to get 1? For the Gaussian, it is x, by convention; the mean and sigma are parameters of the model. RooFit does not treat x as special; x, the mean and sigma are all on an equal footing.
>
>
One of the things that makes RooFit PDFs flexible (if perhaps counterintuitive) is that they have no built-in notion of which variable is the observable. For example, a Gaussian has x, the mean and sigma, while a PDF is always normalised to unity. Over which variable is the integration performed in order to get 1? For the Gaussian, it is x, by convention; the mean and sigma are parameters of the model. RooFit does not treat x as special; x, the mean and sigma are all on an equal footing. You can tell it that x is the variable to normalise over.

So, RooGaussian has no intrinsic notion of distinction between observables and parameters. The choice of observables (for unit normalisation) is always passed to gauss.getVal().

What is the value of this? In a nutshell, it allows one to do Bayesian stuff very easily.

Bayes' theorem relates the probability of b given a to the probability of a given b. Normally, one might say that the mean of a Gaussian is 1, which then gives a distribution for x. However, given a dataset for x, one might want to know the posterior for the mean. RooFit's ability to change which variable a function is treated as a probability density in is very useful for Bayesian work. Because RooFit allows this kind of composition, the thing that acts as x in the Gaussian could actually be a function of another variable, say x^2. One picks up a Jacobian factor in going from x to x^2, so the original normalisation no longer holds. Sometimes RooFit can figure out what the Jacobian factor should be and sometimes it resorts to numerical integration.
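
The choice of normalisation variable described above can be illustrated numerically. The following is a minimal sketch in standard C++ only, not RooFit code: `gaussShape`, `integrate`, `normalisedOverX` and `normalisedOverSigma` are hypothetical names, and the trapezoidal rule with a fixed number of steps is an assumption chosen for simplicity. The same unnormalised Gaussian shape yields different values depending on whether it is normalised over x or over sigma.

```cpp
#include <cassert>
#include <cmath>

// Unnormalised Gaussian shape; x, mean and sigma are on an equal footing here.
double gaussShape(double x, double mean, double sigma)
{
    const double z = (x - mean) / sigma;
    return std::exp(-0.5 * z * z);
}

// Trapezoidal integration of f over [lo, hi] using the given number of steps.
template <typename F>
double integrate(F f, double lo, double hi, int steps)
{
    assert(steps > 0 && hi > lo);
    const double h = (hi - lo) / steps;
    double sum = 0.5 * (f(lo) + f(hi));
    for (int i = 1; i < steps; ++i) sum += f(lo + i * h);
    return sum * h;
}

// Value of the shape when x is treated as the observable on [xMin, xMax].
double normalisedOverX(double x, double mean, double sigma, double xMin, double xMax)
{
    const double norm = integrate(
        [=](double t) { return gaussShape(t, mean, sigma); }, xMin, xMax, 10000);
    return gaussShape(x, mean, sigma) / norm;
}

// Value of the shape when sigma is treated as the observable on [sMin, sMax].
double normalisedOverSigma(double x, double mean, double sigma, double sMin, double sMax)
{
    const double norm = integrate(
        [=](double s) { return gaussShape(x, mean, s); }, sMin, sMax, 10000);
    return gaussShape(x, mean, sigma) / norm;
}
```

For x = 0, mean = 0 and sigma = 1, normalising over x on [-10, 10] gives approximately 1/sqrt(2*pi), while normalising over sigma on [0.1, 20] gives 1/19.9, because the shape equals 1 for every sigma when x equals the mean.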

 

Example code: create a Gaussian PDF using RooStats and plot it using the RooPlot class

Line: 155 to 161
 

Added:
>
>

Example code: telling a RooFit PDF what to normalise over

// Not normalised (i.e., this is not a PDF):
gauss.getVal();
// Hey, RooFit! This is the thing to normalise over (i.e., guarantees Int[xmin, xmax] Gauss(x, m, s)dx == 1):
gauss.getVal(x);
// What is the value if sigma is considered the observable? (i.e., guarantees Int[smin, smax] Gauss(x, m, s)ds == 1):
gauss.getVal(sigma);

 Composite functions correspond to composite objects.

The RooFit workspace

Line: 196 to 213
 

Added:
>
>

Importing data (how to populate datasets from histograms and TTrees

Importing data from ROOT trees

Importing data from ROOT TH histogram objects (take a histogram and map it to a binned data set)

 

Fitting a PDF to unbinned data

Revision 82011-07-12 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
-- WilliamBreadenMadden - 2011-07-11

Revision 72011-07-12 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
-- WilliamBreadenMadden - 2011-07-11
Line: 10 to 10
 
Changed:
<
<

A note on formatting

>
>

A note on code and formatting

 
Changed:
<
<
Example code is generally indented and given highlighting. Explanatory code and text directly relating to it has yellow highlighting while example code that one might run directly in ROOT is given a green-text-on-black-background console style. Scripts are given grey highlighting.
>
>
Example code is generally indented and given highlighting. Explanatory code and text directly relating to it has yellow highlighting while example code that one might run directly in ROOT is given a green-text-on-black-background console style. Scripts are given grey highlighting. So, in a nutshell, code segments are in yellow, while full scripts are in terminal. In the code examples, user-created objects are generally prefixed with "my" (for example, "myData") for clarity.
 

What is RooStats?

Line: 130 to 130
 

Added:
>
>
A RooPlot is essentially an empty frame that is capable of holding anything plotted versus its variable.

PDFs

One of the things that makes RooFit PDFs flexible (if perhaps counterintuitive) is that they have no built-in notion of which variable is the observable. For example, a Gaussian has x, the mean and sigma, while a PDF is always normalised to unity. Over which variable is the integration performed in order to get 1? For the Gaussian, it is x, by convention; the mean and sigma are parameters of the model. RooFit does not treat x as special; x, the mean and sigma are all on an equal footing.

 
Changed:
<
<

Example code: create a Gaussian PDF using RooStats and plot it

>
>

Example code: create a Gaussian PDF using RooStats and plot it using the RooPlot class

 

{ // Build a Gaussian PDF.

Line: 155 to 161
  The RooFit "workspace" provides the ability to store the full likelihood model, any desired priors and the minimal data necessary to reproduce the likelihood function in a ROOT file. Thus, the workspace is needed for combinations and has potential for digitally publishing results (PhyStats agreed to publish likelihood functions). The RooFit workspace can be used for this.
Added:
>
>
One might create a Gaussian PDF and then import it into a workspace. All of the dependencies of the Gaussian shall be drawn in and owned by the workspace. There are no nightmarish ownership problems. Alternatively, one might simply create the Gaussian inside the workspace using the Workspace Factory.

Example code: using the Workspace Factory to create a Gaussian PDF

// Create a Gaussian PDF using the Workspace Factory (this is essentially the shorthand for creating a Gaussian).
RooWorkspace* myWorkspace = new RooWorkspace("myWorkspace");
myWorkspace->factory("Gaussian::g(x[-5, 5], mu[0], sigma[1])");

 

RooStats

RooStats provides tools for high-level statistics questions in ROOT. It builds on RooFit, which provides basic building blocks for statistical questions.

The main goal is to standardise the interface for major statistical procedures so that they can work on an arbitrary RooFit model and dataset and handle any parameters of interest and nuisance parameters. Another goal is to implement most accepted techniques from frequentist, Bayesian and likelihood based approaches. A further goal is to provide utilities to perform combined measurements.

Added:
>
>

Datasets

In RooFit, data can be stored in an unbinned or binned manner. Unbinned data is stored using the RooDataSet class while binned data is stored using the RooDataHist class. The user can define how many bins there are in a variable. For the purposes of plotting using the RooPlot class, a RooDataSet is binned into a histogram.

Example code: generating toy Monte Carlo, storing it as unbinned data and then plotting it

{
  // Create a RooDataSet and fill it with generated toy Monte Carlo data:
  RooDataSet* myData = gauss.generate(x, 100);
  // Plot the dataset.
  RooPlot* myFrame = x.frame();
  myData->plotOn(myFrame);
  myFrame->Draw();
}

Fitting a PDF to unbinned data

Example code: generating toy Monte Carlo, storing it as unbinned data and then plotting it

// Fit gauss to unbinned data gauss.fitTo(*myData);

 

Example code: create a simple model using the RooFit Workspace Factory. Specify parts of the model using ModelConfig. Create a simple dataset. Complete a confidence interval test using the ProfileLikelihoodCalculator of RooStats

Revision 62011-07-12 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
-- WilliamBreadenMadden - 2011-07-11
Changed:
<
<

Higgs analysis at ATLAS using RooStats

>
>
 
Changed:
<
<
This page contains basic information on getting started with Higgs analysis at ATLAS using RooStats.
>
>

Higgs analysis at ATLAS using RooStats

This page contains basic information on getting started with Higgs analysis at ATLAS using RooStats.

 
Line: 12 to 14
  Example code is generally indented and given highlighting. Explanatory code and text directly relating to it has yellow highlighting while example code that one might run directly in ROOT is given a green-text-on-black-background console style. Scripts are given grey highlighting.
Changed:
<
<

What is RooStats?

>
>

What is RooStats?

 
Changed:
<
<
RooStats is a project to create statistical tools built on top of the RooFit library, which is a data modelling toolkit. It is distributed in ROOT. Specifically, it has been distributed in the ROOT release since version 5.22 (December 2008). The latest version of ROOT is recommended as the RooStats project is developing quickly.
>
>
RooStats is a project to create statistical tools built on top of the RooFit library, which is a data modelling toolkit. It is distributed in ROOT. Specifically, it has been distributed in the ROOT release since version 5.22 (December 2008). The latest version of ROOT is recommended as the RooStats project is developing quickly.
 

Using the appropriate version of ROOT at Glasgow

There are instructions on how to use the different versions of ROOT at Glasgow here.

Changed:
<
<

Personally acquiring RooStats

>
>

Personally acquiring RooStats

 
Changed:
<
<
There are three main options available for acquiring ROOT with RooStats included.
>
>
There are three main options available for acquiring ROOT with RooStats included.
 

Option 1: Download the latest ROOT release binaries.

Line: 34 to 36
  Follow the appropriate instructions here to build it.
Deleted:
<
<
Example script: building ROOT from source in Ubuntu
 
Changed:
<
<

Shell script: building ROOT with RooFit and RooStats

#!/bin/bash
>
>

Shell script: building ROOT with RooFit and RooStats

#!/bin/bash

 
Changed:
<
<
# This script builds the latest version of ROOT with RooFit and RooStats in Ubuntu.
>
>
# This script builds the latest version of ROOT with RooFit and RooStats in Ubuntu.
  # First, the ROOT prerequisites are installed, # then, the most common ROOT optional packages are installed.
Line: 92 to 93
  # The following line could be added to the ~/.bashrc file: # export PATH=$PATH:/home/wbm/root/bin
Deleted:
<
<
 
Changed:
<
<

Option 3: Build the RooStats branch.

>
>

Option 3: Build the RooStats branch.

 
Changed:
<
<
Do this if you want the latest development of RooStats (that has not yet been incorporated into a ROOT version).
>
>
Do this if you want the latest development of RooStats (that has not yet been incorporated into a ROOT version).
  The necessary instructions can be found here.
Changed:
<
<

RooFit

>
>

RooFit

 
Changed:
<
<
The RooFit library provides a toolkit for modelling the expected distribution of events in a physics analysis. Models can be used to perform unbinned maximum likelihood fits, produce plots and generate "toy Monte Carlo" samples for various studies.
>
>
The RooFit library provides a toolkit for modelling the expected distribution of events in a physics analysis. Models can be used to perform unbinned maximum likelihood fits, produce plots and generate "toy Monte Carlo" samples for various studies.
 
Changed:
<
<
The core functionality of RooFit is to enable the modelling of 'event data' distributions, where each event is a discrete occurrence in time and has one or more measured observables associated with it. Experiments of this nature result in datasets obeying Poisson (or binomial) statistics. The natural modelling language for such distributions is the probability density function F(x;p), which describes the probability density of the observables x in terms of the parameters p.
>
>
The core functionality of RooFit is to enable the modelling of 'event data' distributions, where each event is a discrete occurrence in time and has one or more measured observables associated with it. Experiments of this nature result in datasets obeying Poisson (or binomial) statistics. The natural modelling language for such distributions is the probability density function F(x;p), which describes the probability density of the observables x in terms of the parameters p.
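
As a sketch of what this means in practice, the two standard relations behind an unbinned maximum likelihood fit are written out below. The symbols N (number of events) and x_i (the observables of event i) are notation introduced here for illustration; the extended (Poisson) term that can be added for the total event count is left out.

```latex
% Normalisation of the probability density over the observables x
\int F(x; p)\, dx = 1

% Unbinned likelihood for N independent events x_1, ..., x_N,
% and the negative log-likelihood that a fit minimises with respect to p
L(p) = \prod_{i=1}^{N} F(x_i; p),
\qquad
-\ln L(p) = -\sum_{i=1}^{N} \ln F(x_i; p)
```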
 
Changed:
<
<
In RooFit, every variable, data point, function and PDF is represented in a C++ object. So, for example, in constructing a RooFit model, the mathematical components of the model map to separate C++ objects. Objects are classified by the data or function type that they represent, not by their respective role in a particular setup. All objects are self-documenting. The name of an object is a unique identifier for the object, while the title of an object is a more elaborate description of it.
>
>
In RooFit, every variable, data point, function and PDF is represented in a C++ object. So, for example, in constructing a RooFit model, the mathematical components of the model map to separate C++ objects. Objects are classified by the data or function type that they represent, not by their respective role in a particular setup. All objects are self-documenting. The name of an object is a unique identifier for the object, while the title of an object is a more elaborate description of it.
 
Changed:
<
<
Here are a few examples of mathematical concepts that correspond to various RooFit classes:
>
>
Here are a few examples of mathematical concepts that correspond to various RooFit classes:
 
Mathematical concept RooFit class
Changed:
<
<
variable RooRealVar
function RooAbsReal
PDF RooAbsPdf
space point (set of parameters) RooArgSet
list of space points RooAbsData
integral RooRealIntegral

---+++ __Example code: defining a RooFit variable__

>
>
variable RooRealVar
function RooAbsReal
PDF RooAbsPdf
space point (set of parameters) RooArgSet
list of space points RooAbsData
integral RooRealIntegral
 
Added:
>
>

Example code: defining a RooFit variable

 General form for defining a RooFit variable:
Changed:
<
<
RooRealVar x(, , , , )
>
>
RooRealVar x(<object name>, <object title>, <value>, <minimum value>, <maximum value>)
 Specific example for defining a RooFit variable x with the value 5: RooRealVar x("x", "x observable", 5, -10, 10)
Changed:
<
<

Example code: defining a RooFit variable

General form for defining a RooFit variable:
RooRealVar x(<object name>, <object title>, <value>, <minimum value>, <maximum value>)
Specific example for defining a RooFit variable x with the value 5:
RooRealVar x("x", "x observable", 5, -10, 10)
>
>

 
Changed:
<
<

Example code: create a Gaussian PDF using RooStats and plot it

{
>
>

Example code: create a Gaussian PDF using RooStats and plot it

{

  // Build a Gaussian PDF. RooRealVar x("x", "x", -10, 10); RooRealVar mean("mean", "mean of Gaussian", 0, -10, 10);
Line: 152 to 146
  gauss.plotOn(xframe); xframe->Draw(); }
Changed:
<
<
>
>

  Composite functions correspond to composite objects.
Changed:
<
<

The RooFit workspace

>
>

The RooFit workspace

 
Changed:
<
<
The RooFit "workspace" provides the ability to store the full likelihood model, any desired priors and the minimal data necessary to reproduce the likelihood function in a ROOT file. Thus, the workspace is needed for combinations and has potential for digitally publishing results (PhyStats agreed to publish likelihood functions). The RooFit workspace can be used for this.
>
>
The RooFit "workspace" provides the ability to store the full likelihood model, any desired priors and the minimal data necessary to reproduce the likelihood function in a ROOT file. Thus, the workspace is needed for combinations and has potential for digitally publishing results (PhyStats agreed to publish likelihood functions). The RooFit workspace can be used for this.
 
Changed:
<
<

RooStats

>
>

RooStats

 
Changed:
<
<
RooStats provides tools for high-level statistics questions in ROOT. It builds on RooFit, which provides basic building blocks for statistical questions.
>
>
RooStats provides tools for high-level statistics questions in ROOT. It builds on RooFit, which provides basic building blocks for statistical questions.
 
Changed:
<
<
The main goal is to standardise the interface for major statistical procedures so that they can work on an arbitrary RooFit model and dataset and handle any parameters of interest and nuisance parameters. Another goal is to implement most accepted techniques from frequentist, Bayesian and likelihood based approaches. A further goal is to provide utilities to perform combined measurements.
>
>
The main goal is to standardise the interface for major statistical procedures so that they can work on an arbitrary RooFit model and dataset and handle any parameters of interest and nuisance parameters. Another goal is to implement most accepted techniques from frequentist, Bayesian and likelihood based approaches. A further goal is to provide utilities to perform combined measurements.
 
Changed:
<
<

Example code: create a simple model using the RooFit Workspace Factory. Specify parts of the model using ModelConfig. Create a simple dataset. Complete a confidence interval test using the ProfileLikelihoodCalculator of RooStats

{
>
>

Example code: create a simple model using the RooFit Workspace Factory. Specify parts of the model using ModelConfig. Create a simple dataset. Complete a confidence interval test using the ProfileLikelihoodCalculator of RooStats

{

  // In this script, a simple model is created using the Workspace Factory in RooFit.
  // ModelConfig is used to specify the parts of the model necessary for the statistical tools of RooStats.
  // A 95% confidence interval test is run using the ProfileLikelihoodCalculator of RooStats.
Line: 228 to 223
         cout << "No" << endl;
      }
}
Changed:
<
<
>
>

 

Further information

Changed:
<
<

RooFit links:

>
>

RooFit links:

  User's Manual

Tutorials

Changed:
<
<

RooStats links:

>
>

RooStats links:

  Wiki
Line: 251 to 246
  ROOT User's Guide
Added:
>
>
 -- WilliamBreadenMadden - 2010-10-29

Revision 52011-07-12 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
-- WilliamBreadenMadden - 2011-07-11
Line: 37 to 37
 Example script: building ROOT from source in Ubuntu

Changed:
<
<

Shell script: building ROOT with RooFit and RooStats

#!/bin/bash
 
# This script builds the latest version of ROOT with RooFit and RooStats in Ubuntu.
 
# First, the ROOT prerequisites are installed,
# then, the most common ROOT optional packages are installed.
# Next, the latest version of ROOT in the CERN Subversion repository is checked out.
# Finally, ROOT is compiled.
 
# Install the ROOT prerequisites.
sudo apt-get install subversion
sudo apt-get install make
sudo apt-get install g++
sudo apt-get install gcc
sudo apt-get install binutils
sudo apt-get install libx11-dev
sudo apt-get install libxpm-dev
sudo apt-get install libxft-dev
sudo apt-get install libxext-dev
 
# Install the optional ROOT packages.
sudo apt-get install gfortran
sudo apt-get install ncurses-dev
sudo apt-get install libpcre3-dev
sudo apt-get install xlibmesa-glu-dev
sudo apt-get install libglew1.5-dev
sudo apt-get install libftgl-dev
sudo apt-get install libmysqlclient-dev
sudo apt-get install libfftw3-dev
sudo apt-get install cfitsio-dev
sudo apt-get install graphviz-dev
sudo apt-get install libavahi-compat-libdnssd-dev
sudo apt-get install libldap-dev
sudo apt-get install python-dev
sudo apt-get install libxml2-dev
sudo apt-get install libssl-dev
sudo apt-get install libgsl0-dev
 
# Check out latest ROOT trunk.
svn co http://root.cern.ch/svn/root/trunk ~/root
 
# The configuration for the build is set.
cd ~/root
# Run this to define the system architecture and to enable building of the libRooFit advanced fitting package:
./configure linuxx8664gcc --enable-roofit
# See other possible configurations using the following command: ./configure --help
# Start compiling.
make
# Upon completion, ROOT should be able to run by executing ~/root/bin/root.
 
# You might want to add the following line to the ~/.bashrc file:
# export PATH=$PATH:/home/wbm/root/bin
>
>

Shell script: building ROOT with RooFit and RooStats

#!/bin/bash

# This script builds the latest version of ROOT with !RooFit and !RooStats in Ubuntu.

# First, the ROOT prerequisites are installed,
# then, the most common ROOT optional packages are installed.
# Next, the latest version of ROOT in the CERN Subversion repository is checked out.
# Finally, ROOT is compiled.

# Install the  ROOT prerequisites.
   sudo apt-get install subversion
   sudo apt-get install make
   sudo apt-get install g++
   sudo apt-get install gcc
   sudo apt-get install binutils
   sudo apt-get install libx11-dev
   sudo apt-get install libxpm-dev
   sudo apt-get install libxft-dev
   sudo apt-get install libxext-dev

# Install the optional ROOT packages.
   sudo apt-get install gfortran
   sudo apt-get install ncurses-dev
   sudo apt-get install libpcre3-dev
   sudo apt-get install xlibmesa-glu-dev
   sudo apt-get install libglew1.5-dev
   sudo apt-get install libftgl-dev
   sudo apt-get install libmysqlclient-dev
   sudo apt-get install libfftw3-dev
   sudo apt-get install cfitsio-dev
   sudo apt-get install graphviz-dev
   sudo apt-get install libavahi-compat-libdnssd-dev
   sudo apt-get install libldap-dev
   sudo apt-get install python-dev
   sudo apt-get install libxml2-dev
   sudo apt-get install libssl-dev
   sudo apt-get install libgsl0-dev

# Check out latest ROOT trunk.
   svn co http://root.cern.ch/svn/root/trunk ~/root

# The configuration for the build is set.
   cd ~/root
   # Run this to define the system architecture and to enable building of the libRooFit advanced fitting package:
      ./configure linuxx8664gcc --enable-roofit
   # See other possible configurations using the following command: ./configure --help

# Start compiling.
   make

# Upon completion, ROOT is run by executing ~/root/bin/root.

# The following line could be added to the ~/.bashrc file:
#   export PATH=$PATH:/home/wbm/root/bin
 

Option 3: Build the RooStats branch.

Line: 117 to 119
 
list of space points RooAbsData
integral RooRealIntegral
Added:
>
>
---+++ __Example code: defining a RooFit variable__

General form for defining a RooFit variable:
RooRealVar x(<object name>, <object title>, <value>, <minimum value>, <maximum value>)
Specific example for defining a RooFit variable x with the value 5:
RooRealVar x("x", "x observable", 5, -10, 10)

 

Example code: defining a RooFit variable

Changed:
<
<
General form for defining a RooFit variable:
RooRealVar x(<object name>, <object title>, <value>, <lower range>, <upper range>);
Specific example for defining a RooFit variable x with the value 5:
RooRealVar x("x", "x observable", 5, -10, 10);
>
>
General form for defining a RooFit variable:
RooRealVar x(<object name>, <object title>, <value>, <minimum value>, <maximum value>)
Specific example for defining a RooFit variable x with the value 5:
RooRealVar x("x", "x observable", 5, -10, 10)

Example code: create a Gaussian PDF using RooFit and plot it

{
   // Build a Gaussian PDF.
   RooRealVar x("x", "x", -10, 10);
   RooRealVar mean("mean", "mean of Gaussian", 0, -10, 10);
   RooRealVar sigma("sigma", "width of Gaussian", 3);
   
   RooGaussian gauss("gauss", "Gaussian PDF", x, mean, sigma);
   
   // Plot the PDF.
   RooPlot* xframe = x.frame();
   gauss.plotOn(xframe);
   xframe->Draw();
}
 

Composite functions correspond to composite objects.

Line: 138 to 167
  The main goal is to standardise the interface for major statistical procedures so that they can work on an arbitrary RooFit model and dataset and handle any parameters of interest and nuisance parameters. Another goal is to implement most accepted techniques from frequentist, Bayesian and likelihood based approaches. A further goal is to provide utilities to perform combined measurements.
Added:
>
>

Example code: create a simple model using the RooFit Workspace Factory. Specify parts of the model using ModelConfig. Create a simple dataset. Complete a confidence interval test using the ProfileLikelihoodCalculator of RooStats

{
   // In this script, a simple model is created using the Workspace Factory in RooFit.
   // ModelConfig is used to specify the parts of the model necessary for the statistical tools of RooStats.
   // A 95% confidence interval test is run using the ProfileLikelihoodCalculator of RooStats.

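   // Bring the RooStats classes used below (ModelConfig, ProfileLikelihoodCalculator, LikelihoodInterval) into scope.
   using namespace RooStats;

   // Define a RooFit random seed in order to produce reproducible results.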
   RooRandom::randomGenerator()->SetSeed(271);
   
   // Make a simple model using the Workspace Factory.
      // Create a new workspace.
      RooWorkspace* myWorkspace = new RooWorkspace();
      // Create the PDF G(x|mu,1) and the variables x, mu and sigma in one command using the factory syntax.
      myWorkspace->factory("Gaussian::normal(x[-10,10], mu[-1,1], sigma[1])");
      // Define parameter sets for observables and parameters of interest.
      myWorkspace->defineSet("poi","mu");
      myWorkspace->defineSet("obs","x");
   // Print the workspace contents.
      myWorkspace->Print() ;
   // Specify for the statistical tools the components of the defined model.
      // Create a new ModelConfig.
      ModelConfig* myModelConfig = new ModelConfig("my G(x|mu,1)");
      // Specify the workspace.
      myModelConfig->SetWorkspace(*myWorkspace);
      // Specify the PDF.
      myModelConfig->SetPdf(*myWorkspace->pdf("normal"));
      // Specify the parameters of interest.
      myModelConfig->SetParametersOfInterest(*myWorkspace->set("poi"));
      // Specify the observables.
      myModelConfig->SetObservables(*myWorkspace->set("obs"));
   // Create a toy dataset.
      // Create a toy dataset of 100 measurements of the observables (x).
      RooDataSet* myData = myWorkspace->pdf("normal")->generate(*myWorkspace->set("obs"), 100);
      //myData->print();
   // Use the ProfileLikelihoodCalculator to obtain a 95% confidence interval.
      // Specify the confidence level required.
      double confidenceLevel = 0.95;
      // Create an instance of the ProfileLikelihoodCalculator, specifying the data and the ModelConfig for it.
      ProfileLikelihoodCalculator myProfileLikelihoodCalculator(*myData, *myModelConfig);
      // Set the confidence level.
      myProfileLikelihoodCalculator.SetConfidenceLevel(confidenceLevel);
      // Obtain the resulting interval.
      LikelihoodInterval* myProfileLikelihoodInterval = myProfileLikelihoodCalculator.GetInterval();
      // Use this interval result. In this case, it makes sense to say what the lower and upper limits are.
         // Define the object variables for the purposes of the confidence interval.
         RooRealVar* x = myWorkspace->var("x");
         RooRealVar* mu = myWorkspace->var("mu");
      cout << "The profile likelihood calculator interval is [ "<<
         myProfileLikelihoodInterval->LowerLimit(*mu) << ", " <<
         myProfileLikelihoodInterval->UpperLimit(*mu) << "] " << endl;
      // Set mu equal to zero.
      mu->setVal(0);
      // Is mu in the interval?
      cout << "Is mu = 0 in the interval?" << endl;
      if (myProfileLikelihoodInterval->IsInInterval(RooArgSet(*mu))){
         cout << "Yes" << endl;
      } else{
         cout << "No" << endl;
      }
}
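
For a model with a single parameter of interest and no nuisance parameters, a profile likelihood interval reduces to the set of parameter values for which -2 ln(lambda) stays below a chi-square threshold (about 3.84 for 95% with one degree of freedom). The following is a minimal sketch of that idea in plain C++, deliberately without RooStats, for a Gaussian with known sigma. The function names (sampleMean, minusTwoLogLambda, findCrossing, likelihoodInterval95) are hypothetical and are not part of RooStats; the scan range arguments play the role of the range given to mu in the workspace.

```cpp
#include <cmath>
#include <utility>
#include <vector>

// Average of the measurements; for a Gaussian with fixed sigma this is
// the maximum likelihood estimate of the mean.
double sampleMean(const std::vector<double>& events) {
    double sum = 0.0;
    for (double x : events) sum += x;
    return sum / static_cast<double>(events.size());
}

// -2 ln(lambda(mu)) for a Gaussian with known sigma.
// It is zero at the best-fit mean and grows as mu moves away from it.
double minusTwoLogLambda(const std::vector<double>& events, double mu, double sigma) {
    const double muHat = sampleMean(events);
    double atMu = 0.0;
    double atBest = 0.0;
    for (double x : events) {
        atMu += (x - mu) * (x - mu);
        atBest += (x - muHat) * (x - muHat);
    }
    return (atMu - atBest) / (sigma * sigma);
}

// Bisection between a point inside the interval and a point outside it.
// If "outside" is actually still below the threshold, the result
// converges to "outside", i.e. the interval is cut off at the scan range.
double findCrossing(const std::vector<double>& events, double sigma,
                    double inside, double outside, double threshold) {
    for (int i = 0; i < 200; ++i) {
        const double mid = 0.5 * (inside + outside);
        if (minusTwoLogLambda(events, mid, sigma) < threshold) {
            inside = mid;
        } else {
            outside = mid;
        }
    }
    return 0.5 * (inside + outside);
}

// 95% likelihood interval for the mean, scanned within [muMin, muMax].
std::pair<double, double> likelihoodInterval95(const std::vector<double>& events,
                                               double sigma, double muMin, double muMax) {
    const double threshold = 3.841458820694124;  // 95% quantile of chi-square, 1 d.o.f.
    const double muHat = sampleMean(events);
    return {findCrossing(events, sigma, muHat, muMin, threshold),
            findCrossing(events, sigma, muHat, muMax, threshold)};
}
```

Calling likelihoodInterval95(events, 1.0, -10.0, 10.0) returns the lower and upper limits as a pair; for this simple model they coincide with the sample mean plus or minus about 1.96 sigma / sqrt(N).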
 

Further information

RooFit links:

Revision 42011-07-11 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<
-- WilliamBreadenMadden - 2010-10-21
>
>
-- WilliamBreadenMadden - 2011-07-11
 
Changed:
<
<

Higgs analysis at ATLAS using RooStats

>
>

Higgs analysis at ATLAS using RooStats

 
Changed:
<
<
This page contains basic information on getting started with Higgs analysis at ATLAS using RooStats. The information will be updated in due course.
>
>
This page contains basic information on getting started with Higgs analysis at ATLAS using RooStats.
 
Changed:
<
<

What is RooStats?

>
>
 
Changed:
<
<
RooStats is a project to create statistical tools built on top of the RooFit library (a data modeling toolkit). It is distributed in ROOT. Specifically, it has been distributed in the ROOT release since version 5.22 (December 2008), so ROOT 5.22 or later is required to use RooStats. The latest release of ROOT is development version 5.27/06. The latest version of ROOT is recommended as the RooStats project is developing quickly.
>
>

A note on formatting

 
Changed:
<
<

How do I use the appropriate version of ROOT at Glasgow?

>
>
Example code is generally indented and highlighted. Explanatory code and text directly relating to it have yellow highlighting, while example code that one might run directly in ROOT is given a green-text-on-black-background console style. Scripts are given grey highlighting.
 
Changed:
<
<
There are instructions on how to use the different versions of ROOT at Glasgow here.

How do I personally get RooStats?

>
>

What is RooStats?

 
Changed:
<
<
There are three options:
>
>
RooStats is a project to create statistical tools built on top of the RooFit library, which is a data modelling toolkit. It is distributed in ROOT. Specifically, it has been distributed in the ROOT release since version 5.22 (December 2008). The latest version of ROOT is recommended as the RooStats project is developing quickly.
 
Changed:
<
<
Option 1: Download the latest ROOT release.
>
>

Using the appropriate version of ROOT at Glasgow

 
Changed:
<
<
Do this if you want to install ROOT from binaries.
>
>
There are instructions on how to use the different versions of ROOT at Glasgow here.
 
Changed:
<
<
http://root.cern.ch/drupal/content/downloading-root
>
>

Personally acquiring RooStats

 
Changed:
<
<
Latest release of ROOT, dev. version 5.27/06 for Scientific Linux:
>
>
There are three main options available for acquiring ROOT with RooStats included.
 
Changed:
<
<
ftp://root.cern.ch/root/root_v5.27.06.Linux-slc5-gcc4.3.tar.gz
>
>

Option 1: Download the latest ROOT release binaries.

 
Changed:
<
<
Option 2: Build the ROOT trunk.
>
>
The latest ROOT binaries for various operating systems are accessible here.
 
Changed:
<
<
Do this if you want to build ROOT from source.
>
>

Option 2: Build the ROOT trunk from source.

  ftp://root.cern.ch/root/root_v5.27.06.source.tar.gz
Changed:
<
<
Follow the appropriate instructions to build it:
>
>
Follow the appropriate instructions here to build it.

Example script: building ROOT from source in Ubuntu

 
Changed:
<
<
http://root.cern.ch/drupal/content/installing-root-source
>
>

Shell script: building ROOT with RooFit and RooStats

#!/bin/bash
 
# This script builds the latest version of ROOT with RooFit and RooStats in Ubuntu.
 
# First, the ROOT prerequisites are installed,
# then, the most common ROOT optional packages are installed.
# Next, the latest version of ROOT in the CERN Subversion repository is checked out.
# Finally, ROOT is compiled.
 
# Install the ROOT prerequisites.
sudo apt-get install subversion
sudo apt-get install make
sudo apt-get install g++
sudo apt-get install gcc
sudo apt-get install binutils
sudo apt-get install libx11-dev
sudo apt-get install libxpm-dev
sudo apt-get install libxft-dev
sudo apt-get install libxext-dev
 
# Install the optional ROOT packages.
sudo apt-get install gfortran
sudo apt-get install ncurses-dev
sudo apt-get install libpcre3-dev
sudo apt-get install xlibmesa-glu-dev
sudo apt-get install libglew1.5-dev
sudo apt-get install libftgl-dev
sudo apt-get install libmysqlclient-dev
sudo apt-get install libfftw3-dev
sudo apt-get install cfitsio-dev
sudo apt-get install graphviz-dev
sudo apt-get install libavahi-compat-libdnssd-dev
sudo apt-get install libldap-dev
sudo apt-get install python-dev
sudo apt-get install libxml2-dev
sudo apt-get install libssl-dev
sudo apt-get install libgsl0-dev
 
# Check out latest ROOT trunk.
svn co http://root.cern.ch/svn/root/trunk ~/root
 
# The configuration for the build is set.
cd ~/root
# Run this to define the system architecture and to enable building of the libRooFit advanced fitting package:
./configure linuxx8664gcc --enable-roofit
# See other possible configurations using the following command: ./configure --help
# Start compiling.
make
# Upon completion, ROOT should be able to run by executing ~/root/bin/root.
 
# You might want to add the following line to the ~/.bashrc file:
# export PATH=$PATH:/home/wbm/root/bin

Option 3: Build the RooStats branch.

Do this if you want the latest development of RooStats (that has not yet been incorporated into a ROOT version).

The necessary instructions can be found here.

RooFit

The RooFit library provides a toolkit for modelling the expected distribution of events in a physics analysis. Models can be used to perform unbinned maximum likelihood fits, produce plots and generate "toy Monte Carlo" samples for various studies.

The core functionality of RooFit is to enable the modelling of 'event data' distributions, where each event is a discrete occurrence in time and has one or more measured observables associated with it. Experiments of this nature result in datasets obeying Poisson (or binomial) statistics. The natural modelling language for such distributions is the probability density function F(x;p), which describes the probability density of the observables x in terms of the parameters p.
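
As a rough illustration of what an unbinned maximum likelihood fit involves, the following is a minimal sketch in plain C++, deliberately without RooFit, for a Gaussian density F(x; mean, sigma). The function names gaussianNll and fitMean are hypothetical and are not part of RooFit, and the sketch assumes that sigma is known and fixed so that only the mean is fitted.

```cpp
#include <cmath>
#include <vector>

// Negative log-likelihood of an unbinned sample under a Gaussian model.
// Each event contributes -ln F(x_i; mean, sigma) to the sum.
double gaussianNll(const std::vector<double>& events, double mean, double sigma) {
    const double pi = 3.14159265358979323846;
    double nll = 0.0;
    for (double x : events) {
        const double z = (x - mean) / sigma;
        nll += 0.5 * z * z + std::log(sigma * std::sqrt(2.0 * pi));
    }
    return nll;
}

// Maximum likelihood estimate of the mean for fixed sigma.
// Minimising gaussianNll with respect to the mean gives the sample average.
double fitMean(const std::vector<double>& events) {
    double sum = 0.0;
    for (double x : events) sum += x;
    return sum / static_cast<double>(events.size());
}
```

In RooFit the same roles are played by the PDF object and its fitting machinery; the sketch only shows that the fitted parameter value is the one that minimises the summed negative log-likelihood over the events.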

In RooFit, every variable, data point, function and PDF is represented by a C++ object. So, for example, in constructing a RooFit model, the mathematical components of the model map to separate C++ objects. Objects are classified by the data or function type that they represent, not by their respective role in a particular setup. All objects are self-documenting. The name of an object is a unique identifier for the object, while the title of an object is a more elaborate description of the object.

Here are a few examples of mathematical concepts that correspond to various RooFit classes:

Mathematical concept RooFit class
variable RooRealVar
function RooAbsReal
PDF RooAbsPdf
space point (set of parameters) RooArgSet
list of space points RooAbsData
integral RooRealIntegral

Example code: defining a RooFit variable

General form for defining a RooFit variable:
RooRealVar x(<object name>, <object title>, <value>, <lower range>, <upper range>);
Specific example for defining a RooFit variable x with the value 5:
RooRealVar x("x", "x observable", 5, -10, 10);

Composite functions correspond to composite objects.

The RooFit workspace

 
Changed:
<
<
Option 3: Build the RooStats branch.
>
>
The RooFit "workspace" provides the ability to store the full likelihood model, any desired priors and the minimal data necessary to reproduce the likelihood function in a ROOT file. The workspace is therefore needed for combinations and has potential for digitally publishing results (PhyStats agreed to publish likelihood functions).
 
Changed:
<
<
Do this if you want the latest development of RooStats (that has not yet been incorporated into a ROOT version).
>
>

RooStats

 
Changed:
<
<
Follow these instructions:
>
>
RooStats provides tools for high-level statistics questions in ROOT. It builds on RooFit, which provides basic building blocks for statistical questions.
 
Changed:
<
<
https://twiki.cern.ch/twiki/bin/view/RooStats/DownloadAndInstallTheRooStatsBranch
>
>
The main goal is to standardise the interface for major statistical procedures so that they can work on an arbitrary RooFit model and dataset and handle any parameters of interest and nuisance parameters. Another goal is to implement most accepted techniques from frequentist, Bayesian and likelihood based approaches. A further goal is to provide utilities to perform combined measurements.
 

Further information

Changed:
<
<
RooFit links:
>
>

RooFit links:

  User's Manual

Tutorials

Changed:
<
<
RooStats links:
>
>

RooStats links:

  Wiki
Line: 62 to 154
  Tutorials
Changed:
<
<
ROOT links:
>
>

ROOT links:

  ROOT User's Guide

Revision 32010-10-29 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
-- WilliamBreadenMadden - 2010-10-21
Line: 10 to 10
  RooStats is a project to create statistical tools built on top of the RooFit library (a data modeling toolkit). It is distributed in ROOT. Specifically, it has been distributed in the ROOT release since version 5.22 (December 2008), so ROOT 5.22 or later is required to use RooStats. The latest release of ROOT is development version 5.27/06. The latest version of ROOT is recommended as the RooStats project is developing quickly.
Changed:
<
<

How do I get RooStats?

>
>

How do I use the appropriate version of ROOT at Glasgow?

There are instructions on how to use the different versions of ROOT at Glasgow here.

How do I personally get RooStats?

  There are three options:
Line: 62 to 66
  ROOT User's Guide
Deleted:
<
<
-- WilliamBreadenMadden - 2010-10-22
 \ No newline at end of file
Added:
>
>
-- WilliamBreadenMadden - 2010-10-29
 \ No newline at end of file

Revision 22010-10-22 - WilliamBreadenMadden

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Added:
>
>
-- WilliamBreadenMadden - 2010-10-21
 

Higgs analysis at ATLAS using RooStats

This page contains basic information on getting started with Higgs analysis at ATLAS using RooStats. The information will be updated in due course.

Line: 59 to 61
 
ROOT links:

ROOT User's Guide

Added:
>
>
-- WilliamBreadenMadden - 2010-10-22
 \ No newline at end of file

Revision 12010-10-21 - WilliamBreadenMadden

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="WebHome"

Higgs analysis at ATLAS using RooStats

This page contains basic information on getting started with Higgs analysis at ATLAS using RooStats. The information will be updated in due course.

What is RooStats?

RooStats is a project to create statistical tools built on top of the RooFit library (a data modeling toolkit). It is distributed in ROOT. Specifically, it has been distributed in the ROOT release since version 5.22 (December 2008), so ROOT 5.22 or later is required to use RooStats. The latest release of ROOT is development version 5.27/06. The latest version of ROOT is recommended as the RooStats project is developing quickly.

How do I get RooStats?

There are three options:

Option 1: Download the latest ROOT release.

Do this if you want to install ROOT from binaries.

http://root.cern.ch/drupal/content/downloading-root

Latest release of ROOT, dev. version 5.27/06 for Scientific Linux:

ftp://root.cern.ch/root/root_v5.27.06.Linux-slc5-gcc4.3.tar.gz

Option 2: Build the ROOT trunk.

Do this if you want to build ROOT from source.

ftp://root.cern.ch/root/root_v5.27.06.source.tar.gz

Follow the appropriate instructions to build it:

http://root.cern.ch/drupal/content/installing-root-source

Option 3: Build the RooStats branch.

Do this if you want the latest development of RooStats (that has not yet been incorporated into a ROOT version).

Follow these instructions:

https://twiki.cern.ch/twiki/bin/view/RooStats/DownloadAndInstallTheRooStatsBranch

Further information

RooFit links:

User's Manual

Tutorials

RooStats links:

Wiki

RooStats User's Guide (draft)

Tutorials

ROOT links:

ROOT User's Guide

 