A translation of the RooFit Users Manual, as seen in January 2016, into Python using ROOT 6.06/00 built with enable-roofit.
Tags should include: pyROOT, python, RooFit, RooStats, iPython, jupyter, ROOT and more. This page specifically features working with TTrees, numpy, RooDataHist and RooRealVar; I list these because the pages I found when googling these terms proved invaluable.
import ROOT
rootnotes is included to facilitate inline plotting. The following lines (including the rootnotes hint) were taken from Kyle Cranmer's website.
import rootnotes
c1=rootnotes.default_canvas()
w = ROOT.RooWorkspace()
w.factory('Gaussian::g(x[-5,5],mu[-3,3],sigma[1])')
w.factory('Exponential::e(x,tau[-.5,-3,0])')
w.factory('SUM::model(s[50,0,100]*g,b[100,0,1000]*e)')
w.Print()
x = w.var('x')
pdf = w.pdf('model')
frame = x.frame()
data = pdf.generate(ROOT.RooArgSet(x))
data.plotOn(frame)
fitResult = pdf.fitTo(data,ROOT.RooFit.Save(),ROOT.RooFit.PrintLevel(-1))
pdf.plotOn(frame)
frame.Draw()
c1
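The factory string for `model` describes a Gaussian and an exponential added together with the yields `s` and `b` as weights. As a rough sketch in plain Python (standard library only, no ROOT), and assuming each component is normalised over the range of `x` before the sum is taken, the shape can be written out by hand. The helper names `gauss_pdf`, `expo_pdf`, `model_pdf` and `integrate` are made up for illustration, and the parameter values are the initial ones from the factory strings (with `mu` assumed to start in the middle of its range).

```python
import math

X_MIN, X_MAX = -5.0, 5.0          # range of x in the factory string
MU, SIGMA, TAU = 0.0, 1.0, -0.5   # initial values (MU assumed to start mid-range)
S, B = 50.0, 100.0                # initial yields s and b


def gauss_pdf(x):
    """Gaussian normalised to unit area on [X_MIN, X_MAX]."""
    scale = SIGMA * math.sqrt(2.0)
    norm = SIGMA * math.sqrt(math.pi / 2.0) * (
        math.erf((X_MAX - MU) / scale) - math.erf((X_MIN - MU) / scale))
    return math.exp(-0.5 * ((x - MU) / SIGMA) ** 2) / norm


def expo_pdf(x):
    """exp(TAU * x) normalised to unit area on [X_MIN, X_MAX]."""
    norm = (math.exp(TAU * X_MAX) - math.exp(TAU * X_MIN)) / TAU
    return math.exp(TAU * x) / norm


def model_pdf(x):
    """Yield-weighted sum of the two normalised components."""
    return (S * gauss_pdf(x) + B * expo_pdf(x)) / (S + B)


def integrate(f, a, b, n=10000):
    """Midpoint-rule integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


total = integrate(model_pdf, X_MIN, X_MAX)
print(total)  # should be very close to 1
```

Because both components are normalised on the same range, the weighted sum integrates to one regardless of the values of `s` and `b`.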
A key concept is the correspondence between mathematical concepts and objects: each variable, function or pdf is represented by its own object.
x = ROOT.RooRealVar("x","x",-10,10)
mean = ROOT.RooRealVar("mean","Mean of Gaussian",-10,10)
sigma = ROOT.RooRealVar("sigma","Width of Gaussian",3,-10,10)
gauss = ROOT.RooGaussian("gauss","gauss(x,mean,sigma)",x,mean,sigma)
Models are built from variables and pdfs. Mathematical coherence is key.
In statistics, the angle from which you view your data is important. RooFit uses frames to do this. First a frame is created, then a model is plotted onto the frame. This frame (with the model) is then drawn on a ROOT TCanvas.
xframe = x.frame()
gauss.plotOn(xframe)
xframe.Draw()
c1
A frame stores a snapshot of each object at the moment it is plotted, so the same frame can hold several curves of one distribution drawn with different parameter values.
newframe = x.frame()
gauss.plotOn(newframe)
sigma.setVal(2)
gauss.plotOn(newframe,ROOT.RooFit.LineColor(2))
newframe.Draw()
c1
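The snapshot behaviour can be mimicked in plain Python. This is only an analogy, not how RooFit is implemented: the list `frame` and the helper `sample_gauss` are hypothetical stand-ins for a frame and for `plotOn`. Each call stores the sampled points, so changing `sigma` afterwards leaves the earlier curve untouched.

```python
import math


def sample_gauss(mean, sigma, x_min=-10.0, x_max=10.0, n=5):
    """Evaluate an unnormalised Gaussian at n evenly spaced points."""
    step = (x_max - x_min) / (n - 1)
    xs = [x_min + i * step for i in range(n)]
    return [(x, math.exp(-0.5 * ((x - mean) / sigma) ** 2)) for x in xs]


frame = []                               # plays the role of the frame
sigma = 3.0
frame.append(sample_gauss(0.0, sigma))   # first curve, sampled with sigma = 3
sigma = 2.0
frame.append(sample_gauss(0.0, sigma))   # second curve, sampled with sigma = 2

# The first curve still holds the points computed with sigma = 3.
for curve in frame:
    print(curve[1])
```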
Using notebooks makes it difficult to plot several frames, since the cells must be rendered in the correct order.
Interestingly, the value of a RooRealVar must be set using the setVal() function. In Python, assigning a float to the name (sigma = 2) would simply rebind the name, whereas C++ allows direct assignment.
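The difference between calling `setVal()` and assigning to the name can be seen with a tiny stand-in class. `Var` below is hypothetical and is not a ROOT class; it only mirrors the `setVal`/`getVal` naming so that the sketch runs without ROOT.

```python
class Var:
    """Hypothetical minimal variable holding one value (not a ROOT class)."""

    def __init__(self, value):
        self._value = value

    def setVal(self, value):
        self._value = value

    def getVal(self):
        return self._value


sigma = Var(3.0)
width = sigma            # a second reference, like the one a pdf would hold

sigma.setVal(2.0)        # changes the object that both names refer to
after_set = width.getVal()

sigma = 1.0              # rebinds the name only; the object is untouched
after_assign = width.getVal()

print(after_set, after_assign, type(sigma).__name__)
```

After the plain assignment, `sigma` is just a Python float, and anything that held a reference to the original object still sees the value set by `setVal()`.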
Data usually come either binned or unbinned.
h1 = ROOT.TH1D("h1","gaussian histogram",20,-10,10)
h1.FillRandom("gaus",10000)
h2 = ROOT.TH1D("h2","gaussian histogram",20,-10,10)
for i in range(10000): h2.Fill(ROOT.gRandom.Gaus(0,3))
x = ROOT.RooRealVar("x","x",-10,10)
l = ROOT.RooArgList(x)
data = ROOT.RooDataHist("data", "data set with x1", l, h1)
data2 = ROOT.RooDataHist("data2", "data set with x2", l, h2)
In PyROOT the RooDataHist constructor won't take the RooRealVar directly as an argument, so we wrap it in a RooArgList first.
xframe = x.frame()
data.plotOn(xframe)
data2.plotOn(xframe,ROOT.RooFit.MarkerColor(2))
xframe.Draw()
h1.Draw("same")
c1
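To make the binned versus unbinned distinction concrete, here is a small sketch in plain Python (no ROOT) that turns a list of individual events into bin counts with the same binning as the histograms above (20 bins on [-10, 10]). The seed and variable names are arbitrary choices for illustration.

```python
import random

random.seed(1)                      # arbitrary seed, for reproducibility only
N_BINS, X_MIN, X_MAX = 20, -10.0, 10.0
BIN_WIDTH = (X_MAX - X_MIN) / N_BINS

# Unbinned data: one number per event.
events = [random.gauss(0.0, 3.0) for _ in range(10000)]

# Binned data: one count per bin; events outside the range are dropped.
counts = [0] * N_BINS
for value in events:
    if X_MIN <= value < X_MAX:
        counts[int((value - X_MIN) / BIN_WIDTH)] += 1

print(len(events), sum(counts))
print(counts)
```

The unbinned list keeps every value, while the binned version keeps only 20 integers, which is why binned fits are cheaper but lose some information.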
import numpy as np
tree = ROOT.TTree("tree","tree")
x = np.zeros(1,dtype=float)
tree.Branch("x",x,'x/D')
for i in range(10000):
    x[0] = np.random.normal(0,3)
    tree.Fill()
x = ROOT.RooRealVar("x","x",-10,10)
data = ROOT.RooDataSet("data","dataset from tree",tree,ROOT.RooArgSet(x))
As can be seen, one benefit of using Python is that ROOT and numpy can be interfaced directly. And now to plot!
xframe = x.frame()
data.plotOn(xframe,ROOT.RooFit.Binning(25))
xframe.Draw()
c1
Generally RooFit can use binned and unbinned data interchangeably, since RooDataHist (binned) and RooDataSet (unbinned) both inherit from the same base class, RooAbsData.
Usually in data analysis we want to test how well the data conform to our hypothesis. In practice this means comparing the model with the data and testing how similar they look. To do this we must define a test statistic; the most common choices are $\chi^2$ and $-\log(\text{likelihood})$.
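As a minimal sketch of the two test statistics, the following plain-Python code (no ROOT) evaluates both for a handful of made-up events under two Gaussian hypotheses. The $\chi^2$ here is the Pearson form with the expected count in the denominator and the pdf evaluated at the bin centre; RooFit's own $\chi^2$ may weight the bins differently. All names and numbers are illustrative.

```python
import math


def gauss_pdf(x, mean, sigma):
    """Gaussian density normalised on the whole real line."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def nll(events, mean, sigma):
    """Unbinned statistic: -log(L) = -sum_i log f(x_i)."""
    return -sum(math.log(gauss_pdf(x, mean, sigma)) for x in events)


def chi2(counts, edges, mean, sigma):
    """Binned statistic: sum over bins of (observed - expected)^2 / expected."""
    n_total = sum(counts)
    total = 0.0
    for n_obs, lo, hi in zip(counts, edges[:-1], edges[1:]):
        centre = 0.5 * (lo + hi)
        expected = n_total * gauss_pdf(centre, mean, sigma) * (hi - lo)
        total += (n_obs - expected) ** 2 / expected
    return total


events = [-2.1, -0.4, 0.3, 0.9, 1.7, -1.2, 0.1, 2.4]   # made-up unbinned data
edges = [-3.0, -1.0, 1.0, 3.0]                         # three bins
counts = [2, 4, 2]                                     # the same events, binned

nll_good, nll_bad = nll(events, 0.0, 1.5), nll(events, 3.0, 1.5)
chi2_good, chi2_bad = chi2(counts, edges, 0.0, 1.5), chi2(counts, edges, 3.0, 1.5)
print(nll_good, nll_bad)
print(chi2_good, chi2_bad)
```

Both statistics are smaller for the hypothesis centred at 0, which is the one that resembles the data.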
The default fit in RooFit is a maximum likelihood fit, binned or unbinned to match the data. Since RooFit is part of ROOT, the minimisation is done by MINUIT through its TMinuit implementation.
The easiest fit is performed by the fitTo() method of class RooAbsPdf, which builds a -log(L) function from the gauss function and the given dataset; MINUIT then minimizes it and estimates the errors.
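The idea of building -log(L) and minimising it can be sketched without ROOT. The code below is not what MINUIT does: it assumes a Gaussian on the whole real line, uses a crude grid scan in place of a real minimiser, and compares the result with the closed-form maximum-likelihood estimates that exist for a Gaussian. The seed, grid and names are arbitrary.

```python
import math
import random

random.seed(42)                     # arbitrary seed, for reproducibility only
events = [random.gauss(0.0, 3.0) for _ in range(1000)]


def nll(mean, sigma):
    """-log(L) of the events for a Gaussian with the given mean and sigma."""
    n = len(events)
    return (n * math.log(sigma * math.sqrt(2.0 * math.pi))
            + sum((x - mean) ** 2 for x in events) / (2.0 * sigma ** 2))


# Crude grid scan standing in for the minimiser:
# mean in [-1, 1] and sigma in [2, 4], both in steps of 0.05.
grid = [(-1.0 + 0.05 * i, 2.0 + 0.05 * j) for i in range(41) for j in range(41)]
best_mean, best_sigma = min(grid, key=lambda p: nll(*p))

# Closed-form maximum-likelihood estimates for a Gaussian.
ml_mean = sum(events) / len(events)
ml_sigma = math.sqrt(sum((x - ml_mean) ** 2 for x in events) / len(events))

print(best_mean, best_sigma)
print(ml_mean, ml_sigma)
```

The grid minimum lands within one grid step of the closed-form estimates; a real minimiser reaches the same point far more efficiently and also returns parameter errors.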
x = ROOT.RooRealVar("x","x",-10,10)
mean = ROOT.RooRealVar("mean","Mean of Gaussian",0,-10,10)
sigma = ROOT.RooRealVar("sigma","Width of Gaussian",3,-10,10)
gauss = ROOT.RooGaussian("gauss","gauss(x,mean,sigma)",x,mean,sigma)
data = gauss.generate(ROOT.RooArgSet(x),10000)
xframe = x.frame()
data.plotOn(xframe, ROOT.RooLinkedList())
gauss.plotOn(xframe)
xframe.Draw()
c1
In this example the generate() method is used to generate a data set of 10000 events from the pdf provided.
result = gauss.fitTo(data,ROOT.RooFit.Save(),ROOT.RooFit.PrintLevel(-1))
mean.Print()
sigma.Print()
Parameters can be set constant (fixed in the fit) or left floating:
mean.setConstant(ROOT.kTRUE)
sigma.setConstant(ROOT.kFALSE)
Their ranges can also be modified.
sigma.setRange(0.1,3)
result = gauss.fitTo(data,ROOT.RooFit.Save(),ROOT.RooFit.Minos(ROOT.kTRUE),ROOT.RooFit.PrintLevel(-1))
gauss.fitTo(data,ROOT.RooFit.Range(-5,5),ROOT.RooFit.PrintLevel(-1))
xframe = x.frame()
data.plotOn(xframe,ROOT.RooLinkedList())
gauss.plotOn(xframe)
xframe.Draw()
c1
intGaussX = gauss.createIntegral(ROOT.RooArgSet(x))
intGaussX.getVal()
Since intGaussX is a RooAbsReal, its value is obtained with getVal().
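For a sense of what number to expect, here is a plain-Python cross-check. It assumes that the integral is taken over the unnormalised shape exp(-(x - mean)^2 / (2 sigma^2)) on the range of x, which is my understanding of what createIntegral does when no normalisation set is given, and it uses mean = 0 and sigma = 3 as illustrative values rather than the fitted ones. The function names are made up.

```python
import math


def raw_gauss(x, mean, sigma):
    """Unnormalised Gaussian shape exp(-(x - mean)^2 / (2 sigma^2))."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


def raw_gauss_integral(x_min, x_max, mean, sigma):
    """Closed-form integral of raw_gauss over [x_min, x_max]."""
    scale = sigma * math.sqrt(2.0)
    return sigma * math.sqrt(math.pi / 2.0) * (
        math.erf((x_max - mean) / scale) - math.erf((x_min - mean) / scale))


analytic = raw_gauss_integral(-10.0, 10.0, 0.0, 3.0)

# Cross-check with a midpoint-rule sum over the same range.
n = 20000
h = 20.0 / n
numeric = h * sum(raw_gauss(-10.0 + (i + 0.5) * h, 0.0, 3.0) for i in range(n))

print(analytic, numeric)
```

With these values the result is slightly below sigma * sqrt(2 pi), because the tails beyond the range are cut off.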
cdf = gauss.createCdf(ROOT.RooArgSet(x))
xframe = x.frame()
cdf.plotOn(xframe)
xframe.Draw()
c1
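The plotted curve can be compared with a hand-written cumulative distribution. The sketch below (plain Python, no ROOT) assumes the pdf is normalised on the range [-10, 10], so the curve runs from 0 at the lower edge to 1 at the upper edge. The function name and the values mean = 0, sigma = 3 are illustrative.

```python
import math


def gauss_cdf_on_range(x, mean, sigma, x_min=-10.0, x_max=10.0):
    """Fraction of the Gaussian's area on [x_min, x_max] that lies below x."""
    def phi(t):
        # Cumulative distribution of a Gaussian on the whole real line.
        return 0.5 * (1.0 + math.erf((t - mean) / (sigma * math.sqrt(2.0))))
    return (phi(x) - phi(x_min)) / (phi(x_max) - phi(x_min))


low = gauss_cdf_on_range(-10.0, 0.0, 3.0)
mid = gauss_cdf_on_range(0.0, 0.0, 3.0)
high = gauss_cdf_on_range(10.0, 0.0, 3.0)
print(low, mid, high)
```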