...

Here, it is provided to share teaching resources among ML practitioners within INFN.

The Standard Model and the Higgs Boson

The Standard Model (SM) is a set of theoretical models that encode the consolidated knowledge on elementary particles and their interactions. According to the SM, elementary particles are quanta of the excitation of fields (naively, functions of space and time), and the properties of the fields are reflected in the properties of the particles.

...

The quantum field theory that describes so successfully the interactions between particles cannot accommodate massive quanta of the interaction fields (bosons, for short), while experimentally the W and Z bosons are found to be massive. The problem was solved by Brout, Englert, and Higgs, who introduced an additional field with which the W and Z bosons interact, gaining rest energy (another name for mass) through a symmetry-breaking mechanism, named the BEH mechanism after the three theorists. If such a field exists, then it is natural to describe the mass of all particles, including matter particles, as due to interactions with the BEH field rather than through additional free parameters in the model. The interaction of the BEH field with fermions is described by the Yukawa couplings. The new field must be associated with an excitation quantum, named the Higgs boson. Discovering such a predicted particle was long considered the smoking-gun evidence of the correctness of the SM and motivated a huge effort in the development and construction of the Large Hadron Collider at CERN, culminating in 2012 with the discovery of the Higgs boson in the data of the two major experiments, ATLAS and CMS.

The Large Hadron Collider and the CMS experiment at CERN

The Large Hadron Collider (LHC) is the largest particle accelerator currently active in the world. Located at an average depth of 100 meters underground at the CERN facilities in Geneva, it spans a 27 km circumference crossing the Swiss-French border. It is capable of accelerating beams of protons or heavy ions in two opposite directions, producing proton-proton, ion-ion, and proton-ion collisions. The LHC accelerates the proton beams up to a maximum energy of 7 TeV.

...

  • the radius, intended as the distance from the LHC beam axis
  • the pseudorapidity, a monotonic transformation of the polar angle theta [see Wikipedia]; a short sketch of the conversion is given after this list
  • the azimuthal angle phi
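
As a purely illustrative sketch (not code from the notebooks), the snippet below converts a Cartesian vector, with the beam along the z axis, into these three coordinates; the function name is our own choice.

```python
import numpy as np

def to_detector_coordinates(x, y, z):
    """Convert a Cartesian vector (beam along z) into (r, eta, phi).

    r   : distance from the LHC beam axis
    eta : pseudorapidity, eta = -ln(tan(theta/2)), with theta the polar angle
    phi : azimuthal angle around the beam axis
    """
    r = np.hypot(x, y)                    # radial distance from the beam axis
    theta = np.arctan2(r, z)              # polar angle measured from the beam axis
    eta = -np.log(np.tan(theta / 2.0))    # monotonic in theta: eta grows as theta -> 0
    phi = np.arctan2(y, x)                # azimuthal angle
    return r, eta, phi

print(to_detector_coordinates(1.0, 1.0, 5.0))
```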


Production of the Higgs boson

When two protons accelerated by the LHC collide, their constituent quarks and gluons have a chance to interact. Because of the internal quantum structure of the proton, its constituents "carry" different fractions of the total energy of the proton. Indeed, in contrast with what happens classically, the number of constituents within the proton is not constant: the three "valence quarks" (two up quarks and a down quark) interact via the strong interaction, emitting gluons and creating quark-antiquark pairs from the vacuum (named sea quarks), and all of them travel at nearly the speed of light within the accelerated protons. It is impossible to predict, at the time of the collision, the exact share of energy of the interacting quarks or gluons, but one can define a probability distribution of finding a given constituent of the proton with a given fraction of its total energy.
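
As a purely illustrative toy (the real parton distribution functions are measured and far more complex), the snippet below samples momentum fractions from an arbitrary, made-up falling density, just to show what "a probability of carrying a given fraction of the energy" means in practice.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Toy, invented "parton density" f(x) ~ (1 - x)^5 / x on x in [1e-3, 1]:
# NOT a real parton distribution function, only an illustration of the idea
# that small momentum fractions are far more probable than large ones.
def toy_density(x):
    return (1.0 - x) ** 5 / x

def sample_momentum_fractions(n, f_max=1000.0):
    """Accept/reject sampling of momentum fractions from the toy density."""
    out = np.empty(0)
    while out.size < n:
        x = rng.uniform(1e-3, 1.0, size=4 * n)
        u = rng.uniform(0.0, f_max, size=4 * n)
        out = np.concatenate([out, x[u < toy_density(x)]])
    return out[:n]

x = sample_momentum_fractions(10_000)
print("mean toy momentum fraction:", x.mean())
```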

...

This property of collisions producing heavy particles is probably the most discriminating feature for identifying the collisions of interest: for example, most of the trigger algorithms designed to run in real time, discarding the vast majority of collision events where it is unlikely to identify Higgs bosons, require at least one observed particle with a large transverse momentum (the component of the momentum orthogonal to the beam axis).
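
As a minimal sketch of this kind of requirement (not the actual CMS trigger code; the threshold value and array names are placeholders), one keeps an event only if at least one reconstructed particle has transverse momentum above a threshold:

```python
import numpy as np

def passes_single_object_trigger(px, py, pt_threshold=25.0):
    """Toy single-object trigger: accept the event if at least one particle has
    transverse momentum (component orthogonal to the beam axis) above
    pt_threshold (GeV). px, py are arrays over the particles in the event."""
    pt = np.hypot(px, py)            # transverse momentum of each particle
    return np.any(pt > pt_threshold)

# Example event with three particles (momenta in GeV)
px = np.array([3.0, -12.0, 30.0])
py = np.array([1.5, 8.0, -10.0])
print(passes_single_object_trigger(px, py))   # True: the third particle has pt ~ 31.6 GeV
```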

Decays of the Higgs boson

Given that the Higgs boson couples to every massive particle in the SM, its phenomenology is particularly rich. The higher the mass of a particle, the stronger its coupling to the Higgs boson. This reasoning may help to understand, naively, why the decay of the Higgs boson to two massive bosons (ZZ or WW) is expected to be one of the most abundant decay modes of the Higgs bosons produced in proton-proton collisions at the LHC.

...

Conversely, if the fermion-antifermion pair produced in the W decay is a lepton pair, then one of the two must be a neutrino, which cannot be detected by CMS and escapes, carrying away a substantial fraction of the overall energy of the collision event and possibly breaking the cylindrical symmetry of the system. The other lepton is instead detected very effectively by the CMS detector, because its energy is so high that it is difficult to confuse it with a lepton produced in other processes.
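
A common way to quantify this imbalance is the missing transverse momentum, i.e. the negative vector sum of the transverse momenta of all detected particles. The sketch below (array names are hypothetical, numbers invented) shows the computation:

```python
import numpy as np

def missing_transverse_momentum(px_visible, py_visible):
    """Negative vector sum of the transverse momenta of all detected particles.
    A large value hints at an undetected particle (e.g. a neutrino) recoiling
    against the visible ones and breaking the transverse-plane balance."""
    met_x = -np.sum(px_visible)
    met_y = -np.sum(py_visible)
    return np.hypot(met_x, met_y), np.arctan2(met_y, met_x)  # magnitude and direction

# Example: visible particles of one event (GeV)
px = np.array([40.0, -15.0, -5.0])
py = np.array([-10.0, 25.0, 3.0])
met, met_phi = missing_transverse_momentum(px, py)
print(f"missing transverse momentum: {met:.1f} GeV at phi = {met_phi:.2f}")
```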

The dataset

This set of exercises is based on a simulated dataset of Higgs decays to two W bosons, both decaying to a charged lepton and a neutrino. In formula,

...

This is the starting point for most statistical analyses in High Energy Physics: one has to develop an algorithm to classify signal and background events using simulation. That algorithm is then run on real data, and from the resulting selected events the expected contribution of background events has to be statistically subtracted, in order to count the number of signal events and use them to infer physical properties of the decay.
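
Schematically, the counting step amounts to the subtraction below (the numbers are invented; a real analysis would also propagate statistical and systematic uncertainties):

```python
# Schematic signal extraction by background subtraction (illustrative numbers only)
n_selected_data = 1450          # events in real data passing the selection
n_expected_background = 1300.0  # background yield predicted from simulation

n_signal_estimate = n_selected_data - n_expected_background
print(f"estimated number of signal events: {n_signal_estimate:.0f}")
```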

Getting started (https://colab.research.google.com/drive/1C0-zM3tRRGhrL7XbqolmlUTtrwKaunhQ?usp=sharing)

In this first notebook we discuss the dataset, learn how to plot histograms, and use them to define a 1D selection strategy by comparing the discriminating power of different variables.

...

Software-side, we use only numpy and matplotlib, implementing explicitly some operations that are usually delegated to dedicated libraries (e.g. drawing the ROC curve).
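
In that spirit, a 1D selection and its ROC curve can be built with numpy and matplotlib alone; the sketch below uses random stand-in samples for one discriminating variable, not the actual dataset of the notebook:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
# Stand-in samples for one discriminating variable (the notebook uses the simulated dataset)
signal = rng.normal(1.0, 1.0, 10_000)
background = rng.normal(-1.0, 1.0, 10_000)

# Compare the two distributions with histograms
bins = np.linspace(-5, 5, 51)
plt.hist(signal, bins=bins, histtype="step", density=True, label="signal")
plt.hist(background, bins=bins, histtype="step", density=True, label="background")
plt.xlabel("discriminating variable")
plt.legend()

# ROC curve drawn explicitly: scan thresholds of the cut x > threshold
thresholds = np.linspace(-5, 5, 200)
signal_eff = [(signal > t).mean() for t in thresholds]        # true positive rate
background_eff = [(background > t).mean() for t in thresholds]  # false positive rate

plt.figure()
plt.plot(background_eff, signal_eff)
plt.xlabel("background efficiency")
plt.ylabel("signal efficiency")
plt.show()
```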

Linear classification in two dimensions (https://colab.research.google.com/drive/1LAaNkpILnPiQL2fhlCYuTeTTVyy1zrP7?usp=sharing)

In this notebook we focus on multivariate selection based on linear discriminants. We discuss the Fisher discriminant and logistic regression.

...

Software-side, we introduce scikit-learn, after having discussed the implementation of the various algorithms in pure numpy.
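
As a minimal sketch of the two approaches, with toy two-dimensional samples standing in for the real features, the Fisher direction can be obtained with numpy and the logistic regression with scikit-learn:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
# Toy 2D samples standing in for two discriminating variables
cov = [[1.0, 0.3], [0.3, 1.0]]
signal = rng.multivariate_normal([1.0, 1.0], cov, size=5000)
background = rng.multivariate_normal([-1.0, -0.5], cov, size=5000)

# Fisher discriminant with pure numpy: w = S_W^{-1} (mu_s - mu_b) up to normalisation,
# where S_W is the within-class covariance summed over the two classes
mu_s, mu_b = signal.mean(axis=0), background.mean(axis=0)
S_w = np.cov(signal, rowvar=False) + np.cov(background, rowvar=False)
w_fisher = np.linalg.solve(S_w, mu_s - mu_b)
print("Fisher score separation:", signal @ w_fisher - (background @ w_fisher).mean())

# Logistic regression with scikit-learn on the same data
X = np.concatenate([signal, background])
y = np.concatenate([np.ones(len(signal)), np.zeros(len(background))])
clf = LogisticRegression().fit(X, y)
print("logistic regression coefficients:", clf.coef_[0])
```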

Decision Trees and Forests (https://colab.research.google.com/drive/18NHWmORaYDtpQq74mKJwTb97UZ1R2w0w?usp=sharing)

We move on to discuss decision trees, ensembles of random decision trees to reduce the variance of the method, and then consider two widely adopted boosting algorithms: AdaBoost and Gradient Boosting.

...

Software-side, we play with scikit-learn. 
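
A minimal usage sketch of the four scikit-learn estimators touched in the notebook, trained here on a placeholder dataset rather than on the simulated samples:

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Placeholder dataset standing in for the simulated signal/background samples
X, y = make_classification(n_samples=5000, n_features=8, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "decision tree": DecisionTreeClassifier(max_depth=4),
    "random forest": RandomForestClassifier(n_estimators=100),
    "AdaBoost": AdaBoostClassifier(n_estimators=100),
    "gradient boosting": GradientBoostingClassifier(n_estimators=100),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: test accuracy = {model.score(X_test, y_test):.3f}")
```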

Neural Networks (https://colab.research.google.com/drive/1pKKELV34Hiori7_19Ukbbm6Olw_FxxHz?usp=sharing)

Finally, we extend the logistic regression discussed in the linear-classification notebook by modifying the representation through a neural network, implemented first in plain tensorflow 2 and then in keras.

...

Software-side, this notebook is significantly more demanding than the previous ones. 
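
As an illustration of the conceptual step (not the notebook's code), the Keras sketch below adds one hidden layer in front of the same sigmoid output used by logistic regression; placeholder random arrays stand in for the real features.

```python
import numpy as np
import tensorflow as tf

# Placeholder data standing in for the simulated dataset (8 input features)
rng = np.random.default_rng(3)
X = rng.normal(size=(5000, 8)).astype("float32")
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype("float32")   # toy target

# Logistic regression is a single sigmoid unit; the hidden layer changes
# the representation of the inputs fed to that same sigmoid output.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(32, activation="relu"),      # learned representation
    tf.keras.layers.Dense(1, activation="sigmoid"),    # logistic output
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=128, validation_split=0.2, verbose=0)
print("training accuracy:", model.evaluate(X, y, verbose=0)[1])
```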

Conclusions

We developed four Colab notebooks to introduce students to the problem of classification, taking as an example the signal-background separation in High Energy Physics, and in particular in CMS, for a hot topic such as the study of the Higgs boson.

...