...

Training (learning): a discriminator is built using all the input variables. Its parameters are then modified iteratively by comparing the discriminator output with the true label of each event in the dataset. Algorithms that learn from labelled data in this way are called supervised machine learning algorithms, and we will use two of them. This phase is crucial: both the choice of input variables and the parameters of the algorithm should be tuned!
- As an alternative, algorithms that group the data and find patterns in them according to the observed distribution of the input variables, without using labels, are called unsupervised learning algorithms.
- A good habit is to train multiple models with various hyperparameters on a “reduced” training set (i.e. the full training set minus the so-called validation set), and then select the model that performs best on the validation set.
- Once the validation process is over, you can re-train the best model on the full training set (including the validation set); this gives you the final model.
Test: once the training has been performed, the discriminator score is computed on a separate, independent dataset for both $H_{0}$ and $H_{1}$.
The performance of the classifier on the test sample is then compared with its performance on the training sample (in terms of ROC curves).
- If the performance on the test sample is clearly worse than on the training sample, this could be a symptom of overtraining, and the model should not be considered reliable!
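The training, validation and test steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the tutorial's own code: the toy Gaussian dataset, the logistic-regression model, the scanned `l2` hyperparameter and all function names are hypothetical stand-ins chosen to keep the example self-contained with NumPy only.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_toy(n):
    """Toy two-class sample: two overlapping 2D Gaussians (not the tutorial data)."""
    y = rng.integers(0, 2, size=n)
    x = rng.normal(loc=y[:, None] * 1.5, scale=1.0, size=(n, 2))
    return x, y

def fit_logistic(x, y, l2, n_steps=500, lr=0.1):
    """Logistic regression by gradient descent; l2 is the hyperparameter we scan."""
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(n_steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
        w -= lr * (x.T @ (p - y) / len(y) + l2 * w)
        b -= lr * np.mean(p - y)
    return w, b

def score(model, x):
    """Discriminator score in [0, 1] for each event."""
    w, b = model
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

def auc(y_true, y_score):
    """Area under the ROC curve from the rank-sum formula (ties in the score are not handled)."""
    order = np.argsort(y_score)
    ranks = np.empty(len(y_score))
    ranks[order] = np.arange(1, len(y_score) + 1)
    n_pos = np.sum(y_true == 1)
    n_neg = len(y_true) - n_pos
    return (ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

x_train, y_train = make_toy(2000)
x_test, y_test = make_toy(1000)   # separate, independent sample

# "Reduced" training set = full training set minus the validation set.
x_red, y_red = x_train[:1500], y_train[:1500]
x_val, y_val = x_train[1500:], y_train[1500:]

# Train one model per hyperparameter value and keep the best on the validation set.
candidates = [0.0, 0.01, 0.1, 1.0]
val_auc = {l2: auc(y_val, score(fit_logistic(x_red, y_red, l2), x_val)) for l2 in candidates}
best_l2 = max(val_auc, key=val_auc.get)

# Re-train the selected configuration on the full training set: this is the final model.
final_model = fit_logistic(x_train, y_train, best_l2)

# Compare training and test performance; a large gap would hint at overtraining.
auc_train = auc(y_train, score(final_model, x_train))
auc_test = auc(y_test, score(final_model, x_test))
print(f"best l2 = {best_l2}, AUC train = {auc_train:.3f}, AUC test = {auc_test:.3f}")
```

With a model this simple the two AUC values are expected to be close; a much more flexible model trained on few events is where a visible train/test gap would typically show up.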

...

Such a structure is also called Feedforward Multilayer Perceptron (MLP, see the picture).

The output of the $k$-th node of the $n$-th layer is computed as the weighted sum of the input variables, with weights that are optimized during training.

...

Then a bias or threshold parameter $w_{0}$ is applied. This bias accounts for the random noise, in the sense that it measures how well the model fits the training set (i.e. how well the model is able to predict the known outputs of the training examples). The output of a given node is: $y^{(n)}_{k}(\vec{x})=\phi\left(w^{(n)}_{0}+\sum_{j=1}^{p^{(n)}}w^{(n)}_{kj}x_{j}\right)$.
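As a small numerical illustration of the node output, the sketch below evaluates $\phi$ applied to the bias plus the weighted sum of the inputs. It assumes that the bias is added to the weighted sum and that $\phi$ is a sigmoid; the function names, the weights and the use of one bias value per node are hypothetical choices made only for this example.

```python
import numpy as np

def sigmoid(z):
    """One possible choice for the activation function phi."""
    return 1.0 / (1.0 + np.exp(-z))

def node_output(x, w, w0, phi=sigmoid):
    """Output of a single node: phi(w0 + sum_j w_j * x_j)."""
    return phi(w0 + np.dot(w, x))

def layer_output(x, W, w0, phi=sigmoid):
    """All nodes of one layer at once: row k of W holds the weights w_kj of node k.

    w0 may be a scalar (one bias for the layer) or an array with one bias per node.
    """
    return phi(w0 + W @ x)

x = np.array([1.0, 2.0, 3.0])            # input variables
W = np.array([[0.5, -0.25, 0.0],         # weights of node 0
              [0.1, 0.1, 0.1]])          # weights of node 1
w0 = np.array([0.0, -0.6])               # biases (illustrative values)

y = layer_output(x, W, w0)
# Both pre-activations are zero for these values, so both outputs are 0.5 up to rounding.
print(y)
```

Stacking several such layers, each taking the previous layer's output as its input, gives the feedforward structure described above.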

...

During training we optimize the loss function, i.e. we reduce the error between actual and predicted values. Since we deal with a binary classification problem, $y_{true}$ can take on just two values: $y_{true}=0$ (for hypothesis $H_{1}$) and $y_{true}=1$ (for hypothesis $H_{0}$).
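A common loss function for binary labels is the binary cross-entropy. The text above does not say which loss is used, so taking it here is an assumption, and the function name and the `eps` clipping parameter are illustrative. The sketch shows that confident, correct predictions give a small loss while confident, wrong predictions give a large one.

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean binary cross-entropy between labels in {0, 1} and predicted scores in (0, 1)."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip the scores to avoid log(0) for predictions that are exactly 0 or 1.
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(y_pred)
                          + (1.0 - y_true) * np.log(1.0 - y_pred)))

good = binary_cross_entropy([0, 1], [0.1, 0.9])   # predictions close to the labels
bad = binary_cross_entropy([0, 1], [0.9, 0.1])    # predictions far from the labels
print(good, bad)
```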

A popular algorithm to optimize the weights consists of iteratively modifying them after each training observation, or after a batch of training observations, so as to minimize the loss function.
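The iterative update described above can be sketched as mini-batch gradient descent on a single sigmoid node. Everything specific here is an assumption made for illustration: the toy data, the binary cross-entropy loss, the learning rate `lr`, the `batch_size` and the number of epochs are not taken from the tutorial.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy labelled sample: two overlapping 2D Gaussians (not the tutorial data).
n = 1000
y_true = rng.integers(0, 2, size=n).astype(float)
x = rng.normal(loc=y_true[:, None] * 2.0, scale=1.0, size=(n, 2))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, w0):
    """Binary cross-entropy of the single-node model on the full sample."""
    p = np.clip(sigmoid(w0 + x @ w), 1e-12, 1.0 - 1e-12)
    return float(-np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p)))

w = np.zeros(2)     # weights
w0 = 0.0            # bias
lr = 0.1            # learning rate (step size)
batch_size = 32     # batch_size = 1 would update after every single observation

loss_before = loss(w, w0)
for epoch in range(20):
    perm = rng.permutation(n)                    # reshuffle the events each epoch
    for start in range(0, n, batch_size):
        idx = perm[start:start + batch_size]
        p = sigmoid(w0 + x[idx] @ w)
        # Gradient of the cross-entropy with respect to the weights and the bias.
        grad_w = x[idx].T @ (p - y_true[idx]) / len(idx)
        grad_w0 = np.mean(p - y_true[idx])
        w -= lr * grad_w                         # step against the gradient
        w0 -= lr * grad_w0
loss_after = loss(w, w0)
print(f"loss before = {loss_before:.3f}, loss after = {loss_after:.3f}")
```

Smaller batches give noisier but more frequent updates; larger batches give smoother gradient estimates at a higher cost per step.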

...

Question to students: What happens if you switch to the $4e$ decay channel? You can submit your model (see the ML challenge below) for this physical process as well!

...

You can participate individually or as a team.
The winner is the one scoring the best AUC in the challenge samples!
In the next box, you will find some lines of code for preparing an output csv file containing your y_predic for this new dataset!
Choose a meaningful name for your result csv file (e.g. your name or your team name, the model used for the training phase, and the decay channel, $4\mu$ or $4e$), and avoid submitting a generic name such as results.csv.
Download the csv file and upload it here: https://recascloud.ba.infn.it/index.php/s/CnoZuNrlr3x7uPI
You can submit multiple results, taking care to name them accordingly (add the version number, such as v1, v34, etc.).
You can use this exercise as a starting point (train over constituents).
We will consider your best result for the final score.
The winner will be asked to present the ML architecture!
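As a rough sketch of how a submission file could be named and written, the snippet below builds a file name from the team, model, decay channel and version, and stores one score per event. The column names, the helper functions and the placeholder scores are all hypothetical; the format actually expected by the organizers is the one given by the code provided with the challenge.

```python
import csv
import numpy as np

def make_filename(team, model, channel, version):
    """Build a descriptive file name, e.g. team, model, decay channel and version."""
    return f"{team}_{model}_{channel}_v{version}.csv"

def write_predictions(path, y_pred):
    """Write one predicted score per row, with a running event index."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["event", "y_pred"])   # hypothetical column names
        for i, value in enumerate(y_pred):
            writer.writerow([i, f"{value:.6f}"])

# Placeholder scores standing in for the model output on the challenge sample.
y_pred = np.array([0.12, 0.87, 0.45])

filename = make_filename("teamA", "DNN", "4mu", 1)
write_predictions(filename, y_pred)
print(filename)
```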

...
