
...

import pickle

# Save to file in the current working directory
pkl_filename = "rf_model.pkl"
with open(pkl_filename, 'wb') as file:
    pickle.dump(rfc, file)
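
The saved file can later be read back with `pickle.load`. The following is a minimal sketch of that round trip, not the tutorial's own code: since the trained `rfc` object is not available here, a placeholder dictionary named `stand_in_model` (a hypothetical stand-in) is pickled in its place, and a temporary directory is used instead of the current working directory.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the trained model (the tutorial pickles `rfc`).
stand_in_model = {"kind": "placeholder", "n_estimators": 100}

with tempfile.TemporaryDirectory() as tmp_dir:
    pkl_filename = os.path.join(tmp_dir, "rf_model.pkl")

    # Save: same pattern as above, binary write mode.
    with open(pkl_filename, "wb") as file:
        pickle.dump(stand_in_model, file)

    # Load: open in binary read mode and rebuild the object.
    with open(pkl_filename, "rb") as file:
        loaded_model = pickle.load(file)

print(loaded_model == stand_in_model)
```

With a real trained classifier, the loaded object should behave like the original one, for example when calling its `predict` method. Only unpickle files that come from a source you trust.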

Performance evaluation

In this section you will find the following subsections:

ROC curve and Rates definitions
Overfitting and test evaluation of an MVA model
If you are already familiar with these theoretical concepts, you may skip them.
Artificial Neural Network performance
Exercise 1 - Random Forest performance
Here you will re-do the procedure followed for the ANN in order to evaluate the Random Forest performance.
Finally, you will compare the discriminating performance of the two trained ML models.

...

The recall/sensitivity/TPR/signal efficiency is the ratio $\frac{TP}{TP + FN}$ where TP is the number of true positives and FN the number of false negatives. The recall is intuitively the ability of the classifier to find all the positive samples.

The accuracy is defined as the fraction of predictions that match the true labels.
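
As a worked illustration of the recall and accuracy definitions, the sketch below counts the confusion-matrix entries by hand on a small set of labels. The label lists `y_true` and `y_pred` are invented for this example and do not come from the tutorial's dataset. scikit-learn's `recall_score` and `accuracy_score` compute the same quantities.

```python
# Hypothetical true labels and predictions (1 = signal, 0 = background).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

pairs = list(zip(y_true, y_pred))

# Confusion-matrix entries counted directly from the label pairs.
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

recall = tp / (tp + fn)                 # TP / (TP + FN)
accuracy = (tp + tn) / len(pairs)       # fraction of correct predictions

print(f"TP={tp} FN={fn} FP={fp} TN={tn}")
print(f"recall = {recall:.2f}, accuracy = {accuracy:.2f}")
```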

...

# Let's import all the metrics that we need later on!
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix, accuracy_score, precision_score, recall_score, precision_recall_curve, roc_curve, auc, roc_auc_score

Overfitting and test evaluation of an MVA model

The loss function and the accuracy metrics give us a measure of the overtraining (overfitting) of the ML algorithm. Overfitting happens when an ML algorithm learns to recognize a pattern that is primarily based on the training (validation) sample and that is nonexistent when looking at the testing (training) set (see the plot on the right side to understand what we would expect when overfitting happens).
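
The gap between training and test accuracy can be made concrete with a toy sketch. This is not the ANN or the Random Forest used in this tutorial: it swaps in a hand-written 1-nearest-neighbour classifier on invented one-dimensional data with noisy labels, chosen because such a model memorizes its training sample. All names (`make_sample`, `predict_1nn`, the 20% label noise) are assumptions made for illustration.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def make_sample(n, rng, noise=0.2):
    """Toy data: label is 1 when x > 0.5, then flipped with probability `noise`."""
    xs = [rng.random() for _ in range(n)]
    ys = [int(x > 0.5) ^ int(rng.random() < noise) for x in xs]
    return xs, ys


def predict_1nn(x_train, y_train, xs):
    """1-nearest-neighbour: copy the label of the closest training point."""
    preds = []
    for x in xs:
        nearest = min(range(len(x_train)), key=lambda i: abs(x_train[i] - x))
        preds.append(y_train[nearest])
    return preds


rng = random.Random(0)
x_train, y_train = make_sample(300, rng)
x_test, y_test = make_sample(300, rng)

# The model reproduces the noisy training labels exactly (it memorizes them),
# but those memorized fluctuations do not carry over to the independent test sample.
train_acc = accuracy(y_train, predict_1nn(x_train, y_train, x_train))
test_acc = accuracy(y_test, predict_1nn(x_train, y_train, x_test))

print(f"train accuracy = {train_acc:.2f}, test accuracy = {test_acc:.2f}")
print(f"gap = {train_acc - test_acc:.2f}")
```

A large positive gap between the two accuracies is the signature of overfitting described above; for a model that generalizes well, the two values are expected to be close.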

...
