Author(s)

Name	Institution	Mail Address	Social Contacts
Luca Giommi	INFN-CNAF	luca.giommi@cnaf.infn.it	N/A
Mattia Paladino	University of Bologna	mattia.paladino2@unibo.it	N/A

How to Obtain Support

Mail	luca.giommi@cnaf.infn.it
Social	N/A
Jira	N/A

General Information

ML/DL Technologies	classification algorithms
Science Fields	High Energy Physics
Difficulty	low
Language	English
Type	runnable, fully annotated

Software and Tools

Programming Language	Python
ML Toolset	Keras, TensorFlow, scikit-learn, PyTorch, XGBoost
Additional libraries	uproot, matplotlib
Suggested Environments	Google Colab, Docker, own PC, INFN-Cloud VM

Needed datasets

Data Creator	ATLAS experiment
Data Type	simulation
Data Size	57 MB compressed
Data Source	Kaggle, CERN opendata

Short Description of the Use Case

In this exercise, we use the MLaaS4HEP framework to tackle the Higgs boson ML challenge, a competition held in 2014, organized by a group of ATLAS physicists and data scientists, and hosted on the Kaggle platform.

...

files.txt stores the paths of the input ROOT files;
labels.txt stores the labels of the input ROOT files in case of classification problems;
model.py stores the definition of the custom ML model to use in the training phase, in the user’s favorite ML framework;
params.json stores the parameters on which MLaaS4HEP is based, e.g. number of events to use, chunk size, batch size, and redirector path for files located in remote storage;
preproc.json stores the definition of preprocessing operations to be applied to data.
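
As a rough illustration of how the plain-text and JSON inputs listed above could be prepared, the following sketch writes files.txt, labels.txt, and params.json into a working directory. The ROOT file names, the one-entry-per-line layout, and the parameter keys (nevts, chunk_size, batch_size, redirector) are assumptions chosen to mirror the description above; check the MLaaS4HEP documentation for the exact names the framework expects.

```python
import json
import tempfile
from pathlib import Path


def write_inputs(workdir, root_files, labels, params):
    """Write files.txt, labels.txt and params.json into workdir."""
    if len(root_files) != len(labels):
        raise ValueError("need exactly one label per input ROOT file")
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    # Assumed layout: one ROOT file path per line.
    (workdir / "files.txt").write_text("\n".join(root_files) + "\n")
    # Assumed layout: one label per line, in the same order as files.txt.
    (workdir / "labels.txt").write_text(
        "\n".join(str(label) for label in labels) + "\n"
    )
    # Parameters are stored as a flat JSON object.
    (workdir / "params.json").write_text(json.dumps(params, indent=2) + "\n")
    return workdir


# Hypothetical file names and parameter keys, for illustration only.
example_params = {
    "nevts": 1000,       # number of events to use
    "chunk_size": 500,   # events read per chunk
    "batch_size": 100,   # events per training batch
    "redirector": "",    # redirector path for files in remote storage
}
outdir = write_inputs(
    tempfile.mkdtemp(),
    ["signal.root", "background.root"],
    [1, 0],
    example_params,
)
print(sorted(p.name for p in outdir.iterdir()))
```

Keeping the file and label lists in one function call makes it harder for the two files to fall out of step, which matters because the labels are matched to the ROOT files by position.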

How to execute it

Way #1: Use Google Colab

You can run this Jupyter notebook in Google Colab by clicking here. It covers several steps, from inspecting the data, to running the MLaaS4HEP framework to obtain the trained ML models, to uploading the submission file to the Kaggle website.

Way #2: Use the MLaaS4HEP Docker image

If you prefer to run MLaaS4HEP on your own resources rather than in the Google Colab notebook, you can use the MLaaS4HEP Docker image, i.e. felixfelicislp/mlaas:xrootd_pip, instead of installing all the dependencies yourself. An example of the command to run is the following:

docker run --name={name} --memory={memory} --cpus={cpus} felixfelicislp/mlaas:xrootd_pip --files={files} --labels={labels} --model={model} --params={params} --fout={fout}
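
The placeholders in the command above can be filled in programmatically. The sketch below only assembles the argument list and prints it; it does not start a container. The helper name build_docker_command and all example values (container name, memory limit, file names) are hypothetical. Note that the input files must be visible inside the container, for instance through a bind mount, which the command above does not show.

```python
import shlex

# Image name taken from the text above.
MLAAS_IMAGE = "felixfelicislp/mlaas:xrootd_pip"


def build_docker_command(name, memory, cpus, files, labels, model, params, fout,
                         image=MLAAS_IMAGE):
    """Return the docker run command as a list of arguments."""
    return [
        "docker", "run",
        # Options for docker itself come before the image name.
        f"--name={name}", f"--memory={memory}", f"--cpus={cpus}",
        image,
        # Options after the image name are passed to MLaaS4HEP.
        f"--files={files}", f"--labels={labels}", f"--model={model}",
        f"--params={params}", f"--fout={fout}",
    ]


# Hypothetical values, for illustration only.
cmd = build_docker_command(
    name="mlaas_higgs", memory="4g", cpus=2,
    files="files.txt", labels="labels.txt", model="model.py",
    params="params.json", fout="model_out",
)
print(shlex.join(cmd))
```

Building the command as a list rather than a single string means it can later be passed to subprocess.run without shell quoting problems.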

Way #3: Use the MLaaS4HEP server

Another way to use the MLaaS4HEP framework is to interact with the APIs of the MLaaS4HEP server. We implemented a working prototype, hosted on a VM of the INFN Cloud, that connects an OAuth2-Proxy server, an MLaaS4HEP_server, an xrootd proxy-cache server, an X509 proxy renewer, and TFaaS.

...

The former command trains an ML model, whereas the latter uses this model to get the prediction for a given event (stored in the predict_bkg.json file). All the instructions about how to use these services can be found here. A demo version of the services can be found here. A pictorial representation of the services is the following:

References

V. Kuznetsov, L. Giommi, D. Bonacorsi, MLaaS4HEP: Machine Learning as a Service for HEP. Comput Softw Big Sci 5, 17 (2021). DOI: 10.1007/s41781-021-00061-3

...
