INFN-CNAF Tier-1 user guide
Summary

  1. CNAF

  2. Tier-1

  3. Bastion & user interfaces

    1. x2go

  4. Farming

  5. Storage

  6. The HPC cluster

    1. Account Request

      1. Access

    2. SLURM architecture

      1. Check the cluster status with SLURM

    3. The structure of a basic batch job

      1. Submit basic instructions on Slurm with srun

      2. #SBATCH options

      3. Advanced batch job configuration

      4. Retrieve job information

      5. Killing a submitted job

    4. Submission examples

      1. Simple batch submit

      2. Simple MPI submit

      3. Simple GPU submit
      4. Simple Python submit
      5. Python submit with a virtual environment
    5. Migrating from LSF
    6. Environment Modules
  7. Cloud @ CNAF
  8. Digital Personal Certificates and Proxies management

    1. Manual proxy extension

  9. Job submission

    1. HTCondor jobs

      1. Submit local jobs

      2. Submit Grid jobs

      3. Experiment share usage

    2. Examples

      1. Multiple job submission
      2. CPUs, GPUs and RAM requests

      3. Jobs with ROOT-program as executable

    3. Singularity in batch jobs

      1. Obtain images

      2. Create a new image using a recipe (expert users)

      3. Run software

    4. Jupyter notebook in interactive batch jobs
      1. File persistency and quota
      2. User environment customization
      3. Conda environment creation
      4. Software installation in a Conda environment 
    5. DAG Jobs
      1. Example
  10. Data Transfers

    1. Data transfers without SRM

    2. Data transfers with SRM

      1. Gfal utils

      2. ClientSRM utils
    3. XrootD (extended ROOT daemon)

    4. Data transfers using http endpoints

      1. Proxies
        1. Third-party-copies
      2. Tokens
        1. Curl examples 
        2. Data transfers inside a job
    5. Tape

      1. Check if the file is on the disk (using local POSIX commands)

      2. Check if the file is on the disk (with Grid tools using VO-based authentication)

      3. Migrate files to tape

      4. Recall files from tape (using Grid tools with VO-based authentication)

      5. Recall files from tape (without Grid tools)

    6. StoRM Tape REST API
      1. Check if a file is on disk/tape (archiveinfo)
      2. Recall files from tape (stage request)
      3. Delete a stage request
      4. Release a file
  11. Monitoring

    1. Monitoring with Grafana

  12. Helpful information and tips

    1. How to use Python libraries in a conda virtual environment
      1. On a user interface
      2. In an HTCondor job
    2. Other tips
    3. How to import users from a VOMS server to IAM (expert users)
  13. Support
  14. Problem report
  15. Appendix A - Submit Description File Commands
  16. Appendix B - Helpful links
  17. Bibliography