...

To ease the transition to the new cluster and the general use of HTCondor, we implemented a solution based on environment modules. The traditional interaction methods, i.e. specifying all command line options, remain valid, but they are less convenient and more verbose.

The htc modules set all the environment variables needed to submit correctly to both the old and the new HTCondor clusters.
Once you are logged into any Tier 1 user interface, this utility is available. You can list all the available modules using:

...

  • htc/local - to be used when you want to submit jobs to or query the local schedds sn-02 or sn01-htc, respectively the HTCondor 9.0 and 23 cluster access points. This is the default module loaded when loading the "htc" family. It supports the following variable:
    variable  values  description
    ver       9       selects the old HTCondor cluster and local schedd (sn-02)
              23      selects the new HTCondor cluster and local schedd (sn01-htc)

    Code Block
    languagebash
    themeMidnight
    titleUsage of htc/local module
    apascolinit1@ui-tier1 ~
    $ module switch htc/local ver=9
    apascolinit1@ui-tier1 ~
    $ condor_q
    
    
    -- Schedd: sn-02.cr.cnaf.infn.it : <131.154.192.42:9618?... @ 04/17/24 14:58:44
    OWNER BATCH_NAME      SUBMITTED   DONE   RUN    IDLE   HOLD  TOTAL JOB_IDS
    
    Total for query: 0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended
    Total for apascolinit1: 0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended
    Total for all users: 50164 jobs; 30960 completed, 1 removed, 12716 idle, 4514 running, 1973 held, 0 suspended
    
    apascolinit1@ui-tier1 ~
    $ module switch htc/local ver=23
    apascolinit1@ui-tier1 ~
    $ condor_q
    
    
    -- Schedd: sn01-htc.cr.cnaf.infn.it : <131.154.192.242:9618?... @ 04/17/24 14:58:52
    OWNER BATCH_NAME      SUBMITTED   DONE   RUN    IDLE   HOLD  TOTAL JOB_IDS
    
    Total for query: 0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended
    Total for apascolinit1: 0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended
    Total for all users: 0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended


  • htc/auth - helps to set up authentication methods for Grid submission.

    variable  values     description
    auth      GSI        sets up GSI authentication (old cluster only)
              SSL        sets up SSL authentication (new cluster only)
              SCITOKENS  sets up SCITOKENS authentication

    Code Block
    languagebash
    themeMidnight
    titleUsage of htc/auth module
    apascolinit1@ui-tier1 ~
    $ module switch htc/auth auth=SSL
    Don't forget to voms-proxy-init!

    apascolinit1@ui-tier1 ~
    $ module switch htc/auth auth=SCITOKENS
    Don't forget to "export BEARER_TOKEN=$(oidc-token <client-name>)"!

  • htc/ce - eases the usage of the condor_q and condor_submit commands by setting up all the variables needed to contact our Grid compute entrypoints (CEs).

    variable  values             description
    num       1,2,3,4            connects to ce{num}-htc (new cluster)
              5,6,7              connects to ce{num}-htc (old cluster)
    auth      GSI,SSL,SCITOKENS  calls htc/auth with the selected auth method

    Code Block
    languagebash
    themeMidnight
    titleUsage of htc/ce module
    apascolinit1@ui-tier1 ~
    $ condor_q
    Error:
    ......

    apascolinit1@ui-tier1 ~
    $ module switch htc/ce auth=SCITOKENS num=2
    Don't forget to "export BEARER_TOKEN=$(oidc-token <client-name>)"!

    Switching from htc/ce{auth=SCITOKENS:num=2} to htc/ce{auth=SCITOKENS:num=2}
    Loading requirement: htc/auth{auth=SCITOKENS}

    apascolinit1@ui-tier1 ~
    $ export BEARER_TOKEN=$(oidc-token htc23)
    apascolinit1@ui-tier1 ~
    $ condor_q


    -- Schedd: ce02-htc.cr.cnaf.infn.it : <131.154.192.41:9619?... @ 04/17/24 15:48:24
    OWNER BATCH_NAME SUBMITTED DONE RUN IDLE HOLD TOTAL JOB_IDS
    ..........
    ..........
    ..........
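
The variable tables above can be summarised as a few small shell helpers, for example to validate a choice before calling "module switch". This is only a sketch: the function names are hypothetical and not part of the htc modules, and it assumes that SCITOKENS is accepted on both clusters, since the htc/auth table states no restriction for it.

```bash
#!/usr/bin/env bash
# Illustrative helpers (hypothetical names, not provided by the htc modules).

# Map the htc/local "ver" value to the schedd it selects.
local_schedd() {
  case "$1" in
    9)  echo "sn-02.cr.cnaf.infn.it" ;;
    23) echo "sn01-htc.cr.cnaf.infn.it" ;;
    *)  echo "unsupported ver: $1" >&2; return 1 ;;
  esac
}

# Map the htc/ce "num" value to the cluster the CE belongs to.
ce_cluster() {
  case "$1" in
    1|2|3|4) echo "new" ;;
    5|6|7)   echo "old" ;;
    *)       echo "unsupported num: $1" >&2; return 1 ;;
  esac
}

# Succeed only if the auth method is usable on the cluster of the given CE.
# Assumption: SCITOKENS works on both clusters (the table sets no limit).
check_ce_auth() {
  local num=$1 auth=$2 cluster
  cluster=$(ce_cluster "$num") || return 1
  case "$auth:$cluster" in
    GSI:old|SSL:new|SCITOKENS:*) return 0 ;;
    *) echo "auth=$auth is not available on the $cluster cluster" >&2; return 1 ;;
  esac
}

# Possible usage on a user interface (requires the htc modules):
#   check_ce_auth 2 SCITOKENS && module switch htc/ce auth=SCITOKENS num=2
```

Checking the combination first avoids loading, for instance, GSI authentication against a CE of the new cluster, which the tables above mark as unsupported.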
    


All modules in the htc family provide on-line help via the "module help <module name>" command, e.g.:

    Code Block
    languagebash
    themeMidnight
    titleUsage of module help
    budda@ui-tier1:~
    $ module help htc
    -------------------------------------------------------------------
    Module Specific Help for /opt/exp_software/opssw/modules/modulefiles/htc/local:
    
    Defines environment variables and aliases to ease the interaction with the INFN-T1 HTCondor local job submission system
    -------------------------------------------------------------------
    

Local Submission   

To submit local jobs, the behavior is the same as for HTCondor 9 using the Jobs UI.

  1. Submitting a job to the cluster.
    Code Block
    languagebash
    themeMidnight
    titleExecutable and Submit file
    apascolinit1@ui-tier1 ~
    $ cat sleep.sh
    #!/bin/env bash
    sleep $1
    
    
    apascolinit1@ui-tier1 ~
    $ cat submit.sub
    # Unix submit description file
    # submit.sub -- simple sleep job
    
    batch_name              = Local-Sleep
    executable              = sleep.sh
    arguments               = 3600
    log                     = $(batch_name).log.$(Process)
    output                  = $(batch_name).out.$(Process)
    error                   = $(batch_name).err.$(Process)
    should_transfer_files   = Yes
    when_to_transfer_output = ON_EXIT
    
    queue
    
    Code Block
    languagebash
    themeMidnight
    titleSubmission and control of job status
    apascolinit1@ui-tier1 ~
    $ module switch htc/local ver=23
    
    apascolinit1@ui-tier1 ~
    $ condor_submit submit.sub
    Submitting job(s).
    1 job(s) submitted to cluster 15.
    
    apascolinit1@ui-tier1 ~
    $ condor_q
    
    
    -- Schedd: sn01-htc.cr.cnaf.infn.it : <131.154.192.242:9618?... @ 03/18/24 17:15:44
    OWNER        BATCH_NAME     SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS
    apascolinit1 Local-Sleep   3/18 17:15      _      1      _      1 15.0
    
    Total for query: 1 jobs; 0 completed, 0 removed, 0 idle, 1 running, 0 held, 0 suspended
    Total for apascolinit1: 1 jobs; 0 completed, 0 removed, 0 idle, 1 running, 0 held, 0 suspended
    Total for all users: 1 jobs; 0 completed, 0 removed, 0 idle, 1 running, 0 held, 0 suspended
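
When the submission is scripted, it can be handy to capture the cluster ID printed by condor_submit and pass it to condor_q. The following is a minimal sketch that parses the "submitted to cluster" line shown above; the function name is hypothetical, and it assumes the output keeps the format seen in this example.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read condor_submit output on stdin and print the
# cluster ID taken from the "N job(s) submitted to cluster M." line.
submitted_cluster_id() {
  sed -n 's/^[0-9][0-9]* job(s) submitted to cluster \([0-9][0-9]*\)\.$/\1/p'
}

# Possible usage on a user interface (requires HTCondor):
#   cluster=$(condor_submit submit.sub | submitted_cluster_id)
#   condor_q "$cluster"
```

If the expected line is missing from the input, the function prints nothing, so an empty result can be treated as a failed submission.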
    
    

...

  1. Register a client (or load an already registered one)
    Code Block
    languagebash
    themeMidnight
    titleRegister a new Client
    apascolinit1@ui-tier1 ~
    $ eval `oidc-agent-service use`
    23025
    
    apascolinit1@ui-tier1 ~
    $ oidc-gen -w device
    Enter short name for the account to configure: htc23
    [1] https://iam-t1-computing.cloud.cnaf.infn.it/
    ...
    ...
    Issuer [https://iam-t1-computing.cloud.cnaf.infn.it/]: <enter>
    The following scopes are supported: openid profile email address phone offline_access eduperson_scoped_affiliation eduperson_entitlement eduperson_assurance entitlements
    Scopes or 'max' (space separated) [openid profile offline_access]: profile wlcg.groups wlcg compute.create compute.modify compute.read compute.cancel
    Registering Client ...
    Generating account configuration ...
    accepted
    
    Using a browser on any device, visit:
    https://iam-t1-computing.cloud.cnaf.infn.it/device
    
    And enter the code: HQ2WYLREDACTED
    ...
    ...
    ...
    Enter encryption password for account configuration 'htc23': <passwd>
    Confirm encryption Password: <passwd> 
    Everything setup correctly!
  2. Get a token for submission
    Code Block
    languagebash
    themeMidnight
    apascolinit1@ui-tier1 ~
    $ oidc-add htc23
    Enter decryption password for account config 'htc23': <passwd>
    success
    
    apascolinit1@ui-tier1 ~
    $ umask 0077 ; oidc-token htc23 > ${HOME}/token
    
  3. Submit a test job
    Code Block
    languagebash
    themeMidnight
    titleSubmit file
    apascolinit1@ui-tier1 ~
    $ cat submit_token.sub
    # Unix submit description file
    # submit_token.sub -- simple sleep job
    
    scitokens_file          = $ENV(HOME)/token
    +owner                  = undefined
    
    batch_name              = Grid-Token-Sleep
    executable              = sleep.sh
    arguments               = 3600
    log                     = $(batch_name).log.$(Process)
    output                  = $(batch_name).out.$(Process)
    error                   = $(batch_name).err.$(Process)
    should_transfer_files   = Yes
    when_to_transfer_output = ON_EXIT
    
    queue
    Code Block
    languagebash
    themeMidnight
    titleJob submission with Token
    apascolinit1@ui-tier1 ~
    $ module switch htc/ce auth=SCITOKENS num=1
    Don't forget to "export BEARER_TOKEN=$(oidc-token <client-name>)"!
    
    apascolinit1@ui-tier1 ~
    $ export BEARER_TOKEN=$(oidc-token htc23)
    
    apascolinit1@ui-tier1 ~
    $ condor_submit submit_token.sub
    Submitting job(s).
    1 job(s) submitted to cluster 35.
    
    apascolinit1@ui-tier1 ~
    $ condor_q
    
    
    -- Schedd: ce01-htc.cr.cnaf.infn.it : <131.154.193.64:9619?... @ 03/19/24 10:35:43
    OWNER        BATCH_NAME          SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS
    apascolinius Grid-Token-Sleep   3/19 10:35      _      _      1      1 35.0
    
    Total for query: 1 jobs; 0 completed, 0 removed, 1 idle, 0 running, 0 held, 0 suspended
    Total for apascolinius: 1 jobs; 0 completed, 0 removed, 1 idle, 0 running, 0 held, 0 suspended
    Total for all users: 1 jobs; 0 completed, 0 removed, 1 idle, 0 running, 0 held, 0 suspended
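
Access tokens are short-lived, so before submitting it may help to check how long the token obtained in step 2 remains valid. The sketch below assumes the token is a JWT carrying an "exp" claim (which is commonly the case for tokens issued by IAM, but is not stated in this guide) and that GNU coreutils base64 is available; the function names are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helpers, assuming the access token is a JWT.

# Print the "exp" claim (Unix epoch seconds) of the JWT given as $1.
jwt_exp() {
  local payload
  # The payload is the second dot-separated field, base64url-encoded.
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url omits the padding; restore it so that base64 -d accepts it.
  while [ $(( ${#payload} % 4 )) -ne 0 ]; do
    payload="${payload}="
  done
  printf '%s' "$payload" | base64 -d |
    sed -n 's/.*"exp"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Print the seconds left before the token expires (negative if expired).
token_seconds_left() {
  local exp
  exp=$(jwt_exp "$1")
  [ -n "$exp" ] || { echo "no exp claim found" >&2; return 1; }
  echo $(( exp - $(date +%s) ))
}

# Possible usage with the token file written in step 2:
#   token_seconds_left "$(cat "${HOME}/token")"
```

A negative or very small result suggests requesting a fresh token with oidc-token before running condor_submit.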


...

SSL submission takes the place of proxy (GSI) submission, and the process is almost identical.

Warning
titleCAVEAT

To be able to submit jobs using SSL authentication, your x509UserProxyFQAN must be mapped in the CE configuration.
You will need to send your x509UserProxyFQAN to the support team via user-support@lists.cnaf.infn.it

The attribute can be recovered in different ways:

  • once you have a valid proxy, you can retrieve it with:
    Code Block
    themeMidnight
    apascolinit1@ui-tier1 ~
    $ voms-proxy-info --all
    subject   : /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=apascoli/CN=842035/CN=Alessandro Pascolini/CN=1239012205
    issuer    : /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=apascoli/CN=842035/CN=Alessandro Pascolini
    identity  : /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=apascoli/CN=842035/CN=Alessandro Pascolini
    type      : RFC3820 compliant impersonation proxy
    strength  : 2048
    path      : /tmp/x509up_u23077
    timeleft  : 11:59:53
    key usage : Digital Signature, Key Encipherment
    === VO cms extension information ===
    VO        : cms
    subject   : /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=apascoli/CN=842035/CN=Alessandro Pascolini
    issuer    : /DC=ch/DC=cern/OU=computers/CN=lcg-voms2.cern.ch
    attribute : /cms/Role=production/Capability=NULL
    attribute : /cms/Role=NULL/Capability=NULL
    timeleft  : 11:59:52
    uri       : lcg-voms2.cern.ch:15002
    The x509UserProxyFQAN is composed as "<subject>,<attribute1>,<attribute2>...", where <subject> is the certificate subject shown on the identity line (without the trailing proxy CN), in this case:
    Code Block
    themeMidnight
    x509UserProxyFQAN = "/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=apascoli/CN=842035/CN=Alessandro Pascolini,/cms/Role=production/Capability=NULL,/cms/Role=NULL/Capability=NULL"
  • if you already have running jobs submitted with GSI authentication, you can get the x509UserProxyFQAN attribute with:
    Code Block
    languagebash
    themeMidnight
    apascolinit1@ui-tier1 ~
    $ condor_q -pool ce02-htc.cr.cnaf.infn.it:9619 -n ce02-htc.cr.cnaf.infn.it <job_id> -af x509UserProxyFQAN
    /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=apascoli/CN=842035/CN=Alessandro Pascolini,/cms/Role=NULL/Capability=NULL
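
The composition rule described in the first method can be automated with a short filter over the voms-proxy-info output. This is a sketch under the assumption that the output keeps the "key : value" layout shown above and that the identity line carries the subject to use; the function name is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read "voms-proxy-info --all" output on stdin and
# print "<identity>,<attribute1>,<attribute2>...".
compose_fqan() {
  awk -F' : ' '
    # The identity line holds the certificate subject without the proxy CN.
    /^[[:space:]]*identity[[:space:]]*:/  { identity = $2 }
    # Every attribute line is appended in order of appearance.
    /^[[:space:]]*attribute[[:space:]]*:/ { attrs = attrs "," $2 }
    END {
      if (identity == "") {
        print "no identity line found" > "/dev/stderr"
        exit 1
      }
      print identity attrs
    }
  '
}

# Possible usage on a user interface (requires a valid VOMS proxy):
#   voms-proxy-info --all | compose_fqan
```

The resulting string can then be sent to the support team as described in the caveat above.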

If your x509UserProxyFQAN has not been mapped in the CE configuration, you will see the following error:

Code Block
languagebash
themeMidnight
apascolinit1@ui-tier1 ~
$ condor_submit -pool ce01-htc.cr.cnaf.infn.it:9619 -remote ce01-htc.cr.cnaf.infn.it submit_ssl.sub

ERROR: Can't find address of schedd ce01-htc.cr.cnaf.infn.it
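
In a submission script, this message can be detected and turned into a more explicit hint. The sketch below is only illustrative: the function name is hypothetical, and the same error may also have other causes, so the hint is worded as a suggestion.

```bash
#!/usr/bin/env bash
# Hypothetical helper: pass condor_submit output through unchanged and add
# a hint when the "Can't find address of schedd" error is present.
explain_submit_error() {
  local output
  output=$(cat)
  printf '%s\n' "$output"
  case "$output" in
    *"Can't find address of schedd"*)
      echo "Hint: your x509UserProxyFQAN may not be mapped on the CE;" \
           "contact user-support@lists.cnaf.infn.it" >&2
      return 1 ;;
  esac
}

# Possible usage on a user interface (requires HTCondor):
#   condor_submit -pool ce01-htc.cr.cnaf.infn.it:9619 \
#     -remote ce01-htc.cr.cnaf.infn.it submit_ssl.sub 2>&1 | explain_submit_error
```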


...