All the options of the condor_submit command and all the submit description commands are documented in the Command Reference Manual [26]. For a short guide to the submit description file and its commands, see Appendix A.

Some helpful examples follow below.

Multiple job submission

HTCondor allows multiple job submission by using the queue command.

For jobs that do not depend on parameters, the same job can be submitted many times by specifying queue <N> in the submit file, where <N> is an integer.

Here is an example .sub file that submits a simple job three times:

-bash-4.2$ cat sleep.sub
# Unix submit description file
# sleep.sub -- simple sleep job

executable              = sleep.sh
log                     = sleep.log
output                  = outfile$(Process).txt
error                   = errors$(Process).txt
should_transfer_files   = Yes
when_to_transfer_output = ON_EXIT
queue 3

Then submit the job, check its status and retrieve the output with the usual commands:

-bash-4.2$ condor_submit -name sn-02.cr.cnaf.infn.it -spool sleep.sub
Submitting job(s)...
3 job(s) submitted to cluster 4588631.
-bash-4.2$
-bash-4.2$ condor_q -name sn-02.cr.cnaf.infn.it


-- Schedd: sn-02.cr.cnaf.infn.it : <131.154.192.42:9618?... @ 11/04/22 11:16:11
OWNER           BATCH_NAME     SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS
dlattanziobelle ID: 4588631  11/4  11:12      _      _      _      3 4588631.0-2

Total for query: 3 jobs; 3 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended
Total for dlattanziobelle: 3 jobs; 3 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended
Total for all users: 38425 jobs; 30994 completed, 0 removed, 636 idle, 6613 running, 182 held, 0 suspended


-bash-4.2$ condor_transfer_data -name sn-02.cr.cnaf.infn.it 4588631
Fetching data files...
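
With queue 3, $(Process) takes the values 0, 1 and 2, so the retrieved files are expected to be outfile0.txt, errors0.txt and so on up to errors2.txt. The following is a minimal sketch for checking that all of them came back; the function name check_outputs is hypothetical, and the file name prefixes are the ones used in sleep.sub above.

```shell
#!/bin/bash
# Hypothetical helper (not part of HTCondor): verify that the per-process
# output and error files exist in the current directory.
# Usage: check_outputs <number of queued jobs>
check_outputs() {
    local n="$1" missing=0 i=0 f
    while [ "$i" -lt "$n" ]; do
        # $(Process) in sleep.sub expands to 0 .. N-1
        for f in "outfile${i}.txt" "errors${i}.txt"; do
            if [ ! -e "$f" ]; then
                echo "missing: $f"
                missing=$((missing + 1))
            fi
        done
        i=$((i + 1))
    done
    echo "$missing file(s) missing"
    # Succeed only if every expected file is present
    [ "$missing" -eq 0 ]
}
```

It would be called as check_outputs 3 from the directory where condor_transfer_data was run.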


If instead the jobs depend on a parameter, the queue command can be given a list of items, for instance the list of files the jobs work on.

In this case the submit file defines a variable (e.g. file), which can be referenced elsewhere, for instance as arguments = $(file), and which takes each value of the list in turn (the list may also be a list of lists). The items can be written either as a comma- and/or space-separated list, or one per line with the whole list enclosed in parentheses. The queue command must include the keyword from. For example:

executable = ...
arguments  = $(file)
...
queue file from (
    /storage/gpfs_data/.../file1
    /storage/gpfs_data/.../file2
    /storage/gpfs_data/.../fileN
)
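
When the list is long, it can be kept in a separate text file, since queue also accepts a file name after from. The sketch below builds such a file with standard shell tools; make_file_list and filelist.txt are hypothetical names chosen for illustration.

```shell
#!/bin/bash
# Hypothetical helper: write one path per line for every regular file
# directly inside <dir> whose name matches <pattern>, sorted so that the
# job order is reproducible.
# Usage: make_file_list <dir> '<pattern>' <output list>
make_file_list() {
    local dir="$1" pattern="$2" list="$3"
    find "$dir" -maxdepth 1 -type f -name "$pattern" | sort > "$list"
    # One job is created per line, so report the line count
    echo "$(grep -c '' "$list") item(s) written to $list"
}
```

The submit file would then end with queue file from filelist.txt instead of the inline list.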

Another option is to build the list from a pattern, using the keyword matching. In the following example, assuming a set of .root files is available, HTCondor submits one job for each file that matches the pattern:

executable = ...
arguments  = $(file)
...
queue file matching files /storage/gpfs_data/.../*.root
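
Since one job is created per matching file, it can help to check how many files a pattern selects before submitting. The following sketch uses plain shell globbing, which is assumed to behave like HTCondor's matching for simple wildcards such as *.root; count_matches is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical helper: count the regular files matching a glob pattern.
# Quote the pattern when calling so that the function expands it.
# Paths containing spaces are not handled by this sketch.
# Usage: count_matches '<pattern>'
count_matches() {
    local pattern="$1" count=0 f
    # Unquoted on purpose: let the shell expand the wildcard
    for f in $pattern; do
        if [ -f "$f" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

A result of 0 means that the pattern matches no file, in which case the path is worth checking before running condor_submit.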

Further details are available in the official HTCondor Manual [34].

CPUs, GPUs and RAM requests

A job can specify the number of CPUs and the amount of RAM it requires by adding the following commands to its submit file:

  • request_cpus = <number of CPUs>
  • request_memory = <RAM amount in MB>

For example, the following submit description file requests a specific number of CPUs and amount of RAM:

-bash-4.2$ cat sleep.sub
# Unix submit description file
# sleep.sub -- simple sleep job

request_cpus = 2
request_memory = 1000
executable = sleep.sh
log = sleep.log
output = outfile.txt
error = errors.txt
should_transfer_files = Yes
when_to_transfer_output = ON_EXIT
queue
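
Inside the job, the script can read back the number of CPUs that was requested, for example to decide how many threads to start. The sketch below assumes that the worker node exposes the job ClassAd as a text file through the _CONDOR_JOB_AD environment variable, as HTCondor normally does; requested_cpus is a hypothetical name, and the fallback value of 1 is a choice made for this example.

```shell
#!/bin/bash
# Hypothetical helper: print the RequestCpus value found in a job ClassAd
# file, or 1 if the file or the attribute is not available.
# Usage: requested_cpus [<job ad file>]   (defaults to $_CONDOR_JOB_AD)
requested_cpus() {
    local ad_file="${1:-${_CONDOR_JOB_AD:-}}"
    if [ -n "$ad_file" ] && [ -r "$ad_file" ]; then
        # Ad files contain lines of the form "Attribute = value"
        awk -F' = ' '
            $1 == "RequestCpus" { print $2; found = 1; exit }
            END { if (!found) print 1 }
        ' "$ad_file"
    else
        echo 1
    fi
}
```

In sleep.sh this could be used, for instance, as NTHREADS=$(requested_cpus).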

If instead your job needs GPUs to run, add the corresponding request and requirements:

+WantGPU = true
request_GPUs = 1
requirements = (TARGET.CUDACapability >= 1.2) && (TARGET.CUDADeviceName =?= "Tesla K40m") && $(requirements:True)
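
A GPU job may want to confirm at start-up that a device was actually assigned. The sketch below assumes that the assigned devices are exported to the job in the CUDA_VISIBLE_DEVICES environment variable as a comma-separated list, which is the usual HTCondor behaviour but depends on the site configuration; assigned_gpu_count is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical helper: count the entries in CUDA_VISIBLE_DEVICES.
# Prints 0 when the variable is unset or empty.
assigned_gpu_count() {
    local devices="${CUDA_VISIBLE_DEVICES:-}"
    if [ -z "$devices" ]; then
        echo 0
        return 0
    fi
    # The number of comma-separated fields is the number of devices
    echo "$devices" | awk -F',' '{ print NF }'
}
```

The job script could exit early with an explicit message when the count is 0, so that the reason appears in the error file.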

Jobs with a ROOT program as executable

First of all, ROOT has to be set up before it can be used. Your collaboration may have installed a ROOT distribution in /opt/exp_software (a location shared between the user interface and the worker nodes).

In this case you should find one or more ROOT installation directories there:

[fornaricta@ui-tier1]$ ls /opt/exp_software/cta/local_software/root/
5.34.26  5.34.36  5.34.38  root  root-6.10.08  root-6.16.00  root_build_5.34.38

so you can choose your preferred version with a submit file like the following one:

[fornaricta@ui-tier1]$ cat test.sub 
universe = vanilla
executable = test.sh
arguments = 5.34.26
output = job.out
error = job.err
log = job.log
WhenToTransferOutput = ON_EXIT
ShouldTransferFiles = YES
queue 1

where the executable file has this content:

[fornaricta@ui-tier1]$ cat test.sh
#!/bin/bash
source /storage/gpfs_data/ctalocal/fornaricta/root_config.sh $1
/opt/exp_software/cta/local_software/root/$1/bin/root -b -q

and the configuration script (located on a gpfs path, shared between the user-interface and the worker nodes) is:

[fornaricta@ui-tier1]$ cat /storage/gpfs_data/ctalocal/fornaricta/root_config.sh
#!/bin/bash
export LD_LIBRARY_PATH=/opt/exp_software/cta/local_software/root/$1/lib/root:$LD_LIBRARY_PATH
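
If the requested version does not exist, the configuration script above silently exports a path that points nowhere. A possible variant with a sanity check is sketched below; setup_root and its optional second argument are hypothetical, and the library layout is the one used in root_config.sh.

```shell
#!/bin/bash
# Hypothetical variant of root_config.sh with a sanity check.
# Usage: setup_root <version> [<installation base directory>]
setup_root() {
    local version="$1"
    local base="${2:-/opt/exp_software/cta/local_software/root}"
    if [ -z "$version" ] || [ ! -d "$base/$version" ]; then
        echo "ROOT version '$version' not found under $base" >&2
        return 1
    fi
    # Prepend the ROOT libraries, keeping any path that is already set
    export LD_LIBRARY_PATH="$base/$version/lib/root${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}
```

In test.sh the call setup_root "$1" || exit 1 would make the job fail early, with the message written to job.err.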

Submitting:

[fornaricta@ui-tier1]$ condor_submit -spool -name sn-02.cr.cnaf.infn.it test.sub 
Submitting job(s).
1 job(s) submitted to cluster 5824045.

[fornaricta@ui-tier1]$ condor_q -name sn-02.cr.cnaf.infn.it 5824045.0

-- Schedd: sn-02.cr.cnaf.infn.it : <131.154.192.58:9618?... @ 07/08/20 18:54:17
OWNER      BATCH_NAME     SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS
fornaricta ID: 5824045   7/8  18:54      _      _      1      1 5824045.0


Total for query: 1 jobs; 0 completed, 0 removed, 1 idle, 0 running, 0 held, 0 suspended 
Total for all users: 45425 jobs; 25623 completed, 3 removed, 13181 idle, 5297 running, 1321 held, 0 suspended


[fornaricta@ui-tier1]$ condor_q -name sn-02.cr.cnaf.infn.it 5824045.0

-- Schedd: sn-01.cr.cnaf.infn.it : <131.154.192.58:9618?... @ 07/08/20 18:55:03
OWNER      BATCH_NAME     SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS
fornaricta ID: 5824045   7/8  18:54      _      1      _      1 5824045.0


Total for query: 1 jobs; 0 completed, 0 removed, 0 idle, 1 running, 0 held, 0 suspended 
Total for all users: 45474 jobs; 25644 completed, 3 removed, 13222 idle, 5286 running, 1319 held, 0 suspended


[fornaricta@ui-tier1]$ condor_q -name sn-02.cr.cnaf.infn.it 5824045.0

-- Schedd: sn-01.cr.cnaf.infn.it : <131.154.192.58:9618?... @ 07/08/20 18:55:04
OWNER      BATCH_NAME     SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS
fornaricta ID: 5824045   7/8  18:54      _      _      _      1 5824045.0


Total for query: 1 jobs; 1 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended 
Total for all users: 45476 jobs; 25646 completed, 3 removed, 13223 idle, 5284 running, 1320 held, 0 suspended


[fornaricta@ui-tier1]$ condor_transfer_data -name sn-02.cr.cnaf.infn.it 5824045.0
Fetching data files...
[fornaricta@ui-tier1]$ cat job.err
[fornaricta@ui-tier1]$ cat job.out 
  *******************************************
  *                                         *
  *        W E L C O M E  to  R O O T       *
  *                                         *
  *   Version   5.34/26  20 February 2015   *
  *                                         *
  *  You are welcome to visit our Web site  *
  *          http://root.cern.ch            *
  *                                         *
  *******************************************


ROOT 5.34/26 (v5-34-26@v5-34-26, Jun 16 2015, 18:41:55 on linuxx8664gcc)


CINT/ROOT C/C++ Interpreter version 5.18.00, July 2, 2010
Type ? for help. Commands must be C++ statements.
Enclose multiple statements between { }

If no ROOT installation is available in /opt/exp_software, you can source one of the many distributions available from CVMFS:

[fornarivirgo@ui01-virgo root_test]$ ls /cvmfs/sft.cern.ch/lcg/releases/ROOT/
5.34.24-64287 6.06.06-71859 6.10.00-8b404 6.12.04-4473c 6.12.06-76fef 6.14.00-66c89 
6.14.04-2a3e5 6.14.04-dedca 6.16.00-23725 6.16.00-5be98 6.16.00-b4729 6.18.00-d0330 ...

For instance:

[fornarivirgo@ui01-virgo root_test]$ cat test.sub
universe = vanilla
Executable = test.sh
ShouldTransferFiles = YES
WhenToTransferOutput = ON_EXIT
Log = log.log
Output = log.out
Error = log.err
queue 1
[fornarivirgo@ui01-virgo root_test]$ cat test.sh
#!/bin/bash
. /cvmfs/sft.cern.ch/lcg/views/setupViews.sh LCG_96python3 x86_64-centos7-gcc8-opt
root -b -q
[fornarivirgo@ui01-virgo root_test]$ condor_submit -spool -name sn-02.cr.cnaf.infn.it test.sub
Submitting job(s).
1 job(s) submitted to cluster 8445482.
[fornarivirgo@ui01-virgo root_test]$ condor_q -name sn-02.cr.cnaf.infn.it 8445482


-- Schedd: sn-02.cr.cnaf.infn.it : <131.154.192.58:9618?... @ 09/10/20 17:21:42
OWNER BATCH_NAME SUBMITTED DONE RUN IDLE TOTAL JOB_IDS
fornarivirgo ID: 8445482 9/10 17:21 _ _ 1 1 8445482.0

Total for query: 1 jobs; 0 completed, 0 removed, 1 idle, 0 running, 0 held, 0 suspended 
Total for all users: 39956 jobs; 21240 completed, 2 removed, 13371 idle, 4218 running, 1125 held, 0 suspended

[fornarivirgo@ui01-virgo root_test]$ condor_q -name sn-02.cr.cnaf.infn.it 8445482


-- Schedd: sn-02.cr.cnaf.infn.it : <131.154.192.58:9618?... @ 09/10/20 17:23:58
OWNER BATCH_NAME SUBMITTED DONE RUN IDLE TOTAL JOB_IDS
fornarivirgo ID: 8445482 9/10 17:21 _ _ _ 1 8445482.0

Total for query: 1 jobs; 1 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended 
Total for all users: 39959 jobs; 21288 completed, 2 removed, 13341 idle, 4202 running, 1126 held, 0 suspended

[fornarivirgo@ui01-virgo root_test]$ condor_transfer_data -name sn-02.cr.cnaf.infn.it 8445482
Fetching data files...
[fornarivirgo@ui01-virgo root_test]$ cat log.err 
[fornarivirgo@ui01-virgo root_test]$ cat log.out
------------------------------------------------------------
| Welcome to ROOT 6.18/00 https://root.cern                |
| (c) 1995-2019, The ROOT Team                             |
| Built for linuxx8664gcc on Jun 25 2019, 09:22:23         |
| From tags/v6-18-00@v6-18-00                              |
| Try '.help', '.demo', '.license','.credits', '.quit'/'.q'|
------------------------------------------------------------
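
To confirm which ROOT version a job actually picked up, the version string can be extracted from the banner in the output file. The sketch below relies only on the banner formats shown in the two examples above; root_version_from_log is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical helper: print the first "ROOT <major.minor>/<patch>"
# version string found in a job output file.
# Usage: root_version_from_log <output file>
root_version_from_log() {
    local log="$1"
    grep -o 'ROOT [0-9][0-9.]*/[0-9][0-9]*' "$log" | head -n 1 | sed 's/^ROOT //'
}
```

For the log.out file shown above, root_version_from_log log.out prints 6.18/00.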