More advanced techniques for submitting jobs, e.g. GPU, array and MATLAB tasks, plus full fsl_sub usage information

If your task comprises a complicated pipeline of interconnected tasks, there are several options for splitting it into dependent tasks or parallelising independent portions across many cluster nodes. Information on these techniques and other advanced options is in this section.

Cluster advanced usage

How to submit independent 'clone' tasks for running in parallel

An array task is a set of closely related tasks that do not rely on the output of any other members of the set of jobs. An example might be where you need to process each slice of a brain volume but there is no need to know or affect the content of any other slice (the array tasks can't communicate with each other to advise of changes to data). These tasks allow you to submit large numbers of discrete jobs and manage them under one job id, with each sub-task being allocated a unique task id and potentially able to run in parallel given enough compute slot availability.

You can submit an array task with the -t/--array_task option or with the --array_native option:

TEXT FILE ARRAY TASKS

The -t (or --array_task) option needs the name of a text file that contains the array task commands, one per line. Sub-tasks will be generated from these lines, with the task ID being equivalent to the line number in the file (starting from 1). e.g.

fsl_sub -q short.q -t ./myparalleljobs
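For example, the file ./myparalleljobs might contain one command per line like this (a purely illustrative sketch; process_slice and the paths are placeholders):

process_slice --slice 1 /data/subject1/brain.nii.gz
process_slice --slice 2 /data/subject1/brain.nii.gz
process_slice --slice 3 /data/subject1/brain.nii.gz

Line 1 becomes sub-task 1, line 2 becomes sub-task 2, and so on.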

The array task has a parent job id which can be used to control/delete all of the sub-tasks; individual sub-tasks may be specified as job id:sub-task id, e.g. '12345:10' for sub-task 10 of job 12345.

NATIVE ARRAY TASKS

The --array_native option requires an argument n[-m[:s]] which specifies the array (see the example after this list):

  • n provided alone will run the command n-times in parallel
  • n-m will run the command once for each number in the range with task ids equal to the position in this range
  • n-m:s similarly, but with s specifying the increment in task id.
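For example, the following would run ./process_slice four times, with task ids 1, 3, 5 and 7 (process_slice is a placeholder for your own script):

fsl_sub -q short.q --array_native 1-7:2 ./process_slice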

The cluster software will set environment variables that the script/binary can use to determine which task it needs to carry out. For example, this might be used to represent the brain volume slice to process. As these environment variables differ between different cluster software, fsl_sub sets several environment variables to the name of the environment variable the script can use to obtain its task id from the cluster software:

Environment variable      ...points to variable containing
FSLSUB_JOBID_VAR          job id
FSLSUB_ARRAYTASKID_VAR    task id
FSLSUB_ARRAYSTARTID_VAR   first task id
FSLSUB_ARRAYENDID_VAR     last task id
FSLSUB_ARRAYSTEPSIZE_VAR  step between task ids
FSLSUB_ARRAYCOUNT_VAR     number of tasks in array (not supported in Grid Engine)

To use these you need to look up the variable name and then read the value from that variable; for example, in BASH use ${!FSLSUB_ARRAYTASKID_VAR} to get the value of the task id.
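As a minimal BASH sketch of an array sub-task script using this indirection (process_slice and the data path are placeholders):

#!/bin/bash
# FSLSUB_ARRAYTASKID_VAR holds the *name* of the variable containing our task id
taskid=${!FSLSUB_ARRAYTASKID_VAR}
# Use the task id to select this sub-task's share of the work
./process_slice --slice "$taskid" /data/subject1/brain.nii.gz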

Important The tasks must be truly independent, i.e. they must not write to the same file(s) or rely on calculations in other array jobs in this set, otherwise you may get unpredictable results (or sub-tasks may crash).

LIMITING CONCURRENT ARRAY TASKS

Sometimes it may be necessary to limit the number of array sub-tasks running at any one time. You can do this by providing the -x (or --array_limit) option which takes an integer, e.g.:

fsl_sub -T10 -x 10 -t ./myparalleljobs

This will limit sub-tasks to ten running at any one time.

ARRAY TASKS WITH THE SHELL RUNNER

If running without a cluster backend, or when fsl_sub is called from within an already scheduled task, the shell backend is capable of running array tasks in parallel. If running as a cluster job, the shell plugin will run no more than the number of threads selected in your parallel environment (if one is specified; the default is one task at a time).

If you are not running on a cluster then by default fsl_sub will use all of the CPUs on your system. You can control this either using the -x (or --array_limit) option or by setting the environment variable FSLSUB_PARALLEL to the maximum number of array tasks to run at once. It is also possible to configure this in your own personal fsl_sub configuration file (see below).
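For example, to run at most four sub-tasks at once on a workstation:

FSLSUB_PARALLEL=4 fsl_sub -t ./myparalleljobs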

How to submit pipeline stages such that they wait for their predecessor to complete

If you have a multi-stage task to run, you can submit the jobs all at once, specifying that later stages must wait for the previous task to complete. This is achieved by providing the '-j' (or --jobhold) option with the job id of the task to wait for. For example:

jid=$(fsl_sub -q veryshort.q ./my_first_stage)

fsl_sub -q long.q -j $jid ./my_second_stage

Note the $() surrounding the first fsl_sub command: this captures the output of a command and stores the text in the variable 'jid'. This is then passed as the job id to wait for before running 'my_second_stage'.

Jobs set to wait for the completion of another task will appear in the queue with state 'hqw'.

It is also possible to submit array holds with the --array_hold option, which takes the job id of the predecessor array task. This can only be used when the first and subsequent jobs are both array tasks of the same size (same number of sub-tasks) and each sub-task in the second array depends only on the equivalent sub-task in the first array.
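As a sketch, where each sub-task of the second stage consumes the output of the matching sub-task of the first (the task file names are placeholders):

jid=$(fsl_sub -q short.q -t ./stage_one_tasks)
fsl_sub -q short.q --array_hold $jid -t ./stage_two_tasks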

How to request a GPU for your job

Whilst GPU tasks can simply be submitted to the cuda.q queue, fsl_sub also provides helper options which can automatically select a GPU queue and the appropriate CUDA toolkit for you.

If we were to have GPUs with differing hardware capabilities (we don't at FMRIB) then it can also select specific card types. The options of interest all begin --coprocessor (see the example after this list):

  • -c|--coprocessor <coprocessor name>: This selects the coprocessor with the given name (see fsl_sub --help for details of available coprocessors)
  • --coprocessor_multi <number>: This allows you to request multiple GPUs. On the FMRIB cluster you can select no more than two GPUs. You will automatically be given a two-slot openmp parallel environment
  • --coprocessor_class <class>: (Not relevant at FMRIB) This would allow you to select which GPU hardware model you require, e.g. V for Volta cards
  • --coprocessor_class_strict: If a class is requested you will normally be allocated a card at least as capable as the model requested. By adding this option you ensure that you only get the GPU model you asked for
  • --coprocessor_toolkit <toolkit version>: This allows you to select the API toolkit your software needs. This will automatically make available the requested CUDA libraries where these haven't been compiled into the software
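As a sketch, the following requests a single GPU and a particular CUDA toolkit, letting fsl_sub pick the GPU queue (the coprocessor name 'cuda' and the toolkit version are illustrative; run fsl_sub --help to see what is available):

fsl_sub -c cuda --coprocessor_toolkit 10.2 ./my_gpu_task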

How to request a multi-threaded slot and how to ensure your software only uses the CPU cores it has been allocated

Running multi-threaded programs can cause significant problems with cluster scheduling software if the clustering software is not made aware of the multiple threads (your job is allocated one slot but actually consumes many more, often ALL the CPUs, overloading the machine).

We support the running of shared memory multi-threaded software only (e.g. OpenMP, multi-threaded MKL, OpenBLAS etc) and we attempt to limit the threads used automatically. We do not support multi-node parallel programs (e.g. OpenMPI), which typically require specialist hardware to operate optimally.

Where software would normally use all available cores you should investigate how to limit these threads (see the MATLAB guide for example).

To submit an OpenMP job, use the -s (or --parallelenv) option to fsl_sub. For example:

fsl_sub -s openmp,2 <command or script>

"openmp" is the name of the parallel environment to use; "2" is the number of threads you wish to allocate to your jobs. The openmp parallel environment is enabled most queues, see:​

fsl_sub --help

for details.

The task running on the queue will be able to determine how many slots it has by querying the environment variable pointed to by FSLSUB_NSLOTS. For example in BASH the number of slots is equal to ${!FSLSUB_NSLOTS}.

In Python you would be able to get this figure with the following code:

import os
# FSLSUB_NSLOTS holds the *name* of the variable containing the slot count
slots = int(os.environ[os.environ['FSLSUB_NSLOTS']])
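fsl_sub will normally set thread-limiting variables such as OMP_NUM_THREADS to the allocated slot count for you (see thread_control below), but a submitted script can perform the same lookup itself; a minimal BASH sketch (my_openmp_program is a placeholder):

#!/bin/bash
# FSLSUB_NSLOTS holds the *name* of the variable containing the slot count
export OMP_NUM_THREADS=${!FSLSUB_NSLOTS}
./my_openmp_program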

To be able to provide these threads the cluster software needs to reserve slots on compute nodes, so we ask that you avoid requesting them on the verylong.q as this can easily result in compute slots being held in reserve for a very long time whilst the scheduler waits for sufficient slots to be freed.

How to submit non-interactive MATLAB scripts to the queues

Wherever possible DO NOT run full MATLAB directly on the cluster; instead, compile your code (see the MATLAB page). Where this is not possible, or you only need to run a quick single-job task, it is acceptable to run the full MATLAB environment on the cluster.

To submit a non-interactive MATLAB task, create a file (typically with the extension '.m'), e.g. 'mytask.m', containing all your MATLAB commands and submit it using fsl_sub; once the task is running you can look at the file "matlab.o<jobid>" for any output.

fsl_sub -q short.q matlab -singleCompThread -nodisplay -nosplash \< mytask.m

NB The "\" is very important: it prevents your shell from carrying out the redirection at submission time, so without it MATLAB won't read your script when the job runs.

Warning: MATLAB tasks will often attempt to carry out some operations using multiple threads. Our cluster is configured to run only single-threaded programs, so if multiple threads are used you can overload it. The '-singleCompThread' option disables this multi-threading.

Although many built-in functions in MATLAB are multi-threaded the cluster is configured to prevent any task running threads on more than one CPU unless run within a parallel environment.

Request a parallel environment and ensure that your MATLAB script includes the lines:

% Look up the name of the variable holding the allocated slot count
NSLOT_VAR = getenv('FSLSUB_NSLOTS');
% getenv returns a string, so convert it to a number
N = str2double(getenv(NSLOT_VAR));
LASTN = maxNumCompThreads(N);

For tasks that require graphical output (for example to display progress bars), please see the Virtual X11 section below, and where you must interact with the process see the section on the interactive queue.

Other potentially useful submission options or techniques

Capturing job submission information

fsl_sub can store the commands used to submit the job if you provide the option --keep_jobscript. When provided, after submission you will find a script called wrapper-<jobid>.sh in the current folder (assuming you have write permission there). This exact submission may be repeated by using:

fsl_sub -F wrapper-<jobid>.sh

The script's contents are described below:

#!/bin/bash
    Run the script in BASH
#$ <grid engine option>
#$ <grid engine option>
    Grid Engine options, one per line
module load <module name>
    Load a Shell Module
# Built by fsl_sub v.2.3.0 and fsl_sub_plugin_sge v.1.3.0
    Versions of fsl_sub and the plugin that submitted the job
# Command line: <command line>
    Command line that invoked fsl_sub
# Submission time (H:M:S DD/MM/YYYY) <date/time>
    Date and time that the job was submitted
<command>
    The command to be run

Passing Environment Variables to Queued Jobs

On the jalapeno cluster, by default, your entire shell environment (settings) is transferred to your job when it starts up. On some systems (for example the BMRC cluster) this is not possible, and so fsl_sub allows you to specify environment variables that should be transferred to the job with the --export option. This can also be useful if you are scheduling many similar tasks and need to specify a different value for an environment variable for each run, for example SUBJECTS_DIR, which FreeSurfer uses to specify where your data sets reside.
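As a sketch (assuming --export accepts a variable name or a NAME=VALUE pair and may be repeated; the path is a placeholder):

fsl_sub -q short.q --export SUBJECTS_DIR=/data/study/subjects ./my_freesurfer_task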

SKIPPING COMMAND VALIDATION

By default fsl_sub will check that the command given (or the commands in the lines of an array task file) can be found and are executable. If this causes issues, often because a particular program is only available on the compute nodes and not on jalapeno itself, then you can disable this check with -n (--novalidation).
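For example (the program name is a placeholder for something only installed on the compute nodes):

fsl_sub -q short.q -n ./only_on_compute_nodes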

AVOIDING JOB REQUEUING

If a problem with a compute node causes loss of communications with the cluster manager whilst your job is running, or an issue is discovered that requires an immediate reboot, admins can request that your job is moved to a new node (or, on node reboot, the cluster will automatically start the job on a new node). This move starts your task again from the beginning, which can cause issues if your job modifies the file system in a way that destroys data necessary for earlier stages of the processing pipeline. If this is the case your job should be started with the --no_requeueable option; this prevents the job automatically restarting and potentially wasting time processing a job that cannot complete successfully.
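For example (the script name is a placeholder for a stage that modifies its own inputs):

fsl_sub -q long.q --no_requeueable ./my_destructive_stage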

SUBMITTING TO A SPECIFIC HOST OR RANGE OF QUEUES

If you need to submit to a specific host you can achieve this by appending '@hostname' to the queue name; for example -q short.q@jalapeno01 would submit to the short.q on jalapeno01.

To submit to a range of queues (not normally necessary on jalapeno's cluster) you can comma separate the queue names, for example -q short.q,veryshort.q.

VIRTUAL X11 SERVER, OR HOW TO RUN A SUBSET OF GUI-BASED APPLICATIONS ON THE CLUSTER

Some programs insist on displaying something in a window, even if it is just a progress bar. If you attempt to run these applications on the cluster they will immediately fail as the machine has nowhere to display this progress bar. Where possible try to find a way to run the program without this graphical output (maybe it has a command-line option to run it in a textual mode), but in the cases when this is not possible, the cluster nodes have the X11 Virtual Frame Buffer software (Xvfb) installed.

To ease the use of this program, we have provided a wrapper script that starts and stops the Xvfb process for you in a more automated manner, returning the display number you have been allocated. You can use this information to set the 'DISPLAY' environment variable before running your program. The following script creates the dummy display, sets 'DISPLAY', runs the program 'a_graphical_program' and then destroys the dummy display.

      #!/bin/bash
      # Start a virtual display and record the display number allocated
      disp=$(/opt/fmrib/bin/virtual-x -f -q)
      export DISPLAY=":$disp"
      a_graphical_program
      # Tear down the virtual display
      virtual-x -k $disp

To use this with your own program, replace 'a_graphical_program' with the path and arguments for your particular program. The resulting script can then be submitted to the cluster using the fsl_sub command.

SUBMITTING GRID ENGINE SCRIPTS

If you have a particularly complicated job that can't be configured using the fsl_sub options then you can write your own script as per the Grid Engine documentation and pass this to fsl_sub with the -F (--usescript) option. All other options will be ignored/overridden. You can use this to resubmit a stored job script generated with the --keep_jobscript option.

CONTROLLING SCHEDULING PRIORITY - JOB URGENCY

If your job is not urgent then you can suggest it only runs when the system is quiet by specifying a lower priority for the task with the -p (or --priority) option. Specify a number between -1023 (lowest priority) and 0 (normal priority). If you have a particularly urgent job then contact computing-help@win.ox.ac.uk to discuss raising the priority of your job above 0.
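For example, to submit a task at a reduced priority (the script name is a placeholder):

fsl_sub -q long.q -p -100 ./my_low_priority_task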

JOB PROJECTS

If you have been asked to run your jobs under a specific project name (this would typically be used to allow easy auditing of compute use by a particular project, or potentially to allow access to restricted resources) then you can use the --project option to specify a project. If you can't do this (for example if you are running an auto-submitting program such as FEAT) then you can set the environment variable FSLSUB_PROJECT and this name will be used by fsl_sub commands within the program, e.g.

FSLSUB_PROJECT=myproject feat mydesign.fsf

or where you always use the same project add the following to your .bash_profile:

export FSLSUB_PROJECT=myproject

REQUESTING A SPECIFIC RESOURCE

Some resources may have a limited quantity available for use, e.g. software licences or RAM. fsl_sub has the ability to request these resources from the cluster (the --coprocessor options do this automatically to request the appropriate number of GPUs). The option -r (--resource) allows you to pass a resource string directly through to the Grid Engine software. If you need to do this, the computing help team or the software documentation will advise you of the exact string to pass.
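As a purely illustrative sketch (the resource string here is hypothetical; use the exact string you are given):

fsl_sub -q short.q -r matlab_lic=1 ./my_matlab_task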

Environment variables that can be set to control fsl_sub submitted tasks

Available Environment Variables

fsl_sub sets, or can be controlled with, the following shell variables. These can be set either for the duration of a single fsl_sub run by prefixing the call with the variable assignment:

ENVVAR=VALUE fsl_sub ...

or by exporting the value to your shell so that all subsequent calls will also have this variable set this way:

export ENVVAR=VALUE

Environment variable      Who sets     Purpose                                            Example values
FSLSUB_JOBID_VAR          fsl_sub      Variable name of Grid job id                       JOB_ID
FSLSUB_ARRAYTASKID_VAR    fsl_sub      Variable name of Grid task id                      SGE_TASK_ID
FSLSUB_ARRAYSTARTID_VAR   fsl_sub      Variable name of Grid first task id                SGE_TASK_FIRST
FSLSUB_ARRAYENDID_VAR     fsl_sub      Variable name of Grid last task id                 SGE_TASK_LAST
FSLSUB_ARRAYSTEPSIZE_VAR  fsl_sub      Variable name of Grid step between task ids        SGE_TASK_STEPSIZE
FSLSUB_ARRAYCOUNT_VAR     fsl_sub      Variable name of Grid number of tasks in array     Not supported in Grid Engine
FSLSUB_MEMORY_REQUIRED    You          Advise fsl_sub of expected memory required         32G
FSLSUB_PROJECT            You          Name of Grid project to run jobs under             MyProject
FSLSUB_PARALLEL           You/fsl_sub  Control array task parallelism when running        4 (four tasks at once); 0 to let the
                                       without a cluster engine (e.g. when a queued       shell plugin use all available cores
                                       task itself submits an array task)
FSLSUB_CONF               You          Path to the configuration file                     /usr/local/etc/fslsub_conf.yml
FSLSUB_NSLOTS             fsl_sub      Variable name of Grid allocated slots              NSLOTS
FSLSUB_DEBUG              You/fsl_sub  Enable debugging in child fsl_sub                  1
FSLSUB_PLUGINPATH         You          Where to find installed plugins (do not change)    /path/to/folder
FSLSUB_NOTIMELIMIT        You          Disable notification of job time to the cluster    1

Where a FSLSUB_* variable is a reference to another variable, you need to read the content of the referred-to variable. This can be achieved as follows:

BASH: the value is ${!FSLSUB_VARIABLE}

Python:

import os
value = os.environ[os.environ['FSLSUB_VARIABLE']]

MATLAB:

NSLOT_VAR = getenv('FSLSUB_VARIABLE')
N = getenv(NSLOT_VAR)

How to change fsl_sub's configuration for all jobs you run

Some of the operation of fsl_sub can be configured such that all runs will enable/disable features. To configure fsl_sub create a file ~/.fsl_sub.yml and add the configuration to this file - it is in YAML format. To see what the current configuration is use:

fsl_sub --show_config

Take care: the system configuration has been set up to be optimal for the cluster, and changing these settings may cause your jobs to fail.

FSL_SUB.YML SECTIONS

TOP LEVEL

These options control the basic operation of fsl_sub and are keys in a YAML dictionary. To change a setting add 'keyname: value' to your file with no indent.

method
    Default:  'shell', 'slurm' (or 'sge')
    Purpose:  Define whether to use the cluster ('slurm') or run things without a cluster ('shell')
    Options:  'shell' or the name of an installed plugin, e.g. 'slurm'

ram_units
    Default:  'G'
    Purpose:  When -R is specified, what are the units
    Options:  'K', 'M', 'G', 'T', 'P'(!); changing this is not recommended

modulecmd
    Default:  False
    Purpose:  Where 'modulecmd' is not findable via PATH, where is the program
    Options:  Path to modulecmd

export_vars
    Default:  Empty list = []
    Purpose:  List of environment variables (with optional values) to always pass to jobs running on
              the cluster. The list you provide will be added to the default list
    Options:  [SUBJECTSDIR, "MYVARIABLE=MYVALUE"]; the list can also be specified by starting a new
              line and adding items as '  - SUBJECTSDIR' (note the two spaces before the '-') on
              separate lines

thread_control
    Default:  ['OMP_NUM_THREADS', 'MKL_NUM_THREADS', 'MKL_DOMAIN_NUM_THREADS', 'OPENBLAS_NUM_THREADS', 'GOTO_NUM_THREADS']
    Purpose:  Environment variables to set to ensure threads are limited to those requested by a
              parallel environment. Any values you configure will be added to the default list
    Options:  Names of environment variables

method_opts
    Default:  {}
    Purpose:  Control the method that runs your job
    Options:  See below

coproc_opts
    Default:  {}
    Purpose:  Control the coprocessor options
    Options:  Should not be changed

queues
    Default:  {}
    Purpose:  Control the queues
    Options:  Must not be changed

METHOD_OPTS

These control how the shell and cluster (sge/slurm) job runners operate. Most of these should not be changed, but some useful ones include:
method_opts:
  shell:
    parallel_disable_matches:
      - "*_string"

parallel_disable_matches enables you to specify portions of a command name that should never be run in parallel when submitted as an array task running with the shell backend. The default list contains '*_gpu' which ensures that the FSL GPU enabled tools do not attempt to start up in parallel, as they are likely to be unable to access multiple GPUs. fsl_sub supports matching a full program name, a full path to a program, and *<name> and <name>* to match the end or start of a program name respectively.

method_opts:
  slurm:
    keep_jobscript: True|False

or for the legacy Jalapeno cluster:

method_opts:
  sge:
    keep_jobscript: True|False

When the cluster backends submit your job they generate a submission script; the keep_jobscript option will leave a copy of this script in the current folder for reference or for later reuse.

You can also control this on a job-by-job basis with the --keep_jobscript option, but where tasks don't allow this (e.g. FEAT) you can control it here.