Multi-Threaded Software
How to request a multi-threaded slot and how to ensure your software only uses the CPU cores it has been allocated
Running multi-threaded programs can cause significant problems for the cluster scheduling software if it is not made aware of the multiple threads: your job is allocated one slot but actually consumes many more, often ALL the CPUs, overloading the machine.
We support shared-memory multi-threaded software only (e.g. OpenMP, multi-threaded MKL, OpenBLAS etc.) and we attempt to limit the threads used automatically. We do not support multi-node parallel programs (OpenMPI), which typically require specialist hardware to operate optimally.
Where software would normally use all available cores you should investigate how to limit its threads (see the MATLAB guide for example).
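As an illustration of limiting threads yourself, the sketch below sets the thread-count environment variables read by OpenMP, MKL and OpenBLAS. The helper name cap_threads is hypothetical, and the thread count is passed in by hand here; how to read the count allocated to your job is described further down this page.

```python
import os

# Environment variables read by the OpenMP, MKL and OpenBLAS runtimes.
THREAD_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")


def cap_threads(n, env=None):
    """Set each thread-count variable to n in env (default: os.environ).

    cap_threads is a hypothetical helper written for this example.
    """
    if n < 1:
        raise ValueError("thread count must be at least 1")
    target = os.environ if env is None else env
    for name in THREAD_VARS:
        target[name] = str(n)
    return target
```

These variables are normally read when the numerical library is first loaded, so call cap_threads(2) before importing NumPy or similar packages. Software with its own thread settings (MATLAB, for example) needs to be configured separately.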
To submit an OpenMP job, use the -s (or --parallelenv) option to fsl_sub. For example:
fsl_sub -s openmp,2 <command or script>
"openmp" is the name of the parallel environment to use; "2" is the number of threads you wish to allocate to your job. The openmp parallel environment is enabled on most queues, see:
fsl_sub --help
for details.
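If you submit jobs from a Python script, the same command line can be assembled programmatically. This is only a sketch: the function name fsl_sub_command is hypothetical, and it uses nothing beyond the -s option shown above.

```python
import shlex


def fsl_sub_command(threads, command, parallel_env="openmp"):
    """Build the fsl_sub argument list for a multi-threaded job.

    fsl_sub_command is a hypothetical helper; it only reproduces the
    'fsl_sub -s <parallel env>,<threads> <command>' form shown above.
    """
    if threads < 1:
        raise ValueError("thread count must be at least 1")
    # shlex.split turns the command string into separate arguments.
    return ["fsl_sub", "-s", f"{parallel_env},{threads}"] + shlex.split(command)
```

The resulting list can be passed to subprocess.run(..., check=True) on a machine where fsl_sub is installed.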
The task running on the queue can determine how many slots it has been given via FSLSUB_NSLOTS. This variable does not hold the number itself; it holds the name of the environment variable that does. For example, in Bash the number of slots is equal to ${!FSLSUB_NSLOTS}.
In Python you would be able to get this figure with the following code:
import os
slots = os.environ[os.environ['FSLSUB_NSLOTS']]
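The lookup above raises KeyError when the script is run outside fsl_sub, and it returns a string rather than a number. A slightly more defensive version might look like the sketch below. The function name allocated_slots and the fallback of one slot are assumptions made for this example, not part of fsl_sub.

```python
import os


def allocated_slots(env=None, default=1):
    """Return the slot count as an int, or default if it cannot be read.

    allocated_slots is a hypothetical helper; falling back to a single
    slot is an assumption made for this example.
    """
    env = os.environ if env is None else env
    # FSLSUB_NSLOTS holds the NAME of the variable containing the count.
    var_name = env.get("FSLSUB_NSLOTS")
    if not var_name:
        return default
    try:
        slots = int(env.get(var_name))
    except (TypeError, ValueError):
        # The named variable is missing or is not an integer.
        return default
    return slots if slots >= 1 else default
```

Falling back to a single thread keeps a script from grabbing every core when it is run interactively or when the variable is missing.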
To provide these threads the cluster software needs to reserve slots on compute nodes, so please avoid requesting them on verylong.q: this can easily result in the queue being reserved for a very long time while it waits for sufficient slots to be freed.