Fair-Share
How the scheduler manages a fair share of resources amongst all users
To prevent a single user from using all available resources on the cluster, we enforce a range of 'Quality of Service' (QOS) rules. The current rules came into force on 14/5/2025 and are detailed below:
| Partition | QOS Limits across partition |
| --- | --- |
| short | |
| long | |
| gpu_short | |
| gpu_long | |
| Interactive Jobs | |
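If you want to see the exact limits currently enforced, a standard Slurm installation exposes them through `scontrol` and `sacctmgr`. The sketch below is illustrative rather than definitive: it assumes the QOS limits are attached to the partitions named above, and the fields shown will depend on how the cluster is configured.

```bash
# Show a partition's configuration, including any QOS attached to it
# (replace "short" with any partition from the table above).
scontrol show partition short

# List the per-user limits defined for each QOS:
# MaxTRESPU  - maximum trackable resources (CPUs, memory, GPUs) per user
# MaxJobsPU  - maximum running jobs per user
# MaxWall    - maximum wall time per job
sacctmgr show qos format=Name,MaxTRESPU,MaxJobsPU,MaxWall
```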
At the discretion of IT staff, some of these rules can be relaxed on a per-job basis; where this is done, the affected jobs will be restricted to running one at a time.
As the CPU-only batch nodes all service the `short` and `long` partitions, there is an overall limit on the number of threads available across each partition. This is currently set as follows:
- Short: 72 threads per device (288 threads total)
- Long: 36 threads per device (144 threads total)
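Anything a job requests counts towards these limits, so it is worth being explicit about the threads you actually need. Below is a minimal sketch of a `short`-partition batch script; the resource figures, time limit and executable name are placeholders, not recommended values.

```bash
#!/bin/bash
# Minimal submission script for the short partition (illustrative only).
#SBATCH --partition=short
#SBATCH --cpus-per-task=8      # CPU threads requested; counts towards the partition thread limit
#SBATCH --mem=16G              # memory request
#SBATCH --time=01:00:00        # wall-time limit

srun ./my_program              # replace with your own executable
```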
Your job's position in the queue depends on several factors, which you can inspect using the commands shown after this list:
- How many jobs you have run in the last three days
- The CPU threads, memory and GPUs you have requested, weighted in the ratio 1:2:1
- How long you have been waiting
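A short sketch of those commands, assuming the standard Slurm multifactor priority plugin is in use:

```bash
# Break a pending job's priority down into its individual factors
# (age, fair-share, partition, QOS, TRES) for your own jobs.
sprio -u "$USER" -l

# Show your recent recorded usage and the fair-share value derived from it.
sshare -u "$USER" -l
```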
Interactive jobs have a separate usage history from queued jobs, so submitting lots of batch jobs, or running interactive sessions regularly or for long periods, will not affect your priority for jobs of the other type.