FMRIB hosts a small-scale HPC facility with interactive, visualisation-capable servers - instructions on suitability and usage

Introduction

The FMRIB IT department provide several interactive servers and a compute farm for you to use for your analysis. The machines and their uses are detailed below. 

High-level overview

Interactive and cluster submission computers

Jalapeno.fmrib.ox.ac.uk/jalapeno.cluster.fmrib.ox.ac.uk (internal networks)

This is our general purpose Linux computer. It accepts SSH connections from any University network; from outside the University, connect to the VPN first.

Appropriate uses:

  • Submitting jobs to the grid engine computing cluster
  • Looking at analysis results, e.g. FEAT reports, FSLView 
  • Graphical desktop sessions using VNC
    • The number of connections is limited, so please close any session that you will not need for more than a day or two.

Jalapeno must not be used for computationally intensive tasks - use the cluster or another machine. For example, we do not allow (and actively prevent) the use of MATLAB on this machine.

As this machine is our public-facing computer it needs to be kept up-to-date with security updates, so it will be rebooted regularly, whenever the updates require it. The maintenance window is the second Tuesday of the month, 9-10am, unless a pressing security need requires more urgent action. The machine should be considered at risk during this period, but we aim to advise in advance of any required reboot.

Jalapeno00.cluster.fmrib.ox.ac.uk

This is our medium-memory (128GB) general purpose computer. Access to this machine from outside of the FMRIB network is via VPN or via an SSH connection through jalapeno.

Appropriate uses:

  • Interactive MATLAB sessions: 

    • Where possible MATLAB processing should be done on the cluster using the interactive queues or, preferably, the work should be scripted such that you can submit to the normal queues. However, where you have an interactive task that is likely to take a long time you may use this machine. 

    • This is a shared computer, so anything requiring a large amount of memory (>8GB) should be run elsewhere (interactive queues, or scripted/compiled and using the normal queues). We would also recommend that any long running process should take steps to save intermediate results - we cannot guarantee 100% up time from any computer.

  • Queue Submission

    • Normally you would use Jalapeno, but submission is also possible from this machine (for example, perhaps from a workflow that involves MATLAB). 

  • VNC sessions

    • If you have a long running process that can't run on the cluster, use this machine as it will need to be restarted much less regularly than jalapeno.

As this machine is not public facing, updates will be installed on a less aggressive schedule. Consequently, the machine will need to be rebooted much less frequently than Jalapeno.
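The advice above about saving intermediate results can be sketched as follows. This is a minimal illustration only, assuming the work can be split into independent steps and that the state fits in a small JSON file; the function names, the state layout and the placeholder computation are all hypothetical and not part of any FMRIB tooling.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write `state` (a JSON-serialisable dict) to `path` atomically."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file first so that a crash mid-write
    # never leaves a half-written checkpoint behind.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)


def load_checkpoint(path, default):
    """Return the saved state, or `default` if no checkpoint exists yet."""
    if not os.path.exists(path):
        return default
    with open(path) as handle:
        return json.load(handle)


def run(items, checkpoint_path):
    """Process `items` in order, resuming from the last saved position."""
    state = load_checkpoint(checkpoint_path, {"next_index": 0, "total": 0})
    for index in range(state["next_index"], len(items)):
        state["total"] += items[index] ** 2  # placeholder for the real work
        state["next_index"] = index + 1
        save_checkpoint(checkpoint_path, state)
    return state["total"]
```

If the machine is rebooted part-way through, calling `run` again with the same checkpoint path continues from the first unprocessed item rather than starting over.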

Jalapeno18.cluster.fmrib.ox.ac.uk

This is our high-memory (1.25TB) general purpose computer. Access to this machine from outside of the FMRIB network is via VPN or via an SSH connection through jalapeno.

Appropriate uses:

  • Interactive MATLAB sessions: 

    • Where possible MATLAB processing should be done on the cluster using the interactive queues or, preferably, the work should be scripted such that you can submit to the normal queues. However, where you have an interactive task that is likely to take a long time you may use this machine. 

    • This is a shared computer, provided for large memory tasks. Please use jalapeno00 for more general interactive compute tasks. We would also recommend that any long running process should take steps to save intermediate results - we cannot guarantee 100% up time from any computer. 

  • Queue Submission

    • Normally you would use Jalapeno, but submission is also possible from this machine (for example, perhaps from a workflow that involves MATLAB). 

  • VNC sessions

    • If you have a long running process that can't run on the cluster, use this machine as it will need to be restarted much less regularly than jalapeno.

As this machine is not public facing, updates will be installed on a less aggressive schedule. Consequently, the machine will need to be rebooted much less frequently than Jalapeno.
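Reaching jalapeno00 or jalapeno18 "via jalapeno" can be sketched as building an SSH command that hops through the public-facing machine. This is an illustration only: it assumes your client is a reasonably recent OpenSSH (the `-J` jump-host option), that your account is permitted to hop in this way, and `myuser` is a placeholder username.

```python
import shlex

# Public-facing host named in this document, used here as the jump host.
GATEWAY = "jalapeno.fmrib.ox.ac.uk"


def ssh_command(username, target, via_gateway=True):
    """Return the argument list for an SSH connection to `target`.

    With via_gateway=True the connection hops through jalapeno using
    OpenSSH's -J option; with False it assumes you are already on the
    FMRIB network or the VPN and can reach `target` directly.
    """
    command = ["ssh"]
    if via_gateway:
        command += ["-J", f"{username}@{GATEWAY}"]
    command.append(f"{username}@{target}")
    return command


# Print the command rather than running it, so it can be inspected first.
print(shlex.join(ssh_command("myuser", "jalapeno18.cluster.fmrib.ox.ac.uk")))
```

The printed command can be pasted into a terminal, or the list can be passed to `subprocess.run` once you are happy with it.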

Cuda03.cluster.fmrib.ox.ac.uk

If your interactive task requires access to NVIDIA CUDA capable hardware then the computer cuda03.cluster.fmrib.ox.ac.uk provides access to two K80 class GPUs. These are provided on a first-come first-served basis.

Interactive MATLAB

Wherever possible MATLAB should be run on the cluster. Where the task can only be run in an interactive fashion, we provide interactive queues. Details are available in the queue submission documentation.

How to submit jobs to the FMRIB cluster and how to monitor them

Introduction

WIN operate a compute cluster formed from rack-mounted multi-core computers. To ensure efficient use of the hardware, tasks are distributed amongst these computers using grid scheduling software. This software monitors the utilisation of the computers in the cluster, launching new jobs onto the least used computers, preventing overloading of machines whilst ensuring a fair share of compute resources amongst all users of the system. When you submit a job it will sit in a queue until such time as the scheduler software identifies a viable empty slot and your job has reached the top of the queue. The fair share algorithm in use ensures that the jobs of heavy users of the system are less likely to reach the top than those of users who rarely use the system (the usage record is cleared on a regular basis so that you aren't deprioritised forever).
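The fair share idea can be illustrated with a toy model. This is not Grid Engine's actual algorithm, only a sketch of the behaviour described above: the job whose owner has the least recent usage is picked first, and the usage record is cleared periodically. All names and figures are hypothetical.

```python
def pick_next_job(queue, recent_usage):
    """Pick the next job to start from `queue`.

    `queue` is a list of (user, job_name) tuples in submission order and
    `recent_usage` maps each user to their recent compute hours. The job
    whose owner has used the least is chosen; ties go to the job that was
    submitted first, because min() returns the first minimal element.
    """
    return min(queue, key=lambda job: recent_usage.get(job[0], 0.0))


def clear_usage(recent_usage):
    """Model the periodic reset: every user's recorded usage returns to zero."""
    return {user: 0.0 for user in recent_usage}


queue = [("alice", "a1"), ("bob", "b1"), ("carol", "c1")]
usage = {"alice": 120.0, "bob": 3.0}  # carol has no recorded usage

first = pick_next_job(queue, usage)                      # carol's job
after_reset = pick_next_job(queue, clear_usage(usage))   # alice's job
```

Before the reset, carol's job starts first because she has no recorded usage; after the reset everyone is level, so the earliest submission (alice's) wins.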

Grid Engine and the queues

WIN's cluster runs the Grid Engine (GE) queuing software (using the Son of Grid Engine distribution). To ease job submission we provide a helper called fsl_sub which sets some useful options to Grid Engine's built-in qsub command.
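As a rough sketch of what a submission through fsl_sub might look like when driven from a script, the helper below assembles an fsl_sub command line. The `-q` (queue), `-N` (job name) and `-l` (log directory) options are standard fsl_sub options, but the queue name `long.q`, the script `./my_analysis.sh` and its argument are placeholders only; check the queue documentation for the names that apply on this cluster.

```python
import shlex


def fsl_sub_command(command, queue=None, job_name=None, log_dir=None):
    """Return the argument list for submitting `command` via fsl_sub.

    `command` is the program and its arguments as a list of strings.
    Options left as None are omitted so that fsl_sub's defaults apply.
    """
    args = ["fsl_sub"]
    if queue:
        args += ["-q", queue]
    if job_name:
        args += ["-N", job_name]
    if log_dir:
        args += ["-l", log_dir]
    return args + list(command)


# Placeholder values throughout; nothing is submitted by this script.
submission = fsl_sub_command(
    ["./my_analysis.sh", "subject01"],
    queue="long.q",
    job_name="my_analysis",
    log_dir="logs",
)
print(shlex.join(submission))
```

Printing the command first makes it easy to check before anything reaches the scheduler; on jalapeno the same list could then be passed to `subprocess.run(submission, check=True)`.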

GE manages a set of queues representing the available resources. Tasks are submitted to GE queues for distribution across the execution hosts. These queues are designed to divide the resources according to usage profiles to ensure that the majority of tasks get done in a favourable time-frame (see Jalapeno Queues).

The Jalapeno Cluster

The jalapeno cluster consists of a main user-accessible computer (sometimes called a Head Node), 'jalapeno', and a farm of processing nodes which are not directly visible. 'Jalapeno' itself should only be used to submit tasks and view results. All other tasks must be run via the job submission system; in this way jobs get shared among the available processing nodes.
Any non-trivial jobs found running on Jalapeno will be killed without warning if they impact on others' use of the computer.

Summary of available interactive servers

Server name | Purpose | Restrictions
jalapeno.fmrib.ox.ac.uk / jalapeno.cluster.fmrib.ox.ac.uk (internal connections) | Interactive SSH/VNC/X11 sessions, queue submission | No long running or high memory tasks - these MUST be run on the cluster. This includes MATLAB, which should be run in a batch process on the cluster, on a desktop machine or on jalapeno00/18.
jalapeno00.cluster.fmrib.ox.ac.uk | Long running interactive sessions, interactive MATLAB, other interactive compute tasks | -
jalapeno18.cluster.fmrib.ox.ac.uk | Large memory requirement, long running interactive sessions, interactive MATLAB, other interactive compute tasks | -
jalapeno01-11,19-23.cluster.fmrib.ox.ac.uk | Compute cluster | No direct logins allowed, jobs should be submitted to the queue system
cuda03.cluster.fmrib.ox.ac.uk | Interactive CUDA development (2 x K80 class GPUs) | -
cuda01-05.cluster.fmrib.ox.ac.uk | Compute cluster (CUDA) | No direct logins allowed, jobs should be submitted to the queue system