Submit Jobs

Here is what a typical submission of a shell script looks like from the command line:

qsub -cwd -pe smp 4 -l mem_free=2G -l scratch=50G -l h_rt=00:20:00

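This command submits the job to the scheduler, which will eventually launch it on one of the compute nodes that can meet the job's resource needs. The options are explained in the sections below, but in summary, the above requests that the job run in the current working directory (-cwd) with four slots on a single machine (-pe smp 4), 2 GiB of RAM per slot (-l mem_free=2G), 50 GiB of local scratch space (-l scratch=50G), and a maximum run time of 20 minutes (-l h_rt=00:20:00).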

Submit a script to run in the current working directory

To submit a shell script to the scheduler such that it will run in the current working directory (-cwd), use:

qsub -cwd

The scheduler will assign your job a unique (numeric) job ID.
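As a minimal sketch of what such a script might contain, the following hypothetical job script embeds -cwd as an SGE directive and reports where it is running. JOB_ID is set by the scheduler. The variable name job_id and the fallback text are assumptions, added only so that the script also runs when executed by hand:

```shell
#!/usr/bin/env bash
#$ -cwd   ## SGE directive: run in the current working directory

# JOB_ID is set by the scheduler; fall back to a label when run by hand.
job_id="${JOB_ID:-none (not launched by the scheduler)}"

echo "Job ID:            $job_id"
echo "Host:              ${HOSTNAME:-unknown}"
echo "Working directory: $PWD"
```

With the directive in the script, the -cwd option does not have to be repeated on the qsub command line.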

Specifying (maximum) memory usage

Unless specified otherwise, the maximum amount of memory a job may use at any time is 1 GiB per slot (-l mem_free=1G). A job that needs more memory must request it when submitted. For example, a job that needs (at most) 10 GiB of memory should be submitted as:

qsub -cwd -l mem_free=10G

The scheduler will launch this job on the first available compute node with that amount of memory available.

TIP: Add qstat -j $JOB_ID to the end of your script to find out how much memory and CPU time your job needed. See the Job Summary page for more details.
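The tip above might look as follows in practice. This is only a sketch: the helper name print_job_summary and the guard around qstat are assumptions, added so that the script does not fail when it is run outside the scheduler, where JOB_ID is unset:

```shell
#!/usr/bin/env bash
#$ -cwd               ## run in the current working directory
#$ -l mem_free=10G    ## request 10 GiB of memory per slot

# Print the scheduler's summary for this job, if there is one.
print_job_summary() {
  if [[ -n "${JOB_ID:-}" ]] && command -v qstat > /dev/null 2>&1; then
    qstat -j "$JOB_ID"
  else
    echo "No job summary: not running under the scheduler"
  fi
}

# ... the actual work of the job goes here ...
echo "Doing the work"

# Last step of the script: report memory and CPU usage.
print_job_summary
```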

Specifying (maximum) run time

By specifying how long each job will take, you help the scheduler manage resources and allocate jobs to different nodes. This also decreases the average time a job waits in the queue before being launched on a compute node. You can specify the maximum run time (= wall time, not CPU time) for a job using the option -l h_rt=HH:MM:SS, where HH:MM:SS specifies the number of hours (HH), the number of minutes (MM), and the number of seconds (SS). All parts must be specified. For instance, the following job is expected to run for at most 3 minutes (180 seconds):

qsub -cwd -l mem_free=2G -l h_rt=00:03:00
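Because all three parts of HH:MM:SS must be given, it can be convenient to compute the string from a number of seconds. The helper below is a small sketch of that arithmetic; the function name to_hms is hypothetical and not part of SGE:

```shell
#!/usr/bin/env bash

# Convert a number of seconds to the HH:MM:SS form used by -l h_rt=...
to_hms() {
  local secs=$1
  printf '%02d:%02d:%02d\n' $((secs / 3600)) $((secs % 3600 / 60)) $((secs % 60))
}

to_hms 180     # prints 00:03:00
to_hms 5400    # prints 01:30:00
```

The result could then be used on the command line as, for example, -l h_rt="$(to_hms 180)".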

Using local scratch storage

Each compute node has 0.1-1.8 TiB of local scratch storage, which is fast and ideal for temporary, intermediate data files that are only needed for the length of a job. This scratch storage is unique to each machine and shared among all users and jobs running on the same machine. To minimize the risk of launching a job on a node that has little scratch space left, specify the -l scratch=size resource. For instance, if your job requires 200 GiB of local /scratch space, submit the job using:

qsub -cwd -l scratch=200G

Your job is only guaranteed the amount of available scratch space that you request when it is launched. For more information and best practices, see Using Local /scratch on Compute Nodes.
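Inside the job script, one way to use the local scratch space is to create a private, job-specific directory and remove it when the job ends. The following is a sketch only: it assumes the local scratch space is mounted at /scratch as described above, the helper name make_job_scratch and the directory naming are hypothetical, and it falls back to TMPDIR or /tmp so that it can be tried outside a compute node:

```shell
#!/usr/bin/env bash
#$ -cwd
#$ -l scratch=200G    ## request 200 GiB of local scratch

# Create a private working directory for this job (hypothetical helper).
make_job_scratch() {
  local root="/scratch"
  # Fall back to TMPDIR or /tmp when /scratch is absent or not writable.
  [[ -d "$root" && -w "$root" ]] || root="${TMPDIR:-/tmp}"
  mktemp -d "$root/job_${JOB_ID:-manual}_XXXXXX"
}

workdir="$(make_job_scratch)"
trap 'rm -rf "$workdir"' EXIT   # clean up on exit, since scratch is shared

echo "intermediate data" > "$workdir/intermediate.txt"
# ... process the intermediate files, then copy final results elsewhere ...
```

Cleaning up in an EXIT trap means the directory is also removed if the script stops early because of an error.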

If your job would benefit from extra-fast local scratch storage, you can request a node with either an SSD or an NVMe scratch drive via the following flag:

qsub -l ssd_scratch=1

Parallel processing (on a single machine)

By default, the scheduler allocates a single core (slot) for your job. To allow the job to use multiple slots, request the number of slots needed when you submit the job. For instance, to request four slots (NSLOTS=4), each with 2 GiB of RAM, for a total of 8 GiB of RAM, use:

qsub -pe smp 4 -l mem_free=2G

The scheduler will make sure your job is launched on a node with at least four slots available.

Note: when writing your script, use the SGE environment variable NSLOTS, which is set to the number of cores that your job was allocated. This way you don't have to update your script if you request a different number of cores. For instance, if your script runs the BWA alignment, have it specify the number of parallel threads as:

bwa aln -t $NSLOTS ...

Comment: PE stands for ‘Parallel environment’. SMP stands for ‘Symmetric multiprocessing’ and indicates that the job will run on a single machine using one or more cores.
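Putting the pieces of this section together, a job script might look like the sketch below. The variable names threads, mem_per_slot_gib, and total_mem_gib are illustrative, and the default of one thread is an assumption added so that the script also runs outside the scheduler, where NSLOTS is not set:

```shell
#!/usr/bin/env bash
#$ -cwd
#$ -pe smp 4          ## four slots on a single machine
#$ -l mem_free=2G     ## 2 GiB of RAM per slot

# NSLOTS is set by SGE; default to 1 when run outside the scheduler.
threads="${NSLOTS:-1}"
mem_per_slot_gib=2    # must match the mem_free request above
total_mem_gib=$(( threads * mem_per_slot_gib ))

echo "Using $threads thread(s) and up to $total_mem_gib GiB of RAM in total"
# e.g. bwa aln -t "$threads" ...
```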

Minimum network speed (1 Gbps, 10 Gbps, 40 Gbps)

The majority of the compute nodes have 1 Gbps or 10 Gbps network cards, while a few have 40 Gbps cards. A job that requires a network speed of 10 Gbps or more can request this by specifying the eth_speed=10 resource, e.g.

qsub -cwd -l eth_speed=10

A job requesting eth_speed=40 will end up on a 40 Gbps node, and a job requesting eth_speed=1 (default) will end up on any node.

Passing arguments to script

You can pass arguments to a job script in the same way that you pass arguments to a script executed on the command line, e.g.

qsub -cwd -l mem_free=1G script.sh --first=2 --second=true --third='"some value"' --debug

Arguments are then passed as if you called the script as --first=2 --second=true --third="some value" --debug. Note that you need an extra layer of single quotes around "some value"; otherwise the script will see --third=some value as two independent arguments (--third=some and value).
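On the receiving end, the script has to parse these arguments itself. The sketch below shows one common way to do that in Bash. The option names are taken from the example above, while the function name parse_args and the error handling are assumptions:

```shell
#!/usr/bin/env bash
#$ -cwd

# Parse --name=value options and the --debug flag.
parse_args() {
  first="" second="" third="" debug=false
  local arg
  for arg in "$@"; do
    case "$arg" in
      --first=*)  first="${arg#--first=}" ;;
      --second=*) second="${arg#--second=}" ;;
      --third=*)  third="${arg#--third=}" ;;
      --debug)    debug=true ;;
      *) echo "Unknown argument: $arg" >&2; return 1 ;;
    esac
  done
}

parse_args "$@"
echo "first=$first second=$second third=$third debug=$debug"
```

Quoting "$@" keeps a value such as some value together as a single argument once it has reached the script.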

Interactive jobs

It is currently not possible to request interactive jobs (aka qlogin). Instead, there are dedicated development nodes that can be used for short-term interactive development needs, such as building software and prototyping scripts before submitting them to the scheduler.

MPI: Parallel processing via Hybrid MPI (multi-threaded multi-node MPI jobs)

Wynton provides a special MPI parallel environment (PE) called mpi-8 that allocates exactly eight (8) slots per node across one or more compute nodes. For instance, to request a Hybrid MPI job with a total of forty slots (NSLOTS=40), submit it as:

qsub -pe mpi-8 40

and make sure that the script exports OMP_NUM_THREADS=8 (the eight slots per node) and then launches the MPI application using mpirun -np $NHOSTS /path/to/the_app, where NHOSTS is automatically set by SGE (here NHOSTS=5):

#! /usr/bin/env bash
#$ -cwd   ## SGE directive to run in the current working directory

export OMP_NUM_THREADS=8   ## eight threads per node, one per slot
mpirun -np $NHOSTS /path/to/the_app

Comment: MPI stands for ‘Message Passing Interface’.
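The relationship between the requested slots and the number of hosts is simple arithmetic: with eight slots per node, forty slots span five nodes. The sketch below spells this out. The function name mpi8_hosts is hypothetical, and the check that the slot count is a multiple of eight is an assumption that follows from the "exactly eight slots per node" rule rather than something stated on this page:

```shell
#!/usr/bin/env bash

# Number of hosts an mpi-8 request spans: eight slots per host.
mpi8_hosts() {
  local slots=$1
  if (( slots <= 0 || slots % 8 != 0 )); then
    echo "Slot count must be a positive multiple of 8: $slots" >&2
    return 1
  fi
  echo $(( slots / 8 ))
}

mpi8_hosts 40   # prints 5, matching NHOSTS=5 above
```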

See also

For further options and advanced usage, see Advanced Usage of the scheduler.