Writing a SLURM Submission Script
A submission script is a key part of running jobs on an HPC cluster. The script has two parts:
- A description of the resources and properties of the job being submitted
- A payload that will be executed to get work done
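To make the two parts concrete, the sketch below writes a small example script to a temporary file and prints each part separately. The script contents and the variable names are illustrative assumptions; only the #SBATCH prefix convention comes from Slurm itself.

```shell
#!/bin/bash
# Write an illustrative submission script to a temporary file (hypothetical contents).
script="$(mktemp)"
cat > "$script" <<'EOF'
#!/bin/bash
#SBATCH -N 1
#SBATCH -n 2
echo "Hello World"
EOF

# Part 1: the resource and property description (lines starting with #SBATCH).
directives="$(grep '^#SBATCH' "$script")"

# Part 2: the payload (every line that is neither a comment nor blank).
payload="$(grep -v -e '^#' -e '^[[:space:]]*$' "$script")"

printf 'Directives:\n%s\n\nPayload:\n%s\n' "$directives" "$payload"
rm -f "$script"
```

Slurm stops reading #SBATCH lines at the first line that is neither a comment nor blank, so directives placed after the payload begins are ignored.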
A basic submission script
In the most basic case, all #SBATCH options could be omitted.
However, there are a few basic parts that should be included in nearly every job.
#!/bin/bash
#SBATCH -N 1 # (1)
#SBATCH -n 2 # (2)
#SBATCH --mem=8g # (3)
#SBATCH -J "Hello World Job" # (4)
#SBATCH -p short # (5)
#SBATCH -t 12:00:00 # (6)
echo "Hello World" # (7)
- Request 1 node for the job
- Request 2 tasks, which by default gives the job 2 CPU cores (one per task)
- Request 8 GiB of memory
- Use "Hello World Job" as the job title
- Use the "short" partition
- Set the maximum run time for the job (12 hours here). If the job has not completed when this limit is reached, it will be killed.
- Script content.
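The single-letter options above have long-form equivalents that can be easier to scan. The sketch below writes the same job to a file using the long forms and submits it only if sbatch is on the PATH. The file name hello_world.sbatch is an arbitrary choice for illustration, not something the cluster requires.

```shell
#!/bin/bash
# File name is an arbitrary, illustrative choice.
job_file="hello_world.sbatch"

# Same job as above, written with long-form option names.
cat > "$job_file" <<'EOF'
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=2
#SBATCH --mem=8g
#SBATCH --job-name="Hello World Job"
#SBATCH --partition=short
#SBATCH --time=12:00:00

echo "Hello World"
EOF

# Submit only where Slurm is available, so the sketch is harmless elsewhere.
if command -v sbatch >/dev/null 2>&1; then
    sbatch "$job_file"
else
    echo "sbatch not found; wrote $job_file without submitting"
fi
```

Unless the script or the site configuration says otherwise, Slurm writes the job's output to a file named slurm-<jobid>.out in the directory the job was submitted from.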
Submission script for GPU use
#!/bin/bash
#SBATCH -N 1 # (1)
#SBATCH -n 8 # (2)
#SBATCH --mem=8g # (3)
#SBATCH -J "Example Job" # (4)
#SBATCH -p short # (5)
#SBATCH -t 12:00:00 # (6)
#SBATCH --gres=gpu:2 # (7)
#SBATCH -C A100|V100 # (8)
module load python # (9)
module load cuda12.2/toolkit # (11)
python my_script_name.py # (10)
- Request 1 node for the job
- Request 8 tasks, which by default gives the job 8 CPU cores (one per task)
- Request 8 GiB of memory
- Use "Example Job" as the job title
- Use the "short" partition
- Set the maximum run time for the job (12 hours here). If the job has not completed when this limit is reached, it will be killed.
- Request 2 GPUs
- Optional: restrict the job to nodes with A100 or V100 GPUs.
- Load the latest stable version of the python module. For more information, see Software.
- Run your script here.
- Load a CUDA toolkit, providing access to the required GPU drivers.
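Before launching a long run, it can help to have the payload report what the job was actually given. The following is a minimal sketch, assuming an NVIDIA GPU node where nvidia-smi is installed; the function name report_allocation is hypothetical. SLURM_JOB_ID, SLURM_NTASKS, and CUDA_VISIBLE_DEVICES are environment variables that Slurm typically sets inside a GPU job, and the fallback values only appear when the snippet runs outside one.

```shell
#!/bin/bash
# Hypothetical helper: print a one-line summary of the allocation.
# The fallbacks after ":-" are used only when a variable is not set.
report_allocation() {
    echo "job=${SLURM_JOB_ID:-none} tasks=${SLURM_NTASKS:-unset} gpus=${CUDA_VISIBLE_DEVICES:-unset}"
}

report_allocation

# List the visible GPUs if the NVIDIA tool is present on this node.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -L
else
    echo "nvidia-smi not found; skipping GPU listing"
fi
```

Placed just before the python line in the payload, this records the allocation at the top of the job's output file, which makes a missing GPU or an unexpected task count easy to spot.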