...
Message Passing Interface (MPI) Jobs
The following example requests 24 tasks, each with a single core. It further specifies that these should be split evenly across 2 nodes, and that within each node the 12 tasks should be split evenly across the two sockets. So each socket on the two nodes will run 6 tasks, each with its own dedicated core. The --distribution option ensures that tasks are assigned cyclically, first among the allocated nodes and then among the sockets within a node. Please see the SchedMD sbatch documentation for more detailed explanations of each of the sbatch options below.
SLURM is very flexible and allows users to be very specific about their resource requests. Thinking about your application and doing some test runs is important for determining the best set of resources for your specific job.
cat mpi_mpirun.sh
#!/bin/bash
#SBATCH --job-name=mpi_job_test # Job name
#SBATCH --mail-type=END,FAIL # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=email@camh.ca # Where to send mail. Set this to your email address
#SBATCH --ntasks=24 # Number of MPI tasks (i.e. processes)
#SBATCH --cpus-per-task=1 # Number of cores per MPI task
#SBATCH --nodes=2 # Maximum number of nodes to be allocated
#SBATCH --ntasks-per-node=12 # Maximum number of tasks on each node
#SBATCH --ntasks-per-socket=6 # Maximum number of tasks on each socket
#SBATCH --distribution=cyclic:cyclic # Distribute tasks cyclically first among nodes and then among sockets within a node
#SBATCH --mem-per-cpu=600mb # Memory (i.e. RAM) per processor
#SBATCH --time=00:05:00 # Wall time limit (days-hrs:min:sec)
#SBATCH --output=mpi_test_%j.log # Path to the standard output and error files relative to the working directory
...
module load bio/NEURON/7.8.2_LFPy-2.2_Python-3.8.5
mpirun /opt/scc/generic/examples/SLURM/helloWorldMPI
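To try the example and see how the resources were actually used, you can submit the script and then query SLURM's accounting records. The commands below are a minimal sketch using standard SLURM client tools (sbatch, squeue, sacct); the job ID 123456 is only a placeholder, substitute the ID printed by sbatch.
sbatch mpi_mpirun.sh # submit the job; prints "Submitted batch job <jobid>"
squeue -u $USER # confirm the job is pending or running
sacct -j 123456 --format=JobID,NNodes,NTasks,Elapsed,MaxRSS # after completion, review the resources actually used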
Array job
cat array_job.sl
#!/bin/bash
#SBATCH --job-name=array_job_test # Job name
#SBATCH --mail-type=FAIL # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=email@camh.ca # Where to send mail
#SBATCH --ntasks=1 # Run a single task
#SBATCH --mem=1gb # Job memory
#SBATCH --time=00:05:00 # Time limit hrs:min:sec
#SBATCH --output=array_%A-%a.log # Standard output and error log
#SBATCH --array=1-5 # Array range

pwd; hostname; date

echo This is task $SLURM_ARRAY_TASK_ID

date
Note the use of %A for the master job ID of the array and %a for the task ID in the output filename.
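Array indices are most often used to select a different input per task. The lines below are a minimal sketch of that pattern and would go in the body of an array job script; subject_list.txt is a hypothetical file with one input path per line, and sed picks the line whose number matches the task ID.
INPUT=$(sed -n "${SLURM_ARRAY_TASK_ID}p" subject_list.txt) # hypothetical input list; line N is handled by array task N
echo "Task $SLURM_ARRAY_TASK_ID will process $INPUT"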
GPU job
cat gputest.sl
#!/bin/bash
#SBATCH --job-name=gputest
#SBATCH --output=gputest.out
#SBATCH --error=gputest.err
#SBATCH --mail-type=ALL
#SBATCH --mail-user=email@camh.ca
#SBATCH --nodes=1
#SBATCH --ntasks=8
#SBATCH --cpus-per-task=1
#SBATCH --ntasks-per-node=8
#SBATCH --distribution=cyclic:cyclic
#SBATCH --mem-per-cpu=7000mb
#SBATCH --partition=gpu
#SBATCH --gpus=tesla:4
#SBATCH --time=00:30:00

module purge
module load system/CUDA/10.2.89
nvidia-smi
Run the script with:
sbatch -p gpu /opt/scc/generic/examples/SLURM/gputest.sl
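If the request succeeded, gputest.out should contain the nvidia-smi table listing the allocated Tesla cards. As an additional sanity check you can add lines like the following to a GPU job script; CUDA_VISIBLE_DEVICES is set by SLURM for jobs that request GPUs, and the nvidia-smi query shown is standard but assumes the NVIDIA driver tools are installed on the GPU node.
echo "GPUs visible to this job: $CUDA_VISIBLE_DEVICES" # device indices assigned by SLURM
nvidia-smi --query-gpu=name,memory.total --format=csv # one line per allocated GPU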
...