...

#!/bin/bash
#SBATCH --job-name=parallel_job_test # Job name
#SBATCH --mail-type=END,FAIL         # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=email@camh.ca    # Where to send mail	
#SBATCH --nodes=1                    # Run all processes on a single node	
#SBATCH --ntasks=4                   # Number of processes
#SBATCH --mem=1gb                    # Total memory limit
#SBATCH --time=01:00:00              # Time limit hrs:min:sec
#SBATCH --output=multiprocess_%j.log # Standard output and error log
date;hostname;pwd

module load lang/Python

python /opt/scc/generic/examples/SLURM/python_openmp.py

date
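
To run this example yourself, save the script above to a file (the name below is only a placeholder) and submit it with sbatch; squeue shows the job while it is pending or running, and sacct summarizes it after it finishes:

sbatch parallel_job.sl                                 # submit the job script shown above
squeue -u $USER                                        # list your pending and running jobs
sacct -j <jobid> --format=JobID,State,Elapsed,MaxRSS   # resource summary once the job ends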

Message Passing Interface (MPI) Jobs

PMIx Versions

When launching applications linked against our OpenMPI libraries via srun, you must specify the correct version of PMIx using the "--mpi" srun option. Generally speaking, you can determine the appropriate PMIx version by running the ompi_info command after loading the desired OpenMPI environment module. For example:

$ module load bio/NEURON/7.6.7_Python-2.7.17
$ ompi_info --param pmix all
                MCA pmix: flux (MCA v2.1.0, API v2.0.0, Component v3.1.4)
                MCA pmix: isolated (MCA v2.1.0, API v2.0.0, Component v3.1.4)
                MCA pmix: pmix2x (MCA v2.1.0, API v2.0.0, Component v3.1.4)

$ ml purge
$ module load bio/NEURON/7.8.2_LFPy-2.2_Python-3.8.5
$ ompi_info --param pmix all
                MCA pmix: flux (MCA v2.1.0, API v2.0.0, Component v4.1.0)
                MCA pmix: isolated (MCA v2.1.0, API v2.0.0, Component v4.1.0)
                MCA pmix: pmix3x (MCA v2.1.0, API v2.0.0, Component v4.1.0)

In the examples above, you would specify pmix_v2 (i.e. ext2x) for the bio/NEURON/7.6.7_Python-2.7.17 module and pmix_v3 (i.e. ext3x) for the second set of modules.
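
As a sketch of the corresponding srun syntax (the binary name here is only a placeholder), the launch line inside a job script would then look like:

srun --mpi=pmix_v3 ./my_mpi_program    # use pmix_v2 instead if ompi_info reports the v2 components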

The following example requests 24 tasks, each with a single core. It further specifies that these should be split evenly across 2 nodes, and that within each node the 12 tasks should be split evenly between the two sockets, so each socket on the two nodes will run 6 tasks, each with its own dedicated core. The --distribution option ensures that tasks are assigned cyclically among the allocated nodes and sockets. Please see the SchedMD sbatch documentation for more detailed explanations of each of the sbatch options below.

SLURM is very flexible and allows users to be very specific about their resource requests. Thinking about how your application works and doing some testing are important for determining the best set of resources for your specific job.


cat mpi_mpirun.sh

#!/bin/bash
#SBATCH --job-name=mpi_job_test # Job name
#SBATCH --mail-type=END,FAIL # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=email@camh.ca # Where to send mail. Set this to your email address
#SBATCH --ntasks=24 # Number of MPI tasks (i.e. processes)
#SBATCH --cpus-per-task=1 # Number of cores per MPI task
#SBATCH --nodes=2 # Maximum number of nodes to be allocated
#SBATCH --ntasks-per-node=12 # Maximum number of tasks on each node
#SBATCH --ntasks-per-socket=6 # Maximum number of tasks on each socket
#SBATCH --distribution=cyclic:cyclic # Distribute tasks cyclically first among nodes and then among sockets within a node
#SBATCH --mem-per-cpu=600mb # Memory (i.e. RAM) per processor
#SBATCH --time=00:05:00 # Wall time limit (days-hrs:min:sec)
#SBATCH --output=mpi_test_%j.log # Path to the standard output and error files relative to the working directory

echo "Date = $(date)"
echo "Hostname = $(hostname -s)"
echo "Working Directory = $(pwd)"
echo ""
echo "Number of Nodes Allocated = $SLURM_JOB_NUM_NODES"
echo "Number of Tasks Allocated = $SLURM_NTASKS"
echo "Number of Cores/Task Allocated = $SLURM_CPUS_PER_TASK"

module load bio/NEURON/7.8.2_LFPy-2.2_Python-3.8.5
mpirun /opt/scc/generic/examples/SLURM/helloWorldMPI
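
In line with the advice above about testing your resource requests, sacct can be used after a run like this to check what the job actually consumed (the field list is illustrative, not exhaustive):

sacct -j <jobid> --format=JobID,JobName,NNodes,NTasks,Elapsed,MaxRSS,State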

Array job

cat array_job.sl

#!/bin/bash
#SBATCH --job-name=array_job_test   # Job name
#SBATCH --mail-type=FAIL            # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=email@camh.ca   # Where to send mail
#SBATCH --ntasks=1                  # Run a single task
#SBATCH --mem=1gb                   # Job Memory
#SBATCH --time=00:05:00             # Time limit hrs:min:sec
#SBATCH --output=array_%A-%a.log    # Standard output and error log
#SBATCH --array=1-5                 # Array range
pwd; hostname; date

echo This is task $SLURM_ARRAY_TASK_ID
date

Note the use of %A for the master job ID of the array and %a for the task ID in the output file name.
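
A common pattern is to use the task ID to select a different input for each array task. A minimal sketch, assuming a hypothetical plain-text list inputs.txt with one file name per line, would add something like this to the script body:

# Pick the line of inputs.txt matching this task's index (1-5 in the example above)
INPUT=$(sed -n "${SLURM_ARRAY_TASK_ID}p" inputs.txt)
echo "Task $SLURM_ARRAY_TASK_ID processing $INPUT"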

GPU job

cat gputest.sl

#!/bin/bash
#SBATCH --job-name=gputest           # Job name
#SBATCH --output=gputest.out         # Standard output log
#SBATCH --error=gputest.err          # Standard error log
#SBATCH --mail-type=ALL              # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=email@camh.ca    # Where to send mail. Set this to your email address
#SBATCH --nodes=1                    # Run all processes on a single node
#SBATCH --ntasks=8                   # Number of processes
#SBATCH --cpus-per-task=1            # Number of cores per task
#SBATCH --ntasks-per-node=8          # Maximum number of tasks on each node
#SBATCH --distribution=cyclic:cyclic # Distribute tasks cyclically among nodes and sockets
#SBATCH --mem-per-cpu=7000mb         # Memory (i.e. RAM) per processor
#SBATCH --time=00:30:00              # Wall time limit (hrs:min:sec)

module purge
module load system/CUDA/10.2.89
nvidia-smi

Run with:

sbatch -p gpu /opt/scc/generic/examples/SLURM/gputest.sl
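
Depending on how the GPU partition is configured, you may also need to request the GPUs themselves inside the script rather than relying on the partition alone; a hedged sketch using Slurm's generic-resource option would be to add:

#SBATCH --partition=gpu    # submit to the GPU partition from within the script
#SBATCH --gres=gpu:1       # request one GPU for the job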






...