...
#!/bin/bash
#SBATCH --job-name=parallel_job_test # Job name
#SBATCH --mail-type=END,FAIL # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=email@camh.ca # Where to send mail
#SBATCH --nodes=1 # Run all processes on a single node
#SBATCH --ntasks=4 # Number of processes
#SBATCH --mem=1gb # Total memory limit
#SBATCH --time=01:00:00 # Time limit hrs:min:sec
#SBATCH --output=multiprocess_%j.log # Standard output and error log
date;hostname;pwd
module load lang/Python
python /opt/scc/generic/examples/SLURM/python_openmp.py
date
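If the script above is saved as, say, python_openmp.sh (the filename here is only an illustration, not a fixed name on the cluster), it can be submitted and monitored as follows:

sbatch python_openmp.sh    # submit the job; sbatch prints the assigned job ID
squeue -u $USER            # check the state of your queued and running jobs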
Message Passing Interface (MPI) Jobs
PMIx Versions
When launching applications linked against our OpenMPI libraries via srun, you must specify the correct version of PMIx using the "--mpi" srun option. Generally speaking, you can determine the appropriate PMIx version to use by running the ompi_info command after loading the desired OpenMPI environment module. For example:

$ module load bio/NEURON/7.6.7_Python-2.7.17
$ ompi_info --param pmix all
    MCA pmix: flux (MCA v2.1.0, API v2.0.0, Component v3.1.4)
    MCA pmix: isolated (MCA v2.1.0, API v2.0.0, Component v3.1.4)
    MCA pmix: pmix2x (MCA v2.1.0, API v2.0.0, Component v3.1.4)

$ ml purge
$ module load bio/NEURON/7.8.2_LFPy-2.2_Python-3.8.5
$ ompi_info --param pmix all
    MCA pmix: flux (MCA v2.1.0, API v2.0.0, Component v4.1.0)
    MCA pmix: isolated (MCA v2.1.0, API v2.0.0, Component v4.1.0)
    MCA pmix: pmix3x (MCA v2.1.0, API v2.0.0, Component v4.1.0)

In the examples above, you would specify pmix_v2 (i.e. ext2x) for the bio/NEURON/7.6.7_Python-2.7.17 module and pmix_v3 (ext3x) for the second set of modules.
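As a minimal sketch of the corresponding srun invocation (the task count is only illustrative, and the binary reuses the helloWorldMPI example path from this page):

module load bio/NEURON/7.8.2_LFPy-2.2_Python-3.8.5
srun --mpi=pmix_v3 -n 24 /opt/scc/generic/examples/SLURM/helloWorldMPI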
The following example requests 24 tasks, each with a single core. It further specifies that these should be split evenly between 2 nodes, and that within each node the 12 tasks should be split evenly between the two sockets. So each socket on the two nodes will run 6 tasks, each with its own dedicated core. The --distribution option ensures that tasks are assigned cyclically among the allocated nodes and sockets. Please see the SchedMD sbatch documentation for more detailed explanations of each of the sbatch options below.
SLURM is very flexible and allows users to be very specific about their resource requests. Thinking about your application and doing some testing will help you determine the best set of resources for your specific job.
cat mpi_mpirun.sh
#!/bin/bash
#SBATCH --job-name=mpi_job_test # Job name
#SBATCH --mail-type=END,FAIL # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=email@camh.ca # Where to send mail. Set this to your email address
#SBATCH --ntasks=24 # Number of MPI tasks (i.e. processes)
#SBATCH --cpus-per-task=1 # Number of cores per MPI task
#SBATCH --nodes=2 # Maximum number of nodes to be allocated
#SBATCH --ntasks-per-node=12 # Maximum number of tasks on each node
#SBATCH --ntasks-per-socket=6 # Maximum number of tasks on each socket
#SBATCH --distribution=cyclic:cyclic # Distribute tasks cyclically first among nodes and then among sockets within a node
#SBATCH --mem-per-cpu=600mb # Memory (i.e. RAM) per processor
#SBATCH --time=00:05:00 # Wall time limit (days-hrs:min:sec)
#SBATCH --output=mpi_test_%j.log # Path to the standard output and error files relative to the working directory
echo "Date = $(date)"
echo "Hostname = $(hostname -s)"
echo "Working Directory = $(pwd)"
echo ""
echo "Number of Nodes Allocated = $SLURM_JOB_NUM_NODES"
echo "Number of Tasks Allocated = $SLURM_NTASKS"
echo "Number of Cores/Task Allocated = $SLURM_CPUS_PER_TASK"
module load bio/NEURON/7.8.2_LFPy-2.2_Python-3.8.5
mpirun /opt/scc/generic/examples/SLURM/helloWorldMPI
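The script is submitted in the usual way; all resource requests are read from the #SBATCH directives, although depending on the cluster configuration you may also need to select a partition with -p, as in the GPU example below:

sbatch mpi_mpirun.sh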
Array job
cat array_job.sl
#!/bin/bash
#SBATCH --job-name=array_job_test   # Job name
#SBATCH --mail-type=FAIL            # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=email@camh.ca   # Where to send mail
#SBATCH --ntasks=1                  # Run a single task
#SBATCH --mem=1gb                   # Job Memory
#SBATCH --time=00:05:00             # Time limit hrs:min:sec
#SBATCH --output=array_%A-%a.log    # Standard output and error log
#SBATCH --array=1-5                 # Array range
pwd; hostname; date
echo This is task $SLURM_ARRAY_TASK_ID
date
Note the use of %A for the master job ID of the array and %a for the task ID in the output filename.
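For example, if the array job above were assigned a master job ID of 12345 (a hypothetical value), the five tasks would produce log files named as follows:

sbatch array_job.sl
# Submitted batch job 12345   (hypothetical job ID)
# Resulting logs: array_12345-1.log, array_12345-2.log, ..., array_12345-5.log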
GPU job
cat gputest.sl
#!/bin/bash
#SBATCH --job-name=gputest
#SBATCH --output=gputest.out
#SBATCH --error=gputest.err
#SBATCH --mail-type=ALL
#SBATCH --mail-user=email@camh.ca
#SBATCH --nodes=1
#SBATCH --ntasks=8
#SBATCH --cpus-per-task=1
#SBATCH --ntasks-per-node=8
#SBATCH --distribution=cyclic:cyclic
#SBATCH --mem-per-cpu=7000mb
#SBATCH --time=00:30:00
module purge
module load system/CUDA/10.2.89
nvidia-smi
run with
sbatch -p gpu /opt/scc/generic/examples/SLURM/gputest.sl
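Because of the --output and --error directives in the script, the nvidia-smi output and any error messages land in the working directory once the job completes:

cat gputest.out   # nvidia-smi output from the allocated GPU node
cat gputest.err   # any errors (empty if the job ran cleanly)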
...