Using local/scratch disk for jobs
The compute nodes access the user's data files via NFS over the InfiniBand network.
When a job running on a compute node reads or writes those data, the I/O is not only slower and less reliable than local-disk I/O, it can also slow down or bring down other critical resources when used heavily.
For this reason, it is recommended (and strongly recommended for I/O-heavy runs) that you use local-disk I/O instead. Furthermore, we provide a tmpfs local scratch folder, which stores data in a portion of memory. In general, I/O-intensive tasks and programs that perform frequent read/write operations benefit from using this tmpfs folder.
Running on local disk is not as straightforward as running under NFS scratch and requires some care. At a minimum, you will need to copy the input files from the NFS disk to the local disk, cd there, run the job, and finally copy the output files back.
Job scratch directories
The local disk on all compute nodes is called /export/ramdisk. The /export/ramdisk on each compute node is a separate tmpfs disk: for example, files written to /export/ramdisk on node03 will not be visible on node20.
The following table shows the tmpfs mount point and size on the SCC cluster.
Node             Mount point        tmpfs size
node01-node22    /export/ramdisk    100G
node23-node32    /export/ramdisk    256G
gpu01            /export/ramdisk    256G
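As a quick sanity check, you can inspect the mount point and its available space from a shell on a compute node (a standard df invocation; the figures reported should match the table above):

df -h /export/ramdisk   # show size, usage and mount point of the tmpfs scratch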
Once a job is started in the queue, a scratch directory is created for the job on the local disk of the node on which the job is running.
The name of this directory contains the queue job id, i.e., the number by which a job is identified in the queue (as seen in the output of qstat). For example, if a job has the job id 150719, then a directory will be created on the compute node called:
/export/ramdisk/150719
This directory can be referred to as $TMPDIR in job submission scripts.
The owner of the job has write permissions in $TMPDIR to write their temporary files.
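For instance, a job script can record where its scratch directory lives, which can be useful when debugging (the path shown corresponds to the example job id above):

echo "Scratch directory for this job: $TMPDIR"   # e.g. /export/ramdisk/150719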
Job scratch directory policies
The scratch area is intended for writing and reading temporary files during the course of a job. When the job terminates, its temporary files are removed, so that the scratch area does not fill up and prevent other jobs from running on that node.
The scratch directory (hereafter called $TMPDIR) is created when a job starts in the queue. The directory exists for the duration of the job, and is removed when the job exits the queue.
It is therefore important that users copy any output files that they require from $TMPDIR to a subdirectory within their $HOME directory before the job completes. The following sections contain guidelines to assist users with this.
Using a job scratch directory
$TMPDIR is used within a PBS job submission script. In the script, the necessary input files should first be copied into $TMPDIR, $TMPDIR should then be made the current directory, and the job run from there.
For example, if a job has a single input file called input.data, an executable called run.x, and the job produces a single output file called output.log, then the necessary command lines to include in the PBS job submission script (ignoring the PBS directives) might be something like:
cp input.data $TMPDIR/
cp run.x $TMPDIR/
cd $TMPDIR
./run.x > output.log
cp output.log $PBS_O_WORKDIR
where $PBS_O_WORKDIR is a variable containing the directory from which the job was originally submitted (the original working directory, for example /genome/home/pat/work).
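Putting the pieces together, a complete submission script might look like the following minimal sketch. Because the --tmp examples in the next section use Slurm syntax, this sketch uses #SBATCH directives and $SLURM_SUBMIT_DIR (the Slurm equivalent of $PBS_O_WORKDIR); the job name and scratch request are illustrative placeholders.

#!/bin/bash
#SBATCH --job-name=scratch-example   # illustrative job name
#SBATCH --tmp=10G                    # illustrative scratch request (see next section)

cp input.data run.x $TMPDIR/      # stage input and executable onto local scratch
cd $TMPDIR                        # work from the node-local scratch directory
./run.x > output.log
cp output.log $SLURM_SUBMIT_DIR   # copy results back to the submission directory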
Request local scratch directory resource
$TMPDIR has limited space, and the size differs between nodes (see the table above). To make sure your data will fit into the tmpfs folder, you should request the space in your PBS script with a directive such as:
#SBATCH --tmp=<size[units]>
or specify it on the command line:
sbatch --tmp=<size[units]> test.pbs
This specifies a minimum amount of temporary disk space per node. Default units are megabytes, unless the SchedulerParameters configuration parameter includes the "default_gbytes" option, in which case the default is gigabytes. Different units can be specified using the suffix [K|M|G|T].
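For example, to request 50 gigabytes of local scratch (the size here is purely illustrative), the directive would be:

#SBATCH --tmp=50G

or, equivalently, sbatch --tmp=50G test.pbs on the command line.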
Using a shell trap to prevent loss of files
Sometimes a job dies unexpectedly, before it can terminate successfully. In this scenario, the job might not manage to copy the required output from $TMPDIR back to $SLURM_SUBMIT_DIR, and would then have to be restarted from the beginning.
To prevent this, users should include a shell trap that catches the terminating signal and copies the desired files back to $SLURM_SUBMIT_DIR.
For example, taking the commands given in the section Using a job scratch directory above, a trap line should be added to prevent the loss of the output file output.log in the event of a disaster. The trap must be set before the job is run; since it also fires on a normal EXIT, the final explicit copy is no longer needed:
cp input.data run.x $TMPDIR
cd $TMPDIR
trap "cp output.log $SLURM_SUBMIT_DIR" EXIT SIGTERM
./run.x > output.log