Overview
The Specialized Computing Cluster (SCC) and its User Policies are managed by the SCC Committee, which aims to:
- Make the system effective for all users,
- Ensure proper maintenance of the system,
- Document use of the system, and
- Enable researchers to contribute to the system.
User Policies are continually reviewed and updated by the SCC Committee. Questions or concerns should be e-mailed to scc_support@camh.net.
It is critical that all SCC users have access to the cluster system and that the integrity of all users’ code is maintained. Following some of these policies is likely to require some understanding of the nature of HPC computing and the configuration of the hardware and software on the system. Key information regarding the SCC architecture is available through the SCC wiki.
Please contact scc_support@camh.net to ask questions or to report problems.
Obtaining an Account
Individuals can apply for accounts on the SCC, which are available in the following primary categories:
- CAMH Research Faculty
Any CAMH research faculty member whose project has computational needs beyond the capacity of his or her current workstation may apply for an account. (See PI Account Form.) The PI account can be shared with members of the research team who have completed a Researcher Account Form.
- CAMH Researcher
Researchers who are sponsored by a CAMH PI (e.g. undergraduate or graduate students, research assistants, etc.) may apply for an account. Confirmation from the supervising PI is required to process Researcher accounts. (See Researcher Account Form.)
PI and Researcher Account forms are available on the wiki or by request to scc_support@camh.net. After submission, proposals will be evaluated by the SCC Committee. Approved user accounts may take 1-2 business days to process, and users (and their supervising PI, if applicable) will be notified via email once processing is complete.
User Policies
Users of the Specialized Computing Cluster (SCC) must abide by all CAMH computing policies. The SCC is a shared resource; if a user does not abide by the User Policy, a representative of the SCC Committee has the right to terminate the user's jobs and/or to suspend the user's account. Please note that the User Policies are continually reviewed and updated by the SCC Committee and are therefore subject to change.
- Communication
Users must monitor the e-mail address on file with the SCC Committee and are required to respond in a timely way to correspondence from the SCC Committee. Users will be notified by e-mail about issues related to the system, such as scheduled downtime.
- Account Security
No one is permitted to share his or her SCC account with any other person, in any way. Users must not give their passwords to a friend, allow remote or password-free logins to their accounts, or permit another individual to use their account after personally logging in.
- Data De-Identification
All data should be de-identified before being migrated to the SCC. It is the user's responsibility to ensure the confidentiality of any data used on the SCC and to be aware of any de-identification requirements that apply to a particular data set.
- Disk Usage
The SCC is a shared resource with finite storage space. As such, strict limits are imposed on allotted storage quotas and backups.
- Home Disk Space
All SCC users are provided with a 5GB home directory (/home/GROUP/USER) that is visible on login to the head node. On the compute nodes, however, the home directory is mounted read-only; that is, jobs can read from /home but cannot write files there. The contents of the /home directory are backed up regularly. The home directory should only be used for compiling programs. Users who feel they will require more space in their /home directory may contact the administrator at camh.scc@gmail.com.
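For example, the standard Linux du command (an illustrative sketch, not an SCC-specific tool) can be run on the head node to check how much of the 5GB home quota is in use:

    # Report the total size of your home directory (replace GROUP and USER as appropriate)
    du -sh /home/GROUP/USER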
- Scratch Disk Space
Each PI account holder is provided with a large storage space (/scratch) for job input and output. Research users under the PI's user group can access the shared temporary /scratch directory. The size of the scratch space is negotiated with the CAMH PI based on the size of the input and output data and the computing needs. This scratch space is not backed up. Furthermore, data that has not been accessed in more than 3 months will be deleted.
- Backup Policy
Backups of the /home directory are run daily, Monday to Friday. We will store up to two weeks' worth of backups at any one time. Note that only the contents of /home directories are backed up.
- Data Restoration
Data can be restored from the backup media by request in the event that original copies have been deleted. To request a restore, send the filename(s) and the date for rollback to camh.scc@gmail.com. Please note that backups older than one week are stored off-site, so data retrieval by request may take up to 24 hours depending on the rollback date.
- Data Transfer
All traffic to and from the SCC must go via SSH (secure shell). Once a secure connection has been established, data transfer is best achieved using scp or rsync.
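As a minimal sketch (the login host, user name, and paths below are placeholders, not confirmed SCC values), files can be copied from a local machine over SSH with either tool:

    # Copy a single file to your scratch space on the cluster
    scp data.tar.gz USER@<scc-login-node>:/scratch/GROUP/USER/
    # Synchronize a local directory, preserving permissions and timestamps
    # and compressing data in transit
    rsync -avz results/ USER@<scc-login-node>:/scratch/GROUP/USER/results/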
- Software
Because different applications may require different versions of libraries or associated software, the SCC includes the Modules software, which allows users to modify their environment to refer to particular software versions. Every attempt is made to keep the multiple installed software suites running.
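For illustration, typical Modules commands look like the following (the package name and version are hypothetical; run module avail to see what is actually installed on the SCC):

    # List the software modules available on the system
    module avail
    # Load a specific version of a package into the current environment (name/version hypothetical)
    module load example-package/1.2.3
    # Show which modules are currently loaded
    module list
    # Unload all modules and return to a clean environment
    module purge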
Computational Resource Limit Policy
- Restrictions on the Number of Processes
The number of processes a user can run is limited only by the available hardware resources. The SCC uses a fair-share policy in Slurm to set the priority of each job. Jobs submitted by a user will wait in the queue until resources are available before running (see the example batch script at the end of this section).
- Restrictions on Completion Time
The SCC is a shared system, and jobs that are run on it are submitted to a queue. The scheduler processes the jobs to make the best use of the available computing resources, subject to the fair-share policy rules.
If a user needs additional computing resources, they can contact scc_support@camh.net and justify the request.
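As a sketch of this submission workflow (the resource requests, script name, and program below are illustrative assumptions, not SCC defaults), a simple Slurm batch job might look like:

    #!/bin/bash
    #SBATCH --job-name=example_job      # name shown in the queue
    #SBATCH --time=01:00:00             # requested wall-clock limit (illustrative)
    #SBATCH --cpus-per-task=4           # requested CPU cores (illustrative)
    #SBATCH --mem=8G                    # requested memory (illustrative)
    # Write output under /scratch, since /home is mounted read-only on compute nodes
    ./my_analysis --input /scratch/GROUP/USER/input --output /scratch/GROUP/USER/output

The script is submitted with sbatch, and its position in the queue can be checked with squeue:

    sbatch example_job.sh
    squeue -u USER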
Availability and Planned Maintenance
General Availability
Every reasonable effort will be made to keep the SCC available and operational 24 hours per day, 7 days per week. Please note, however, that although the support personnel and administrator will do their best to keep the servers running at all times, we cannot guarantee to promptly resolve problems outside of office hours, during weekends, or on public holidays. Nevertheless, please notify scc_support@camh.net of issues whenever they arise.
Planned Maintenance
Occasionally it is necessary as part of maintaining a reliable service to update system software and replace faulty hardware. Sometimes it will be possible to perform these tasks transparently by means of queue reconfiguration in a way that will not disrupt running jobs or interactive use, or significantly inconvenience users. Some tasks however, particularly those affecting storage or login nodes, may require temporary interruption of service.
Where possible, maintenance activities involving a level of disruption to service will be scheduled on:
Thursdays: 15:00-20:00
Please note that this does not mean that there will be disruption at this time every week, merely that if potentially disruptive maintenance is necessary we will do our best to ensure it takes place during this period, in which case there will be advance notification. Establishing a predictable time slot for planned maintenance has the advantage that users may be confident that 'dangerous' changes will not intentionally be undertaken at other times. Unfortunately, the potential for unplanned periods of disruption is a fact of life - please see the next section.
Exceptional Maintenance and Unplanned Disruptions
It may happen that, despite best efforts, it becomes necessary to reduce or withdraw service at short notice and/or outside the planned maintenance time slot. This may happen, for example, for environmental reasons such as air conditioning or power failure, or in an emergency where immediate shutdown is required to save equipment or data. It is hoped that these situations will arise rarely. Obviously, in such cases service will be restored as rapidly as possible.