The Research cluster is used to support postgraduate research students and research staff.
The Research cluster has 163 GPUs:
- 95x NVIDIA RTX 2080 Ti 11GB in 12 servers (damnii01-12)
- 28x A40 48GB in 7 nodes (crannog01-07)
- 32x L40 48GB across 8 servers (scotia01-08)
- 8x H200 141GB in 1 node (herman)
Please note that the configuration may change between now and 1/1/26. Check the MOTD and the blog for details.
How to get access
PGR students on the appropriate programmes should get access automatically. If you can't access the nodes, please submit an RT ticket.
How to use the cluster
Use of this cluster is controlled by Slurm. First ssh to a head node (mlp, mlp1 or mlp2), then use Slurm commands to submit and manage jobs.
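For example, from a DICE desktop (a minimal sketch; UUN is a placeholder for your username, and sinfo and squeue are standard Slurm commands for inspecting the cluster and your jobs):

    ssh UUN@mlp      # log in to a head node (mlp, mlp1 or mlp2)
    sinfo            # list the partitions and the state of their nodes
    squeue -u UUN    # show your queued and running jobs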
Connections to the cluster are restricted by the Informatics firewall. You can connect to the head nodes from the SSH gateways, from other servers inside the Informatics firewall, from DICE desktops, or when using the School's OpenVPN.
Here's how to use the cluster without breaking it:
- When submitting jobs with srun, make sure to use the -p PGR-Standard option so that you use the PGR-specific cluster nodes (see the example below).
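For example (a sketch rather than a definitive recipe: the partition name comes from above, while the one-GPU request and the script name myjob.sh are illustrative and assume the GPUs are exposed through Slurm's gres mechanism):

    # interactive shell on a PGR node with one GPU
    srun -p PGR-Standard --gres=gpu:1 --pty bash

    # or submit a batch script with the same options
    sbatch -p PGR-Standard --gres=gpu:1 myjob.sh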
Files and backups
- Jobs on the compute nodes cannot access your files in AFS. Instead there are home directories on a local distributed filesystem. You can copy files from AFS by logging in to a head node and then using the cp command (see the example after this list). AFS home directories can be found under /afs/inf.ed.ac.uk/user/.
- There is no disk quota, but space is limited, so please delete files you have finished with.
- Your files are NOT backed up. DO NOT USE THIS FILESYSTEM AS THE ONLY STORAGE FOR IMPORTANT DATA. Disk space will be reclaimed once your access has finished. It is your responsibility to make copies of anything you wish to keep. There is some redundancy for disaster recovery.
- The current distributed filesystem is Lustre (Wikipedia).
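For example, to stage data in from AFS and keep a copy of your results outside the cluster, you could run something like this on a head node (a sketch: mydata and results are placeholder names, and $AFS_HOME stands for your own AFS home directory):

    # $AFS_HOME below stands for your AFS home directory,
    # which lives somewhere under /afs/inf.ed.ac.uk/user/
    cp -r "$AFS_HOME/mydata" ~/mydata        # copy a dataset from AFS into your cluster home
    cp -r ~/results "$AFS_HOME/results"      # copy results back to AFS as a backup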
Software
The cluster runs the Ubuntu Focal version of DICE. If you would like software to be added, please ask Computing Support.