Skip to content

Computing

Brian M. Schilder edited this page Sep 8, 2022 · 10 revisions

Learning to use the computing cluster:

To get access to the computing cluster, send Nathan Skene an email with your username and he will add you.

The Imperial CX1 cluster uses PBS as a job manager. PBS has many versions and it cannot neccesarily be assumed that a function you find in an online manual will work exactly as described there. The functions which work on the cluster are best found by typing man qstat while logged to an interactive session.

There is a weekly HPC clinic. I strongly recommend making use of this. You can turn up and experts will help you. Even if you are just unsure about something, go and speak to them. They are held in South Kensington but it's well worth going.

Imperial regularly runs a beginner's guide to high performance computing course. If you have not previously used HPC you'll want to register for this as soon as possible. This can be done through their website:

Imperial also runs a course on software carpentry. If you are unfamiliar with usage of Git and Linux then you should take this course.

Combiz Khozoie wrote useful notes on using the HPC (how to login etc):

Job sizing requirements

The job sizing requirements for Imperial's HPC have changed over time, so please see here for the most updated version: https://wiki.imperial.ac.uk/pages/viewpage.action?spaceKey=HPC&title=New+Job+sizing+guidance

Rstudio On-Demand

You can request/launch Rstudio containers with a certain amount of resources using HPC's On-Demand service here: https://openondemand.rcs.ic.ac.uk/

For further documentation, please see here: https://wiki.imperial.ac.uk/display/HPC/Open+OnDemand

Logging into the cluster

VPN into the network, then connect with

ssh <YOUR_USERNAME>@login.hpc.ic.ac.uk

Setup ssh for the cluster

To setup ssh on your computer for accessing the cluster add the following to ~/.ssh/config. Make sure to replace all places marker <YOUR_USERNAME> with your actual Imperial HPC username (without the "< >"):

Host *
 AddKeysToAgent yes
 IdentityFile ~/.ssh/id_rsa
Host imperial
   User <YOUR_USERNAME>
   AddKeysToAgent yes
   HostName login.cx1.hpc.imperial.ac.uk
   ForwardX11Trusted yes
   ForwardX11 yes
   HostKeyAlgorithms=+ssh-dss
Host imperial-7
   User <YOUR_USERNAME>
   AddKeysToAgent yes
   HostName login-7.cx1.hpc.imperial.ac.uk
   ForwardX11Trusted yes
   ForwardX11 yes
   HostKeyAlgorithms=+ssh-dss

If you use imperial-7 to login then you'll always connect to the same login node which makes using screen/tmux easier.

Saving your password

Rather than entering your password every time you ssh login, you can simply use ssh-keygen to do this once:

ssh-copy-id <YOUR_USERNAME>@login.hpc.ic.ac.uk

This will ask you for your HPC password. Afterwards you will no longer have to enter your password from your local computer. For a more detailed explanation, see here.

Mounting the HPC on your local computer

You can mount HPC on your local computer, allowing you to easily browse, edit, and import anything that is stored on HPC. You can even open R projects and use your local Rstudio to edit and run code.

Caveats

Note that this will be slower than running the same code directly on the HPC, as all the data has to be first transferred from HPC to your local computer every time you run a command. Also, all processes will be executed on your machine, meaning you do not have direct access to HPC's compute power through this approach (unless you're submitting pbs job scripts).

Set this up using the following steps:

MacOS

  1. On your Desktop, use the hotkey command CMD+K to open the following Connect to Server window.
Screenshot 2022-09-06 at 11 32 45
  1. Enter the following: smb://rds.imperial.ac.uk/RDS/user/<YOUR_USERNAME>
  2. Hit ENTER and the HPC will now be mounted to your local computer at Volumes/<YOUR_USERNAME>.

Linux

You can add the following command to your ~/.bashrc so that it will be available to you whenever you open a new terminal. To change where you want to mount the directory to, simply change the last argument: /home/$USER/RDS --> /<YOUR PREFERRED_DIR>/$USER/RDS

alias mountrds='sudo mount -t cifs -o uid=$USER -o gid=emailonly -o username=$USER //rds.imperial.ac.uk/RDS /home/$USER/RDS'

Then, you can simply run the following to mount your HPC folder:

mountrds

Installing software on the cluster

If you want an R package installed and there's something that needs root permissions to setup, then request that it be installed via ASK.

You might want to try joining RCS Slack and raising the issue there.

Before doing either of these, you might want to check if the software is listed under module avail. If it is, you might be able to access it within RStudio using the RLinuxModules package. Another option is to setup the software up within conda, but that won't help with RStudio.

Creating interactive jobs on the cluster

To create an interactive session on the cluster (to avoid overloading the login nodes) use the following command

qsub -I -l select=1:ncpus=8:mem=96gb -l walltime=08:00:00

That command requests the most resources that can be obtained for interactive jobs, decrease these if you can. On the main queue it will take a long time to submit. If I've given you access to the med-bio queue (ask) then you will be better of using the following:

qsub -I -l select=1:ncpus=2:mem=8gb -l walltime=01:00:00 -q med-bio

Here are the current allowed job sizing parameters for Imperials' HPC:

Screenshot 2022-09-06 at 11 23 04

Joint workspaces on the Imperial cluster

We have two shared project spaces on the cluster. If you are involved in the DRI Multiomics Atlas project then use projects/ukdrimultiomicsprojects/. Otherwise, please use projects/neurogenomics-lab

You will not be able to write into the main directory of either of these. They have two folders: live and ephemeral. Read about the differences between these here:

The medbio cluster (for faster job submissions)

What is medbio?

The MedBio cluster has additional computational resources and is accessed via a separate queue. Read about it here: https://www.imperial.ac.uk/bioinformatics-data-science-group/resources/uk-med-bio/

Access

We have access to it but I do not have admin rights to grant access to individuals.

To get access it was previously necessary to email [email protected] but he is no longer at Imperial, and the access management doesn't seem to have been cleared up. Another contact email is [email protected]. Most recently, Brian got access by emailing [email protected].

Nathan Skene can access a list of users that currently have admin rights over medbio through the self-service portal. Ask him to check who is on there currently, and then contact one of them. Currently Abbas Dehghan seems like a good candidate.

Usage

To run on the med bio cluster, just put this at the end of your submit commands: -q med-bio

Express (charged) access to the cluster

It can take a long time to get jobs submitted on the cluster. We can pay to get jobs submitted faster. Details are here: https://www.imperial.ac.uk/admin-services/ict/self-service/research-support/rcs/computing/express-access/. I would rather pay for faster results if this is slowing you down. Let me know if this would be useful and I'll add you to the list. Express jobs are sunmitted using Run express jobs with qsub -q express -P exp-XXXXX, substituting your express account code.

Running docker containers on the HPC with singularity

Creating a singularity container from a docker container

You'll need to use singularity to run docker containers on the HPC. To run a rocker R container in interactive mode, run the following, substituting your username where appropriate:

mkdir /rds/general/user/$USER/ephemeral/tmp/
singularity exec -B /rds/general/user/$USER/ephemeral/tmp/:/tmp,/rds/general/user/$USER/ephemeral/tmp/:/var/tmp,/rds/general/user/$USER/ephemeral/rtmp/:/usr/local/lib/R/site-library/ --writable-tmpfs docker://rocker/tidyverse:latest R

To create a Singularity image, first archive the image into a tar file. Obtain the IMAGE_ID with docker images then archive with (substituting the IMAGE_ID): -

docker save 409ad1cbd54c -o singlecell.tar

On a system running singularity-container (>v3) (e.g. on the HPC cluster), generate the Singularity Image File (SIF) from the local tar file with: -

/usr/bin/singularity build singlecell.sif docker-archive://singlecell.tar

This singlecell.sif Singularity Image File (SIF) is now ready to use.

Running a singularity container

On HPC, Rocker containers can be run through Singularity with a single command much like the native Docker commands, e.g.

singularity exec docker://rocker/tidyverse:latest R

By default singularity bind mounts /home/$USER, /tmp, and $PWD into your container at runtime.

More info is available here.

Extending wall time for jobs that are already running

If you go to this link in the compute tab there is your jobs. For some jobs you can extend the walltime.

Clone this wiki locally