Long term storage
- Last modified: Fri Sep 27, 2024 11:02
- Sign: JN
- Tested on: Xubuntu 22.04
- Solved: Yes
NRM provides a server, nrmdna01.nrm.se, where users can store data for "a
longer term". The server is primarily intended for raw sequencing data (such
as fastq files). Exactly what "longer-term storage" means has probably not been
decided, so if you need to plan ahead, ask the head of research (FA) at NRM.
To gain access to this server, send an email to NRM-IT (mailto:[email protected]) saying "I need access to the backup server".
Standard SSH access uses SSH keys, and you need to provide your
public ed25519 key to NRM-IT (see also SSH). Direct access to the server is,
however, currently not allowed unless you either use an NRM MS Windows computer
with special permissions, or connect directly from rackham.uppmax.uu.se or
dardel.pdc.kth.se. Rumour has it that you may also get drag-and-drop access
from NRM MS Windows if you are behind the NRM firewall and ask for it. Again,
ask NRM-IT for details.
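If you do not yet have an ed25519 key pair, the following is a minimal sketch of how one could be generated with OpenSSH's ssh-keygen. The temporary directory, the empty passphrase and the comment string are assumptions made only so that the example runs without prompts. For a real key you would normally run ssh-keygen -t ed25519 interactively, accept the default location under ~/.ssh, and choose a passphrase.

```shell
# Throwaway location so that the example never touches ~/.ssh (assumption).
keydir=$(mktemp -d)

# Generate an ed25519 key pair: -N "" sets an empty passphrase (demo only),
# -C sets a free-text comment, -q suppresses the progress output.
ssh-keygen -q -t ed25519 -N "" -C "nrmuser@example" -f "$keydir/id_ed25519"

# The .pub file is the public key, i.e. the part to send to NRM-IT.
# The file without the .pub suffix is the private key and stays with you.
cat "$keydir/id_ed25519.pub"
```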
Once logged in to nrmdna01.nrm.se, you can create your own personal folder
in /projects/NRMDEPARTMENT-projects/NRMUSER. For example, for NRMUSER johanyla
working at the BIO department:
$ mkdir /projects/BIO-projects/johanyla
When transferring data between computers, you can, from a command-line perspective, either "push" or "pull". That is, you can copy to ("push") or copy from ("pull") the other computer. If you want to copy a file from a cluster to nrmdna01.nrm.se, you either start the process on the cluster (submitting a command to the SLURM queue system for long-running tasks), or you start the process on the nrmdna01.nrm.se server.
In general, I would recommend starting the copying process from the nrmdna01 server, and "pull" the data from any of the other servers.
Given that you have SSH-key access to the remote server from nrmdna01.nrm.se, here is an example using screen and rsync:
nrmuser@nrmdna01:~$ screen -S name_for_the_session
nrmuser@nrmdna01:~$ rsync -avhP [email protected]:/path/to/folder/on/server /path/to/folder/on/nrmdna01
Then detach from the screen session (Ctrl+A, Ctrl+D). Later, you can reattach to the session:
nrmuser@nrmdna01:~$ screen -R name_for_the_session
And if all is good, exit the screen session:
nrmuser@nrmdna01:~$ exit
Note: To check thoroughly that a transfer succeeded, you may want to verify
file integrity. This can be done directly in rsync by adding the option -c or
--checksum (which may add considerable extra time), or by comparing MD5
checksums before and after the transfer (see for example
https://github.com/nylander/Check_MD5SUMS).
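The MD5 approach can be sketched as follows. This is not the Check_MD5SUMS tool linked above, only a minimal illustration using md5sum from GNU coreutils. The folder names, the two small fastq files and the MD5SUMS file name are made up for the example, and cp stands in for the actual rsync transfer.

```shell
# Throwaway source and destination folders (hypothetical names).
workdir=$(mktemp -d)
src="$workdir/source/data123"
dst="$workdir/destination"
mkdir -p "$src" "$dst"
printf 'ACGT\n' > "$src/reads_1.fastq"
printf 'TGCA\n' > "$src/reads_2.fastq"

# 1. Before the transfer: record checksums, with paths relative to the folder.
(cd "$src" && find . -type f ! -name MD5SUMS -exec md5sum {} + > MD5SUMS)

# 2. Transfer the folder, including the MD5SUMS file (cp replaces rsync here).
cp -r "$src" "$dst/"

# 3. After the transfer: verify every file against the recorded checksums.
#    --quiet prints only the files that fail the check.
if (cd "$dst/data123" && md5sum --quiet -c MD5SUMS); then
    verify_status=ok
else
    verify_status=failed
fi
echo "checksum verification: $verify_status"
```

Because the checksums are stored with relative paths, the same MD5SUMS file can be checked on any machine where the folder ends up.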
Note 2: rsync has a lot of options. One important detail to keep in mind is
the trailing forward slash (/) on the source: if you add it after a source
folder, only the folder content is transferred. If you leave it out, the folder
itself is transferred together with its content!
Note 3: If your transferred files are meant as a backup, it's a good idea to take one extra step to prevent them from being accidentally removed (by you or by someone else). Assume you transferred the folder "data123" to your personal folder on nrmdna01, then apply this command:
$ chmod -R -w data123
This will remove write (and delete) permissions on the folder and all files in it.
To restore permissions, use chmod -R +w data123.
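To confirm that the write protection took effect, you can list anything that still has a write bit set. The sketch below is an illustration on a throwaway folder, and it makes two assumptions: it uses the explicit forms a-w and u+w instead of -w and +w, because chmod without a user class (u, g, o or a) leaves alone the bits that are set in your umask, so a group-writable file may stay group-writable; and it uses the -perm /222 test, which is GNU find syntax.

```shell
# Throwaway stand-in for a transferred folder (hypothetical content).
workdir=$(mktemp -d)
mkdir -p "$workdir/data123/sub"
touch "$workdir/data123/sub/reads.fastq"
chmod g+w "$workdir/data123/sub/reads.fastq"   # simulate a group-writable file

# Remove write permission for user, group and others, regardless of umask.
chmod -R a-w "$workdir/data123"

# List everything that still has any write bit set; no output is expected.
still_writable=$(find "$workdir/data123" -perm /222)
echo "still writable: ${still_writable:-nothing}"

# Restore write permission for the owner only, e.g. before updating the backup.
chmod -R u+w "$workdir/data123"
owner_writable=$(find "$workdir/data123" -perm -200 | wc -l)
echo "entries writable by owner: $owner_writable"
```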
If, on the other hand, you want to "push" data from the cluster to nrmdna01.nrm.se, you basically reverse the rsync command above and wrap it in a SLURM script.
Here is an example for transferring the folder "folder" from dardel.pdc.kth.se to
the user project folder on nrmdna01.nrm.se (e.g. /projects/BIO-projects/NRMUSER).
Note that the folder NRMUSER (e.g. johanyla) needs to be present.
#!/bin/bash -l
# File: rsync-to-nrm.slurm.sh
# Slurm script example for rsync from dardel to nrmdna01.
# Test by using
# sbatch --test-only rsync-to-nrm.slurm.sh
# Start by using
# sbatch rsync-to-nrm.slurm.sh
# Stop by using
# scancel 1234
# scancel -i -u $USER
# scancel --state=pending -u $USER
# Monitor by using
# squeue -u $USER
#SBATCH -J rsync-to-nrm
#SBATCH -A snic1234-5-678
#SBATCH -t 01:00:00
#SBATCH -p shared
#SBATCH -c 1
#SBATCH --output=rsync-to-nrm.log
rsync -avhP /cfs/klemming/path/to/folder [email protected]:/projects/NRMDEPARTMENT-projects/NRMUSER