From 91fdaef43e7bec955b2c7ec0e54dca9df188330a Mon Sep 17 00:00:00 2001
From: Michele Simionato
Date: Wed, 2 Oct 2024 15:57:39 +0200
Subject: [PATCH] Fixed SLURM docs

---
 .../installation-instructions/slurm.md | 97 +++++--------------
 1 file changed, 22 insertions(+), 75 deletions(-)

diff --git a/doc/getting-started/installation-instructions/slurm.md b/doc/getting-started/installation-instructions/slurm.md
index 38d84fe0b1e4..25ab297b20b7 100644
--- a/doc/getting-started/installation-instructions/slurm.md
+++ b/doc/getting-started/installation-instructions/slurm.md
@@ -4,7 +4,8 @@ Most HPC clusters support a scheduler called SLURM (
 Simple Linux Utility for Resource Management). The OpenQuake engine
-is able to transparently interface with SLURM.
+is able to transparently interface with SLURM, thus making it possible
+to run a single calculation over multiple nodes of the cluster.
 
 ## Running OpenQuake calculations with SLURM
 
@@ -27,7 +28,7 @@ which will split the calculation over 4 nodes. Clearly, there are limitations
 on the number of available nodes, so if you set a number
 of nodes which is too large you can have one of the following:
 
-1. an error "You can use at most N nodes"; N depends on the
+1. an error "You can use at most N nodes", where N depends on the
 configuration chosen by your system administrator and can be
 inferred from the parameters in the openquake.cfg file as
 `max_cores / num_cores`; for instance for `max_cores=1024` and
 `num_cores=128` you would have `N=8`
@@ -42,19 +43,19 @@ of nodes which is too large you can have one of the following:
 `Resources` (waiting for resources to become available) or
 `Priority` (queued behind a higher priority job).
 
-If you are stuck in situation 2 you must kill the openquake job and the
+If you are stuck in situation 2 you must kill the SLURM job with the
 command `scancel JOBID` (JOBID is listed by the command
 `$ squeue -u $USER`). If you are stuck in situation 3 for a long
-time it can be better to kill the jobs (both openquake and SLURM) and
-then relaunch the calculations, this time asking for fewer nodes.
+time it can be better to kill the job and
+then relaunch the calculation, this time asking for fewer nodes.
 
 ## Running out of quota
 
 The engine will store the calculation files in `shared_dir`
 and some auxiliary files in `custom_tmp`; both directories
 are mandatory and must be specified in the configuration
 file. The
-`shared_dir` is meant to point to the work area of the cluster
-and the `custom_tmp` to the scratch area of the cluster.
+`shared_dir` should be located in the work area of the cluster
+and the `custom_tmp` in the scratch area of the cluster.
 
 Classical calculations will generate an .hdf5 file
 for each task spawned, so each calculation can spawn
 thousands of files.
@@ -66,23 +67,23 @@ old calculations, which will have the form `scratch_dir/calc_XXX`.
 
 ## Installing on HPC
 
 This section is for the administrators of the HPC cluster.
 Here are the installation instructions to create modules for
-engine 3.18 assuming you have python3.10 installed as modules.
+engine 3.21 assuming you have python3.11 installed as a module.
 We recommend choosing a base path for openquake and then installing
-the different versions using the release number, in our example /apps/openquake/3.18.
+the different versions using the release number, in our example /apps/openquake/3.21.
 This will create different modules for different releases
 ```
-# module load python/3.10
+# module load python/3.11
 # mkdir /apps/openquake
-# python3.10 -m venv /apps/openquake/3.18
+# python3.11 -m venv /apps/openquake/3.21
-# source /apps/openquake/3.18/bin/activate
+# source /apps/openquake/3.21/bin/activate
 # pip install -U pip
-# pip install -r https://github.com/gem/oq-engine/raw/engine-3.18/requirements-py310-linux64.txt
-# pip install openquake.engine==3.18
+# pip install -r https://github.com/gem/oq-engine/raw/engine-3.21/requirements-py311-linux64.txt
+# pip install openquake.engine==3.21
 ```
 Then you have to define the module file. In our cluster it is located in
-`/apps/Modules/modulefiles/openquake/3.18`, please use the appropriate
+`/apps/Modules/modulefiles/openquake/3.21`, please use the appropriate
 location for your cluster. The content of the file should be the following:
 ```
 #%Module1.0
@@ -93,10 +94,10 @@
 proc ModulesHelp { } {
 puts stderr "\n\tThis will add OpenQuake to your PATH environment variable."
 }
 
-module-whatis "loads the OpenQuake 3.18 environment"
+module-whatis "loads the OpenQuake 3.21 environment"
 
-set version 3.18
-set root /apps/openquake/3.18
+set version 3.21
+set root /apps/openquake/3.21
 
 prepend-path LD_LIBRARY_PATH $root/lib64
 prepend-path MANPATH $root/share/man
@@ -110,7 +111,7 @@ After installing the engine, the sysadmin has to edit the file
 [distribution]
 oq_distribute = slurm
 serialize_jobs = 2
-python = /apps/openquake/3.18/bin/python
+python = /apps/openquake/3.21/bin/python
 
 [directory]
 # optionally set it to something like /mnt/large_shared_disk
 shared_dir =
@@ -119,61 +120,7 @@
 [dbserver]
 host = local
 ```
-With `serialize_jobs = 2` at most two jobs per user can run concurrently. You may want to
+With `serialize_jobs = 2` at most two jobs per user can be run concurrently. You may want to
 increase or reduce this number. Each user will have their own database
 located in `$HOME/oqdata/db.sqlite3`. The database will be created automatically
-the first time the user runs a calculation (NB: in engine-3.18 it must be
-created manually with the command `srun oq engine --upgrade-db --yes`).
-
-## How it works internally
-
-The support for SLURM is implemented in the module
-`openquake/baselib/parallel.py`. The idea is to submit to SLURM a job
-array of tasks for each parallel phase of the calculation. For instance
-a classical calculations has three phases: preclassical, classical
-and postclassical.
-
-The calculation will start sequentially, then it will reach the
-preclassical phase: at that moment the engine will create a
-bash script called `slurm.sh` and located in the directory
-`$HOME/oqdata/calc_XXX` being XXX the calculation ID, which is
-an OpenQuake concept and has nothing to do with the SLURM ID.
-The `slurm.sh` script has the following template:
-```bash
-#!/bin/bash
-#SBATCH --job-name={mon.operation}
-#SBATCH --array=1-{mon.task_no}
-#SBATCH --time=10:00:00
-#SBATCH --mem-per-cpu=1G
-#SBATCH --output={mon.calc_dir}/%a.out
-#SBATCH --error={mon.calc_dir}/%a.err
-srun {python} -m openquake.baselib.slurm {mon.calc_dir} $SLURM_ARRAY_TASK_ID
-```
-At runtime the `mon.` variables will be replaced with their values:
-
-- `mon.operation` will be the string "preclassical"
-- `mon.task_no` will be the total number of tasks to spawn
-- `mon.calc_dir` will be the directory `$HOME/oqdata/calc_XXX`
-- `python` will be the path to the python executable to use, as set in openquake.cfg
-
-System administrators may want to adapt such template. At the moment
-this requires modifying the engine codebase; in the future the template
-may be moved in the configuration section.
-
-A task in the OpenQuake engine is simply a Python function or
-generator taking some arguments and a monitor object (`mon`),
-sending results to the submitter process via zmq.
-
-Internally the engine will save the input arguments for each task
-in pickle files located in `$HOME/oqdata/calc_XXX/YYY.pik`, where
-XXX is the calculation ID and YYY is the `$SLURM_ARRAY_TASK_ID` starting from 1
-to the total number of tasks.
-
-The command `srun {python} -m openquake.baselib.slurm {mon.calc_dir}
-$SLURM_ARRAY_TASK_ID` in `slurm.sh` will submit the tasks in parallel
-by reading the arguments from the input files.
-
-Using a job array has the advantage that all tasks can be killed
-with a single command. This is done automatically by the engine
-if the user aborts the calculation or if the calculation fails
-due to an error.
+the first time the user runs a calculation.
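+
+For instance, once the module file is in place, a user should be able to set up
+the environment and verify the installation with something like the following
+(the exact module name depends on the modulefile location chosen above):
+```
+$ module load openquake/3.21
+$ oq --version          # prints the installed engine version
+$ squeue -u $USER       # lists your SLURM jobs, as described above
+```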