Merge branch 'develop' into perlmutter-offload-recipe
prckent authored Jan 14, 2025
2 parents c0a6d13 + 74534a6 commit 8872b17
191 changes: 7 additions & 184 deletions docs/installation.rst
package. This was successfully tested under OS X 10.15.7 "Catalina" on October 2

ctest -R deterministic

Installing on ALCF Theta, Cray XC40
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Theta is a 9.65 petaflops system manufactured by Cray with 3,624 compute nodes.
Each node features a second-generation Intel Xeon Phi 7230 processor and 192 GB DDR4 RAM.

::

export CRAYPE_LINK_TYPE=dynamic
module load cmake/3.20.4
module unload cray-libsci
module load cray-hdf5-parallel
module load gcc/8.3.0 # Make C++ 14 standard library available to the Intel compiler
export BOOST_ROOT=/soft/libraries/boost/1.64.0/intel
cmake -DCMAKE_SYSTEM_NAME=CrayLinuxEnvironment ..
make -j 24
ls -l bin/qmcpack

Installing on ALCF Polaris
~~~~~~~~~~~~~~~~~~~~~~~~~~
Polaris is an HPE Apollo Gen10+ based 44-petaflops system.
Each node features an AMD EPYC 7543P CPU and 4 NVIDIA A100 GPUs.
A build recipe for Polaris can be found at ``<qmcpack_source>/config/build_alcf_polaris_Clang.sh``.

Installing on ORNL OLCF Summit
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Summit is an IBM system at the ORNL OLCF built with IBM Power System AC922
nodes. They have two IBM Power 9 processors and six NVIDIA Volta V100
accelerators.

Building QMCPACK
^^^^^^^^^^^^^^^^

As of April 2023, LLVM Clang (>=15) is the only compiler validated by QMCPACK developers
for OpenMP offload of computation to NVIDIA GPUs on Summit.

For ease of reproducibility, we provide build scripts for Summit.

::

cd qmcpack
./config/build_olcf_summit_Clang.sh
ls build_*/bin

Running QMCPACK
^^^^^^^^^^^^^^^
Job script example with one MPI rank per GPU.

::

#!/bin/bash
# Begin LSF directives
#BSUB -P MAT151
#BSUB -J test
#BSUB -o tst.o%J
#BSUB -W 60
#BSUB -nnodes 1
#BSUB -alloc_flags smt1
# End LSF directives and begin shell commands

module load gcc/9.3.0
module load spectrum-mpi
module load cuda
module load essl
module load netlib-lapack
module load hdf5/1.10.7
module load fftw
# private module until OLCF provides a new llvm build
module use /gpfs/alpine/mat151/world-shared/opt/modules
module load llvm/release-15.0.0-cuda11.0

NNODES=$(((LSB_DJOB_NUMPROC-1)/42))
RANKS_PER_NODE=6
RS_PER_NODE=6

exe_path=/gpfs/alpine/mat151/world-shared/opt/qmcpack/release-3.16.0/build_summit_Clang_offload_cuda_real/bin

prefix=NiO-fcc-S1-dmc

export OMP_NUM_THREADS=7
jsrun -n $NNODES -a $RANKS_PER_NODE -c $((RANKS_PER_NODE*OMP_NUM_THREADS)) -g 6 -r 1 -d packed -b packed:$OMP_NUM_THREADS \
--smpiargs="-disable_gpu_hooks" $exe_path/qmcpack --enable-timers=fine $prefix.xml >& $prefix.out
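The launch arithmetic used in the script above can be checked in isolation. The sketch below uses a hypothetical ``LSB_DJOB_NUMPROC`` value (LSF reports the allocated cores plus one core for the launch node; each Summit node exposes 42 usable cores) to show how the node count and the per-node core count passed to ``jsrun -c`` are derived:

```shell
# Standalone sketch of the Summit launch arithmetic. LSB_DJOB_NUMPROC is a
# hypothetical value for a 2-node job (2 nodes x 42 cores + 1 launch core).
LSB_DJOB_NUMPROC=85
NNODES=$(((LSB_DJOB_NUMPROC-1)/42))
RANKS_PER_NODE=6                                     # one MPI rank per V100 GPU
OMP_NUM_THREADS=7
CORES_PER_NODE=$((RANKS_PER_NODE*OMP_NUM_THREADS))   # value passed to jsrun -c
echo "NNODES=$NNODES CORES_PER_NODE=$CORES_PER_NODE" # NNODES=2 CORES_PER_NODE=42
```

With one resource set per node (``-r 1``) and six GPUs per resource set (``-g 6``), the 6 ranks x 7 threads exactly fill the 42 usable cores of each node.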

Installing on ALCF Aurora
~~~~~~~~~~~~~~~~~~~~~~~~~~
Aurora is a 10,624-node HPE Cray EX based system. It has 166 racks with 21,248 CPUs and 63,744 GPUs.
Each node consists of 2 Intel Xeon CPU Max 9470C (codename Sapphire Rapids or SPR) with on-package HBM
and 6 Intel Data Center GPU Max 1550 (codename Ponte Vecchio or PVC).
Each Xeon has 52 physical cores supporting 2 hardware threads per core and 64 GB of HBM. Each CPU has 512 GB of DDR5.
A build recipe for Aurora can be found at ``<qmcpack_source>/config/build_alcf_aurora_icpx.sh``.
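As a quick consistency check, the system-level totals quoted above follow directly from the per-node counts (2 CPUs and 6 GPUs across 10,624 nodes):

```shell
# Consistency check of the Aurora totals quoted in the text.
NODES=10624
echo "CPUs: $((NODES*2))"   # 21248
echo "GPUs: $((NODES*6))"   # 63744
```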

Installing on ORNL OLCF Frontier/Crusher
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Job script example with one MPI rank per GPU.

::

srun -n $TOTAL_RANKS --ntasks-per-node=$RANKS_PER_NODE --gpus-per-task=1 -c $THREAD_SLOTS --gpu-bind=closest \
$exe_path/qmcpack --enable-timers=fine $prefix.xml >& $prefix.out
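The variables consumed by the ``srun`` line are set earlier in the job script (not shown in this excerpt). A hypothetical sketch of that arithmetic, assuming one MPI rank per MI250X GCD (8 per Frontier node) and seven OpenMP threads per rank, would be:

```shell
# Hypothetical values for the variables used by the srun line above; the
# actual job script defines them earlier (not shown in this excerpt).
NNODES=2
RANKS_PER_NODE=8                        # assumed: one MPI rank per MI250X GCD
OMP_NUM_THREADS=7
TOTAL_RANKS=$((NNODES*RANKS_PER_NODE))
THREAD_SLOTS=$OMP_NUM_THREADS           # cores reserved per rank via srun -c
echo "TOTAL_RANKS=$TOTAL_RANKS THREAD_SLOTS=$THREAD_SLOTS"
```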

Installing on NERSC Cori, Haswell Partition, Cray XC40
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Cori is a Cray XC40 installed at NERSC whose Haswell partition nodes each
contain two 16-core Intel "Haswell" processors. In the following example,
the source code is cloned in \$HOME/qmc/git\_QMCPACK and QMCPACK is built
in the scratch space.

::

mkdir $HOME/qmc
mkdir $HOME/qmc/git_QMCPACK
cd $HOME/qmc/git_QMCPACK
git clone https://github.com/QMCPACK/qmcpack.git
cd qmcpack
git checkout v3.7.0 # Edit for desired version
export CRAYPE_LINK_TYPE=dynamic
module unload cray-libsci
module load boost/1.70.0
module load cray-hdf5-parallel
module load cmake/3.14.4
module load gcc/8.3.0 # Make C++ 14 standard library available to the Intel compiler
cd $SCRATCH
mkdir build_cori_hsw
cd build_cori_hsw
cmake -DQMC_SYMLINK_TEST_FILES=0 -DCMAKE_SYSTEM_NAME=CrayLinuxEnvironment $HOME/qmc/git_QMCPACK/qmcpack/
nice make -j 8
ls -l bin/qmcpack

When the preceding was tested on June 15, 2020, the following module and
software versions were present:

::

build_cori_hsw> module list
Currently Loaded Modulefiles:
1) modules/3.2.11.4 13) xpmem/2.2.20-7.0.1.1_4.8__g0475745.ari
2) nsg/1.2.0 14) job/2.2.4-7.0.1.1_3.34__g36b56f4.ari
3) altd/2.0 15) dvs/2.12_2.2.156-7.0.1.1_8.6__g5aab709e
4) darshan/3.1.7 16) alps/6.6.57-7.0.1.1_5.10__g1b735148.ari
5) intel/19.0.3.199 17) rca/2.2.20-7.0.1.1_4.42__g8e3fb5b.ari
6) craype-network-aries 18) atp/2.1.3
7) craype/2.6.2 19) PrgEnv-intel/6.0.5
8) udreg/2.3.2-7.0.1.1_3.29__g8175d3d.ari 20) craype-haswell
9) ugni/6.0.14.0-7.0.1.1_7.32__ge78e5b0.ari 21) cray-mpich/7.7.10
10) pmi/5.0.14 22) craype-hugepages2M
11) dmapp/7.1.1-7.0.1.1_4.43__g38cf134.ari 23) gcc/8.3.0
12) gni-headers/5.0.12.0-7.0.1.1_6.27__g3b1768f.ari 24) cmake/3.14.4

The following slurm job file can be used to run the tests:

::

#!/bin/bash
#SBATCH --qos=debug
#SBATCH --time=00:10:00
#SBATCH --nodes=1
#SBATCH --tasks-per-node=32
#SBATCH --constraint=haswell
echo --- Start `date`
echo --- Working directory: `pwd`
ctest -VV -R deterministic
echo --- End `date`
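The ``--tasks-per-node=32`` setting matches the node geometry, assuming one task per physical core across the node's two 16-core sockets:

```shell
# Assumed Cori Haswell node geometry: 2 sockets x 16 cores, one task per core.
SOCKETS=2
CORES_PER_SOCKET=16
TASKS_PER_NODE=$((SOCKETS*CORES_PER_SOCKET))
echo "tasks per node: $TASKS_PER_NODE"   # matches --tasks-per-node=32
```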

Installing on NERSC Cori, Xeon Phi KNL partition, Cray XC40
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Cori is a Cray XC40 that includes Intel Xeon Phi Knights Landing (KNL) nodes. The following build recipe ensures that the code
generation is appropriate for the KNL nodes. The source is assumed to
be in \$HOME/qmc/git\_QMCPACK/qmcpack as per the Haswell example.

::

export CRAYPE_LINK_TYPE=dynamic
module swap craype-haswell craype-mic-knl # Only difference between Haswell and KNL recipes
module unload cray-libsci
module load boost/1.70.0
module load cray-hdf5-parallel
module load cmake/3.14.4
module load gcc/8.3.0 # Make C++ 14 standard library available to the Intel compiler
cd $SCRATCH
mkdir build_cori_knl
cd build_cori_knl
cmake -DQMC_SYMLINK_TEST_FILES=0 -DCMAKE_SYSTEM_NAME=CrayLinuxEnvironment $HOME/qmc/git_QMCPACK/qmcpack/
nice make -j 8
ls -l bin/qmcpack

When the preceding was tested on June 15, 2020, the following module and
software versions were present:

::

build_cori_knl> module list
Currently Loaded Modulefiles:
1) modules/3.2.11.4 13) xpmem/2.2.20-7.0.1.1_4.8__g0475745.ari
2) nsg/1.2.0 14) job/2.2.4-7.0.1.1_3.34__g36b56f4.ari
3) altd/2.0 15) dvs/2.12_2.2.156-7.0.1.1_8.6__g5aab709e
4) darshan/3.1.7 16) alps/6.6.57-7.0.1.1_5.10__g1b735148.ari
5) intel/19.0.3.199 17) rca/2.2.20-7.0.1.1_4.42__g8e3fb5b.ari
6) craype-network-aries 18) atp/2.1.3
7) craype/2.6.2 19) PrgEnv-intel/6.0.5
8) udreg/2.3.2-7.0.1.1_3.29__g8175d3d.ari 20) craype-mic-knl
9) ugni/6.0.14.0-7.0.1.1_7.32__ge78e5b0.ari 21) cray-mpich/7.7.10
10) pmi/5.0.14 22) craype-hugepages2M
11) dmapp/7.1.1-7.0.1.1_4.43__g38cf134.ari 23) gcc/8.3.0
12) gni-headers/5.0.12.0-7.0.1.1_6.27__g3b1768f.ari 24) cmake/3.14.4

Installing on systems with ARMv8-based processors
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
