Merge pull request #134 from mir-group/phonopy_fix
Phonopy fix
cepellotti authored Nov 16, 2021
2 parents ae7196b + ee31172 commit 6123485
Showing 8 changed files with 33 additions and 23 deletions.
18 changes: 11 additions & 7 deletions doc/sphinx/source/tutorials/elEpaTransport.rst
@@ -22,7 +22,7 @@ From an installation folder of your choice, type::
git clone https://github.com/mir-group/phoebe-quantum-espresso.git
cd phoebe-quantum-espresso
# install it
git checkout phoebe-qe-6.6
git checkout phoebe-qe-6.7.0
./configure MPIF90=mpif90 --with-scalapack=yes
make pw pp ph w90

@@ -163,13 +163,14 @@ In the working folder ``./example/Silicon-epa/qe-elph`` run the command::
If the code ran successfully, you should see a new file ``silicon.fc``.


Step 5: Run nscf
-----------------
Step 5: Non-self-consistent run
-------------------------------

Before we can run Phoebe, we need to complete one more step using Quantum ESPRESSO. We need to use an nscf run to calculate the electronic properties on the k-point mesh. We do so using the input file in the ``Silicon-epa`` example folder::
Before we can run Phoebe, we need to complete one more step using Quantum ESPRESSO. We need to use a non-self-consistent run to calculate the electronic properties on the k-point mesh.
We do so using the input file ``bands.in`` in the ``Silicon-epa`` example folder::

&control
calculation = "nscf"
calculation = "bands"
restart_mode = "from_scratch"
prefix = "silicon"
pseudo_dir = "../../pseudoPotentials/"
@@ -197,11 +197,14 @@ Before we can run Phoebe, we need to complete one more step using Quantum ESPRES
0.00000000 0.00000000 0.16666667 4.629630e-03
...

where the k-points list will continue for all 216 points. To generate this k-point list, one could use the ``kmesh.pl`` utility from Wannier90 (in the directory ``q-e/wannier90-3.0.0/utility/kmesh.pl``, used as ``kmesh.pl nk1 nk2 nk3``, with the output appended to the end of ``nscf.in``).
where the k-point list continues for all 216 points. To generate this k-point list, one could use the ``kmesh.pl`` utility from Wannier90 (found at ``q-e/wannier90-3.0.0/utility/kmesh.pl`` and used as ``kmesh.pl nk1 nk2 nk3``), with the output appended to the end of ``bands.in``.
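
For instance, for the 6x6x6 mesh of this example the list could be generated and appended as in the following sketch; the path to the patched QE tree is a placeholder, and ``bands.in`` should not already contain another ``K_POINTS`` card::

   # kmesh.pl prints a "K_POINTS crystal" block listing the 216 weighted k-points
   /path/to/patched-quantum-espresso/wannier90-3.0.0/utility/kmesh.pl 6 6 6 >> bands.in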

.. note::
The ``calculation`` parameter should be set to ``bands`` and not ``nscf``.

We run this as we did the ``pw.x`` step::

mpirun -np 4 /path/to/patched-quantum-espresso/bin/pw.x -npool 4 -in nscf.in > nscf.out
mpirun -np 4 /path/to/patched-quantum-espresso/bin/pw.x -npool 4 -in bands.in > bands.out

where again this could be parallelized using ``mpi`` and ``npool``.

13 changes: 7 additions & 6 deletions doc/sphinx/source/tutorials/elWanTransport.rst
@@ -179,20 +179,20 @@ If the code ran successfully, you should see a new file ``silicon.fc``.



Step 5: Run nscf
-----------------
Step 5: Non-self-consistent run
-------------------------------

We now start the process of Wannierizing the electronic band structure.
Before running Wannier90, we need to compute the electronic band structure on the full grid of k-points as a starting point for the Wannier calculation.
You can check that the ``nscf.in`` file is essentially identical to the `scf.in` file, except that we:

* Modified the parameter ``calculation = "bands"``, which indicates to QE that we will use the charge density computed in Step 2 to recompute the wavefunctions.
You can check that the ``bands.in`` file is essentially identical to the ``scf.in`` file, except that we:

* Modified the parameter ``calculation = "bands"``, which indicates to QE that we will use the charge density computed in Step 2 to recompute the wavefunctions. Don't set this parameter to ``"nscf"``.

* Instead of using the keyword ``K_POINTS automatic, 6 6 6 0 0 0``, we explicitly write the coordinates of all :math:`6^3` k-points. These can be generated using the helper script provided by Wannier90, ``q-e/wannier90-3.0.0/utility/kmesh.pl``, run on the command line by specifying the k-mesh used in the scf calculation. For example, ``kmesh.pl 6 6 6`` will produce the k-point list.
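
A minimal sketch of this step, using the same placeholder paths as the other commands in this tutorial (adapt them to your installation)::

   # write the explicit list of 6^3 = 216 k-points and append it to bands.in
   /path/to/phoebe-quantum-espresso/wannier90-3.0.0/utility/kmesh.pl 6 6 6 > kpoints.txt
   cat kpoints.txt >> bands.in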

To run it, type::

mpirun -np 4 /path/to/phoebe-quantum-espresso/bin/pw.x -in nscf.in > nscf.out
mpirun -np 4 /path/to/phoebe-quantum-espresso/bin/pw.x -in bands.in > bands.out


Step 6: Wannierization
@@ -491,3 +491,4 @@ The sections on parallelization discussed for the phonon transport app apply to

* **For any calculation where memory is an issue:** To parallelize your calculation for cases where memory is an issue, set the number of MPI processes equal to the number of nodes, and set the number of OMP threads equal to the number of cores on each node. This will allow each process to use all the memory on a node, while still getting a parallel performance benefit from the OMP threads. If applicable, the number of GPUs should match the number of MPI processes. A sketch of such a setup follows this list.

* **Optimize the MAXMEM parameter:** ``MAXMEM`` tunes the memory used during the interpolation of the el-ph coupling. For CPU-only runs, ``MAXMEM`` isn't critical to performance and can be set to a relatively small value (e.g. 1 GB), much smaller than the memory available to each MPI process. For GPU-accelerated runs, set ``MAXMEM`` to the GPU on-board memory (e.g. ``export MAXMEM=16`` to tell Phoebe that the GPU has 16 GB of on-board memory).
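
For example, on two nodes with 32 physical cores each, a memory-bound CPU-only run could be launched as in the following sketch; the Open MPI mapping flag, core count, executable path, and input-file name are placeholders to adapt to your machine::

   export OMP_NUM_THREADS=32          # one OpenMP thread per physical core of a node
   export MAXMEM=1                    # CPU-only run: a small value (in GB) is enough
   # one MPI process per node, so each process can address the whole node's memory
   mpirun -np 2 --map-by ppr:1:node /path/to/phoebe/build/phoebe -in elWanTransport.in > elWanTransport.out
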
9 changes: 6 additions & 3 deletions doc/sphinx/source/tutorials/phononTransport.rst
@@ -280,8 +280,8 @@ In this tutorial we show a demo calculation, which is unconverged for the sake o



Parallelization
----------------
Parallelization and performance
-------------------------------

As mentioned above, for the ``qeToPhoebe`` calculation the primary method of parallelization is over OMP threads, as this calculation can be memory intensive and OMP helps to alleviate this. For this reason, the code is written to speed up as more OMP threads are used.
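
For instance, this step could be launched with a single MPI process and many OpenMP threads; this is only a sketch, and the executable path and input-file name are placeholders::

   export OMP_NUM_THREADS=16          # as many threads as physical cores on the node
   mpirun -np 1 /path/to/phoebe/build/phoebe -in qeToPhoebe.in > qeToPhoebe.out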

@@ -291,7 +291,8 @@ Phoebe takes advantage of three different parallelization schemes for the phonon

* **MPI parallelization.** We distinguish two cases. If we want to compute the action of the scattering matrix on a vector, :math:`\sum_{k'b'} A_{k,k',b,b'} f_{k'b'}`, we MPI-distribute over rows of wavevectors to achieve the best performance. If we want to store the matrix in memory, we parallelize over pairs of wavevectors using the ScaLAPACK layout. This distributes the scattering matrix in memory, reducing the required memory per process, and also speeds up operations on the matrix.

* **Kokkos parallelization.** The calculation of the phonon-phonon coupling required by the phonon transport app can also be accelerated with Kokkos. Depending on your architecture and installation parameters, Kokkos will either run on GPUs, or CPUs with OpenMP acceleration. In the former case, remember to set the environment variable ``export MAXMEM=4`` in the job submission script, or in the command line, to set the available GPU on-board memory (4GB in this example).
* **Kokkos acceleration.** The calculation of the phonon-phonon coupling required by the phonon transport app can also be accelerated with Kokkos. Depending on your architecture and installation parameters, Kokkos will either run on GPUs, or CPUs with OpenMP acceleration.
Especially for GPU-accelerated runs, remember to set the environment variable ``MAXMEM`` (e.g. ``export MAXMEM=4``) in the job submission script, or on the command line, to specify the available GPU on-board memory (4 GB in this example). For CPU-only runs, ``MAXMEM`` can instead be set to a small value, no greater than the memory available to an MPI process.

* **OpenMP parallelization.** The summations over band indices when computing the scattering rates are accelerated using OpenMP; increase the environment variable ``OMP_NUM_THREADS`` to use more threads.

@@ -302,6 +302,8 @@ Phoebe takes advantage of three different parallelization schemes for the phonon
* Set the number of OpenMP threads equal to the number of physical cores available on each computing node. This will accelerate the band summations while still having these processes share the memory of the node.

* Compile Phoebe with Kokkos. If you do so, make sure that the number of GPUs you are using matches the number of MPI processes. If you don't have a GPU, Kokkos can still accelerate the phonon-phonon calculations via the number of OpenMP threads you've set.
For CPU-only runs, set ``MAXMEM`` to a value at most equal to the memory available to an MPI process (in this case ``MAXMEM`` has little impact on performance, but a larger value significantly increases memory usage).
For GPU-accelerated runs, set ``MAXMEM`` equal to the total memory available on the GPU.
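
As an illustration of the Kokkos/GPU case, a hypothetical Slurm job for one node with 4 GPUs (16 GB of on-board memory each) could look like the following sketch; all directives, counts, and paths are assumptions to adapt to your cluster::

   #!/bin/bash
   #SBATCH --nodes=1
   #SBATCH --ntasks-per-node=4      # one MPI process per GPU
   #SBATCH --gpus-per-node=4
   #SBATCH --cpus-per-task=8

   export OMP_NUM_THREADS=8
   export MAXMEM=16                 # GPU on-board memory, in GB

   mpirun -np 4 /path/to/phoebe/build/phoebe -in phononTransport.in > phononTransport.out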


Tradeoff between speed and memory
@@ -1,5 +1,5 @@
&control
calculation = "nscf"
calculation = "bands"
restart_mode = "from_scratch"
prefix = "silicon"
pseudo_dir = "../../pseudoPotentials/"
2 changes: 1 addition & 1 deletion example/Silicon-el/qe-elph/runMe.sh
@@ -11,7 +11,7 @@ mpirun -np $NMPI $QE_PATH/pw.x -npool $NPOOL -in scf.in > scf.out
mpirun -np $NMPI $QE_PATH/ph.x -npool $NPOOL -in ph.in > ph.out
$QE_PATH/q2r.x -in q2r.in > q2r.out

mpirun -np $NMPI $QE_PATH/pw.x -npool $NPOOL -in nscf.in > nscf.out
mpirun -np $NMPI $QE_PATH/pw.x -npool $NPOOL -in bands.in > bands.out
$QE_PATH/wannier90.x -pp si
mpirun -np $NMPI $QE_PATH/pw2wannier90.x -in pw2wan.in > pw2wan.out
$QE_PATH/wannier90.x si
@@ -1,9 +1,10 @@
&control
calculation = "nscf"
calculation = "bands"
restart_mode = "from_scratch"
prefix = "silicon"
pseudo_dir = "../../pseudoPotentials/"
outdir = "./out"
verbosity='high'
/
&system
ibrav = 2
2 changes: 1 addition & 1 deletion example/Silicon-epa/qe-elph/runMe.sh
@@ -10,4 +10,4 @@ export NPOOL=4
mpirun -np $NMPI $QE_PATH/pw.x -npool $NPOOL -in scf.in > scf.out
mpirun -np $NMPI $QE_PATH/ph.x -npool $NPOOL -in ph.in > ph.out
$QE_PATH/q2r.x -in q2r.in > q2r.out
mpirun -np $NMPI $QE_PATH/pw.x -npool $NPOOL -in nscf.in > nscf.out
mpirun -np $NMPI $QE_PATH/pw.x -npool $NPOOL -in bands.in > bands.out
7 changes: 4 additions & 3 deletions lib/CMakeLists.txt
@@ -4,9 +4,10 @@ include(ExternalProject)
include(FetchContent)

FetchContent_Declare(googletest
GIT_REPOSITORY https://github.com/google/googletest.git
SOURCE_DIR ${CMAKE_CURRENT_BINARY_DIR}/googletest
UPDATE_COMMAND ""
GIT_REPOSITORY https://github.com/google/googletest.git
GIT_TAG "release-1.11.0"
SOURCE_DIR ${CMAKE_CURRENT_BINARY_DIR}/googletest
UPDATE_COMMAND ""
)
FetchContent_MakeAvailable(googletest)
