Merge pull request #134 from mir-group/phonopy_fix
Phonopy fix
cepellotti authored Nov 16, 2021
2 parents ae7196b + ee31172 commit 6123485
Showing 8 changed files with 33 additions and 23 deletions.
18 changes: 11 additions & 7 deletions doc/sphinx/source/tutorials/elEpaTransport.rst
@@ -22,7 +22,7 @@ From an installation folder of your choice, type::
git clone https://github.com/mir-group/phoebe-quantum-espresso.git
cd phoebe-quantum-espresso
# install it
git checkout phoebe-qe-6.6
git checkout phoebe-qe-6.7.0
./configure MPIF90=mpif90 --with-scalapack=yes
make pw pp ph w90

@@ -163,13 +163,14 @@ In the working folder ``./example/Silicon-epa/qe-elph`` run the command::
If the code ran successfully, you should see a new file ``silicon.fc``.


Step 5: Run nscf
-----------------
Step 5: Non-self-consistent run
-------------------------------

Before we can run Phoebe, we need to complete one more step using Quantum ESPRESSO. We need to use an nscf run to calculate the electronic properties on the k-point mesh. We do so using the input file in the ``Silicon-epa`` example folder::
Before we can run Phoebe, we need to complete one more step using Quantum ESPRESSO. We need to use a non-self-consistent run to calculate the electronic properties on the k-point mesh.
We do so using the input file ``bands.in`` in the ``Silicon-epa`` example folder::

&control
calculation = "nscf"
calculation = "bands"
restart_mode = "from_scratch"
prefix = "silicon"
pseudo_dir = "../../pseudoPotentials/"
@@ -197,11 +197,14 @@ Before we can run Phoebe, we need to complete one more step using Quantum ESPRES
0.00000000 0.00000000 0.16666667 4.629630e-03
...

where the k-points list will continue for all 216 points. To generate this k-point list, one could use the ``kmesh.pl`` utility from Wannier90 (in the directory ``q-e/wannier90-3.0.0/utility/kmesh.pl``, used as ``kmesh.pl nk1 nk2 nk3``, with the output appended to the end of ``nscf.in``).
where the k-point list continues for all 216 points. To generate this k-point list, one could use the ``kmesh.pl`` utility from Wannier90 (found at ``q-e/wannier90-3.0.0/utility/kmesh.pl`` and used as ``kmesh.pl nk1 nk2 nk3``), with the output appended to the end of ``bands.in``.
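
For instance, for the 6x6x6 mesh of this example the list could be generated and appended as in the following sketch; the path to the patched QE tree is a placeholder, and ``bands.in`` should not already contain another ``K_POINTS`` card::

   # kmesh.pl prints a "K_POINTS crystal" block listing the 216 weighted k-points
   /path/to/patched-quantum-espresso/wannier90-3.0.0/utility/kmesh.pl 6 6 6 >> bands.in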

.. note::
The ``calculation`` parameter should be set to ``bands`` and not ``nscf``.

We run this as we did the ``pw.x`` step::

mpirun -np 4 /path/to/patched-quantum-espresso/bin/pw.x -npool 4 -in nscf.in > nscf.out
mpirun -np 4 /path/to/patched-quantum-espresso/bin/pw.x -npool 4 -in bands.in > bands.out

where again this could be parallelized using ``mpi`` and ``npool``.

13 changes: 7 additions & 6 deletions doc/sphinx/source/tutorials/elWanTransport.rst
@@ -179,20 +179,20 @@ If the code ran successfully, you should see a new file ``silicon.fc``.



Step 5: Run nscf
-----------------
Step 5: Non-self-consistent run
-------------------------------

We now start the process of Wannierizing the electronic band structure.
Before running Wannier90, we need to compute the electronic band structure on the full grid of k-points as a starting point for the Wannier calculation.
You can check that the ``nscf.in`` file is essentially identical to the `scf.in` file, except that we:

* Modified the parameter ``calculation = "bands"``, which indicates to QE that we will use the charge density computed in Step 2 to recompute the wavefunctions.
You can check that the ``bands.in`` file is essentially identical to the ``scf.in`` file, except that we:

* Modified the parameter ``calculation = "bands"``, which indicates to QE that we will use the charge density computed in Step 2 to recompute the wavefunctions. Don't set this parameter to ``"nscf"``.

* Instead of using the keyword ``K_POINTS automatic, 6 6 6 0 0 0``, we explicitly write the coordinates of all :math:`6^3` k-points. These can be generated using the helper script provided by Wannier90, ``q-e/wannier90-3.0.0/utility/kmesh.pl``, run on the command line by specifying the k-mesh used in the scf calculation. For example, ``kmesh.pl 6 6 6`` will produce the k-point list.
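
A minimal sketch of this step, using the same placeholder paths as the other commands in this tutorial (adapt them to your installation)::

   # write the explicit list of 6^3 = 216 k-points and append it to bands.in
   /path/to/phoebe-quantum-espresso/wannier90-3.0.0/utility/kmesh.pl 6 6 6 > kpoints.txt
   cat kpoints.txt >> bands.in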

To run it, type::

mpirun -np 4 /path/to/phoebe-quantum-espresso/bin/pw.x -in nscf.in > nscf.out
mpirun -np 4 /path/to/phoebe-quantum-espresso/bin/pw.x -in bands.in > bands.out


Step 6: Wannierization
@@ -491,3 +491,4 @@ The sections on parallelization discussed for the phonon transport app apply to

* **For any calculation where memory is an issue:** To parallelize your calculation for cases where memory is an issue, set the number of MPI processes equal to the number of nodes, and set the number of OMP threads equal to the number of cores on each node. This will allow each process to use all the memory on a node, while still getting a parallel performance benefit from the OMP threads. If applicable, the number of GPUs should match the number of MPI processes. A sketch of such a setup follows this list.

* **Optimize the MAXMEM parameter:** ``MAXMEM`` tunes the memory used during the interpolation of the el-ph coupling. For CPU-only runs, ``MAXMEM`` isn't critical to performance and can be set to a relatively small value (e.g. 1 GB), much smaller than the memory available to each MPI process. For GPU-accelerated runs, set ``MAXMEM`` to the GPU on-board memory (e.g. ``export MAXMEM=16`` to tell Phoebe that the GPU has 16 GB of on-board memory).
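
For example, on two nodes with 32 physical cores each, a memory-bound CPU-only run could be launched as in the following sketch; the Open MPI mapping flag, core count, executable path, and input-file name are placeholders to adapt to your machine::

   export OMP_NUM_THREADS=32          # one OpenMP thread per physical core of a node
   export MAXMEM=1                    # CPU-only run: a small value (in GB) is enough
   # one MPI process per node, so each process can address the whole node's memory
   mpirun -np 2 --map-by ppr:1:node /path/to/phoebe/build/phoebe -in elWanTransport.in > elWanTransport.out
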
9 changes: 6 additions & 3 deletions doc/sphinx/source/tutorials/phononTransport.rst
@@ -280,8 +280,8 @@ In this tutorial we show a demo calculation, which is unconverged for the sake o



Parallelization
----------------
Parallelization and performance
-------------------------------

As mentioned above, for the ``qeToPhoebe`` calculation the primary method of parallelization is over OMP threads, as this calculation can be memory intensive and OMP helps to alleviate this. For this reason, the code is written to speed up as more OMP threads are used.
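
For instance, this step could be launched with a single MPI process and many OpenMP threads; this is only a sketch, and the executable path and input-file name are placeholders::

   export OMP_NUM_THREADS=16          # as many threads as physical cores on the node
   mpirun -np 1 /path/to/phoebe/build/phoebe -in qeToPhoebe.in > qeToPhoebe.out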

@@ -291,7 +291,8 @@ Phoebe takes advantage of three different parallelization schemes for the phonon

* **MPI parallelization.** We distinguish two cases. If we want to compute the action of the scattering matrix on a vector, :math:`\sum_{k'b'} A_{k,k',b,b'} f_{k'b'}`, we MPI-distribute over rows of wavevectors to achieve the best performance. If we want to store the matrix in memory, we parallelize over pairs of wavevectors using the ScaLAPACK layout. This distributes the scattering matrix in memory, reducing the required memory per process, and also speeds up operations on the matrix.

* **Kokkos parallelization.** The calculation of the phonon-phonon coupling required by the phonon transport app can also be accelerated with Kokkos. Depending on your architecture and installation parameters, Kokkos will either run on GPUs, or CPUs with OpenMP acceleration. In the former case, remember to set the environment variable ``export MAXMEM=4`` in the job submission script, or in the command line, to set the available GPU on-board memory (4GB in this example).
* **Kokkos acceleration.** The calculation of the phonon-phonon coupling required by the phonon transport app can also be accelerated with Kokkos. Depending on your architecture and installation parameters, Kokkos will either run on GPUs, or CPUs with OpenMP acceleration.
Especially for GPU-accelerated runs, remember to set the environment variable ``MAXMEM`` (e.g. ``export MAXMEM=4``) in the job submission script, or on the command line, to specify the available GPU on-board memory (4 GB in this example). For CPU-only runs, ``MAXMEM`` can instead be set to a small value, no greater than the memory available to an MPI process.

* **OpenMP parallelization.** The summations over band indices when computing the scattering rates are accelerated using OpenMP; increase the environment variable ``OMP_NUM_THREADS`` to use more threads.

@@ -302,6 +302,8 @@ Phoebe takes advantage of three different parallelization schemes for the phonon
* Set the number of OpenMP threads equal to the number of physical cores available on each computing node. This will accelerate the band summations while still having these processes share the memory of the node.

* Compile Phoebe with Kokkos. If you do so, make sure that the number of GPUs you are using matches the number of MPI processes. If you don't have a GPU, Kokkos can still accelerate the phonon-phonon calculations via the number of OpenMP threads you've set.
For CPU-only runs, set ``MAXMEM`` to a value at most equal to the memory available to an MPI process (in this case ``MAXMEM`` has little impact on performance, but a larger value significantly increases memory usage).
For GPU-accelerated runs, set ``MAXMEM`` equal to the total memory available on the GPU.
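
As an illustration of the Kokkos/GPU case, a hypothetical Slurm job for one node with 4 GPUs (16 GB of on-board memory each) could look like the following sketch; all directives, counts, and paths are assumptions to adapt to your cluster::

   #!/bin/bash
   #SBATCH --nodes=1
   #SBATCH --ntasks-per-node=4      # one MPI process per GPU
   #SBATCH --gpus-per-node=4
   #SBATCH --cpus-per-task=8

   export OMP_NUM_THREADS=8
   export MAXMEM=16                 # GPU on-board memory, in GB

   mpirun -np 4 /path/to/phoebe/build/phoebe -in phononTransport.in > phononTransport.out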


Tradeoff between speed and memory
@@ -1,5 +1,5 @@
&control
calculation = "nscf"
calculation = "bands"
restart_mode = "from_scratch"
prefix = "silicon"
pseudo_dir = "../../pseudoPotentials/"
2 changes: 1 addition & 1 deletion example/Silicon-el/qe-elph/runMe.sh
@@ -11,7 +11,7 @@ mpirun -np $NMPI $QE_PATH/pw.x -npool $NPOOL -in scf.in > scf.out
mpirun -np $NMPI $QE_PATH/ph.x -npool $NPOOL -in ph.in > ph.out
$QE_PATH/q2r.x -in q2r.in > q2r.out

mpirun -np $NMPI $QE_PATH/pw.x -npool $NPOOL -in nscf.in > nscf.out
mpirun -np $NMPI $QE_PATH/pw.x -npool $NPOOL -in bands.in > bands.out
$QE_PATH/wannier90.x -pp si
mpirun -np $NMPI $QE_PATH/pw2wannier90.x -in pw2wan.in > pw2wan.out
$QE_PATH/wannier90.x si
@@ -1,9 +1,10 @@
&control
calculation = "nscf"
calculation = "bands"
restart_mode = "from_scratch"
prefix = "silicon"
pseudo_dir = "../../pseudoPotentials/"
outdir = "./out"
verbosity='high'
/
&system
ibrav = 2
2 changes: 1 addition & 1 deletion example/Silicon-epa/qe-elph/runMe.sh
@@ -10,4 +10,4 @@ export NPOOL=4
mpirun -np $NMPI $QE_PATH/pw.x -npool $NPOOL -in scf.in > scf.out
mpirun -np $NMPI $QE_PATH/ph.x -npool $NPOOL -in ph.in > ph.out
$QE_PATH/q2r.x -in q2r.in > q2r.out
mpirun -np $NMPI $QE_PATH/pw.x -npool $NPOOL -in nscf.in > nscf.out
mpirun -np $NMPI $QE_PATH/pw.x -npool $NPOOL -in bands.in > bands.out
7 changes: 4 additions & 3 deletions lib/CMakeLists.txt
@@ -4,9 +4,10 @@ include(ExternalProject)
include(FetchContent)

FetchContent_Declare(googletest
GIT_REPOSITORY https://github.com/google/googletest.git
SOURCE_DIR ${CMAKE_CURRENT_BINARY_DIR}/googletest
UPDATE_COMMAND ""
GIT_REPOSITORY https://github.com/google/googletest.git
GIT_TAG "release-1.11.0"
SOURCE_DIR ${CMAKE_CURRENT_BINARY_DIR}/googletest
UPDATE_COMMAND ""
)
FetchContent_MakeAvailable(googletest)
