Update actions checkout and setup-python (fixes warning). #277

fdmalone · 2023-12-05T20:04:48Z

The failing MPI tests were a red herring. The examples were failing because of #278; this signalled the other tests to fail, but I'm guessing the error was only received on the root process, so the MPI jobs would hang. I've set a timeout for these for the moment, but there may be a more sensible thing to do.
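The comment does not show how the timeout is configured, so the following is only a minimal sketch of the general idea in Python: run a command that may hang and give up after a fixed number of seconds. The helper name `run_with_timeout` and the sleeping child process are hypothetical stand-ins for illustration, not part of this PR.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd; return its exit code, or None if it ran longer than timeout_s seconds."""
    try:
        completed = subprocess.run(cmd, timeout=timeout_s, check=False)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before re-raising the timeout.
        return None
    return completed.returncode


if __name__ == "__main__":
    # Stand-in for a hung job: a child process that sleeps far longer than the limit.
    hung_job = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(hung_job, timeout_s=1))
```

A timeout like this turns a hang into a visible failure, but it does not explain why the non-root processes never saw the error.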

fdmalone · 2023-12-10T01:20:55Z

I found several problems:

- openmpi >= 4.1.5 has an issue with bcast; see open-mpi/ompi#11478 ("Deadlock in mpi_bcast using openmpi 4.1.5 from fortran"). Using the fix suggested there in the mpirun launch command fixes the deadlocks.
- Example 02 is flaky: it fails on CI (zero walker weight), but I can't reproduce it reliably locally.
- mpich errors on finalize (so we can't replace openmpi with mpich in CI).

1. Fix the msd-afqmc Green's function on GPU: change `walker_batch.Ga.fill(0.0 + 0.0j)` to `walker_batch.Ga = xp.zeros_like(walker_batch.Ga)`, since cupy does not have cupy.ndarray.fill.
2. Fix the initial walker of the msd trial, from
   ```
   elif isinstance(trial, ParticleHole):
       initial_walker = numpy.hstack([trial.psi0a, trial.psi0b])
   ```
   to
   ```
   elif isinstance(trial, ParticleHole):
       initial_walker = numpy.hstack([trial.psi0a, trial.psi0b])
       random_walker = numpy.random.random(initial_walker.shape)
       initial_walker = initial_walker + random_walker
       initial_walker, _ = numpy.linalg.qr(initial_walker)
   ```
   Otherwise this causes issues.
3. Fix the integration. In #277, the msd example was disabled; it is now fixed by the second point above.
4. Modify the example for running msd-afqmc with MPI / GPU.
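Items 1 and 2 above can be tried in isolation with plain NumPy. This is a minimal sketch, not the library's code: the function names, the `rng` argument, and the toy 6x2 orbital matrices are assumptions made for illustration, and `xp` is taken to be an array module such as `numpy` or `cupy`.

```python
import numpy


def perturb_and_orthonormalize(psi0a, psi0b, rng=None):
    """Stack alpha/beta orbitals, add random noise, then re-orthonormalize with QR."""
    rng = numpy.random.default_rng() if rng is None else rng
    initial_walker = numpy.hstack([psi0a, psi0b])
    random_walker = rng.random(initial_walker.shape)
    initial_walker = initial_walker + random_walker
    # Reduced QR: q keeps the input shape and has orthonormal columns.
    q, _ = numpy.linalg.qr(initial_walker)
    return q


def reset_greens_function(ga, xp=numpy):
    """Return a zeroed array with the same shape and dtype as ga (item 1's replacement)."""
    return xp.zeros_like(ga)


if __name__ == "__main__":
    # Toy orbitals (hypothetical): 6 basis functions, 2 alpha and 2 beta electrons.
    psi0a = numpy.eye(6)[:, :2]
    psi0b = numpy.eye(6)[:, :2]
    walker = perturb_and_orthonormalize(psi0a, psi0b, numpy.random.default_rng(0))
    print(walker.shape)
    print(numpy.allclose(walker.T @ walker, numpy.eye(4)))
```

Note that the QR step orthonormalizes all stacked columns together, alpha and beta alike, which matches the snippet in item 2.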

fdmalone added 12 commits December 5, 2023 20:04

Update actions checkout and setup-python (fixes warning).

fdbb5d6

Fix typo.

b7660d2

Update actions.

4efa5ec

Add missing comment.

f6fe461

Correct timeout location and don't fail fast for mpi tests.

8ed0fb4

Test.

d5a2599

Fix import.

05293c7

Add scf input.

c88688e

More logging.

df5e74a

Restore actions.

3f29206

Remove old readme.

cb152ab

Temporarily stop example from running.

84f8631

fdmalone force-pushed the update_actions branch from 9e7558a to 84f8631 December 10, 2023 01:18

fdmalone merged commit e1c4854 into develop Dec 10, 2023
7 checks passed

fdmalone deleted the update_actions branch December 10, 2023 01:23

jiangtong1000 mentioned this pull request Dec 5, 2024

bugfix of gpu msd-afqmc; fix tutorial #327

Merged
