PDVec (#183)
* Move `fciqmc_step!` into Interfaces (?)

* Rework initiator dvec

* Move operator-vector function to DictVectors

* Add PDVecs and friends

* some fixes and reworking

* Fix MPI communication

* Fix mpi_recv_any signature

* Add tests

* Fix bugs with dot and initiators, add MPI tests

* Remove annihilation tracking workaround

* Benchmark PDVec instead

* Some cleaning in Hamiltonians, make PDVec the default.

* Fix error caused by moving things around

* Doc updates

* Move hacks to pdworkingmemory

* temp

* Fix up after rebase

* Change which project mpi_runtests is run from

* instantiate MPI deps before test

* Another attempt to fix MPI CI

* Attempt to fix MPI CI again

* using -> import

* Add debugging info to job

* Fix benchmark, do MPI first

* MPI tests without MPI

* Fix project paths

* Fix install of MPI deps

* Fix deps

* Fix Pkg stuff

* Fix deprecations in RMPI

* Restore ordering, fixing errors.

* Fix three-argument dotting issues

* Benchmark with threads=auto

* Attempt to add coverage for MPI

* Fix more depwarns in RMPI

* Fix even more depwarns; fix benchmark

* Fix depwarn

* Fix MPI probing

* Split testing into phases

* Fix yml

* Re-rework actions

* Change MPI coverage tracking

* Remove dead code

* Removed too much dead code

* Remove foreach calls that modify LHS.

* Reordering, more docs

* Fix the doctest

* Fix tests

* Fix bug in AllOverlaps

* Change example to regular code listing

* Minor fix in tests, bugfix

* skip some tests

* Improve coverage and fix bugs

* Remove folds from testset

* Add distributed dot tests, remove dead code, reenable threads in testing

* fix yml

* Fix randomness in dot tests

* Docstring fixup

* Alleviate type instability

* Make example numerically stable

* Refix type instability

* Fix doctests

* enable colors, don't test PDVec docstring

* Update src/DictVectors/communicators.jl

Co-authored-by: christofbradly <[email protected]>

* Update src/strategies_and_params/replicastrategy.jl

Co-authored-by: christofbradly <[email protected]>

* Change default vector choice, include it in docs.

* Rename functions

* Fix up docs

* Keep executor in similar, fix `length` for working memory

* Use PDVec in MPI example

* Add an extension for using Rimu with KrylovKit (#187)

* Add an extension for using Rimu with KrylovKit

* Add the actual extension

* Loosen the version requirement

* Add KrylovKit to [extras]

* Fix broken dot

* Use starting address that does not produce a vector that is orthogonal to the groundstate

* Remove method that provides starting vector.

* apply bitstring fix

* `PDVec` with tuples (#188)

* Replace vectors with tuples, remove executors

* add fix

* doc tweak

* Remove the option of setting the number of segments

* Fix bugs and tests

* single-segment optimization

* fix depwarn

* remove rogue num_segments

* improve test coverage

* fix mpi test

* fix some dots

* Merge branch 'feature/pdvec' of github.com:joachimbrand/Rimu.jl into feature/pdvec

* Fix equality for `DVec`

* Fix constructor and tests

* Make bangbang functions work properly

* checkpoint

* checkpoint

* set version for release

* Fixes after merge

* Update Project.toml version

* Fix KrylovKit extension

* fix MPI tests

* Fix mul with deterministic vectors

* `add!` -> `dict_add!`

* docstring for `dict_add!`, remove `propagators.jl`

* Remove mentions of `OperatorMulPropagator`

* attempt at fixing type instability

* Introduce FrozenPDVec for faster projected energy

* Small fixes

* Fix diagonal operator spawning

* fix some type instabilities

* Rework benchmarks - trial run

* Update Project.toml

* Update tune file

* re-update benchmarks

* Remove type stability enforcing

* Add see also to all DVec types

* CI setup

* Remove unused parameter

* Fix compression bug

* Fix the benchmark setup

* Fix docs issues

* Fix compression issues

* Doc improvements

* Fix bug in tests, docstring

* Docstring fix

* Remove unused dependency, clean up example script

* Review comments

* Add MPI docs, review comments

* Apply suggestions from code review

Co-authored-by: Joachim Brand <[email protected]>
Co-authored-by: christofbradly <[email protected]>

* Update MPI doc

* add `PDVecKeys/Vals/Pairs` docstrings

* filter(!), out-of-place map, fix potential bug in map!

* Fix doc links

* Fix new code

* Another bugfix

* One more

* Fix bug in filter!

* Fix wrong method call

* Remove type from `f` in `filter`

* Fix method errors

* Update docs/src/mpi.md

Co-authored-by: christofbradly <[email protected]>

* bump version

* bump version

---------

Co-authored-by: christofbradly <[email protected]>
Co-authored-by: Joachim Brand <[email protected]>
Co-authored-by: Joachim Brand <[email protected]>
4 people authored Nov 10, 2023
1 parent b836c95 commit 2825da8
Showing 32 changed files with 2,243 additions and 206 deletions.
10 changes: 9 additions & 1 deletion Project.toml
@@ -1,7 +1,7 @@
name = "Rimu"
uuid = "c53c40cc-bd84-11e9-2cf4-a9fde2b9386e"
authors = ["Joachim Brand <[email protected]>"]
version = "0.9.1-dev"
version = "0.10.0"

[deps]
Arrow = "69666777-d1a9-59fb-9406-91d4454c9d45"
@@ -39,6 +39,12 @@ TerminalLoggers = "5d786b92-1e48-4d6f-9151-6b4477ca9bed"
TupleTools = "9d95972d-f1c8-5527-a6e0-b4b365fa01f6"
VectorInterface = "409d34a3-91d5-4945-b6ec-7529ddf182d8"

[weakdeps]
KrylovKit = "0b1a1467-8014-51b9-945f-bf0ae24f4b77"

[extensions]
KrylovKitExt = "KrylovKit"

[compat]
Arrow = "1.5, 2"
BSON = "0.3"
@@ -50,6 +56,7 @@ Distributions = "0.25"
FFTW = "1"
Folds = "0.2"
HypergeometricFunctions = "0.3"
KrylovKit = "0.6"
MPI = "0.20"
MacroTools = "0.5"
Measurements = "2"
@@ -72,6 +79,7 @@ VectorInterface = "0.2, 0.3, 0.4"
julia = "1.7"

[extras]
KrylovKit = "0b1a1467-8014-51b9-945f-bf0ae24f4b77"
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"

[targets]
18 changes: 9 additions & 9 deletions benchmark/benchmarks.jl
@@ -9,16 +9,16 @@ const SUITE = @benchmarkset "Rimu" begin
M = 16
addr = FermiFS2C(M, 1 => 1, 2 => 1, 1 => -1, 2 => -1)
ham = HubbardRealSpace(addr; geometry=PeriodicBoundaries(4, 4))
dv = DVec(addr => 1.0)
eigsolve(ham, dv, 1, :SR; issymmetric=true, tol=1e-9)
dv = PDVec(addr => 1.0)
eigsolve(ham, dv, 1, :SR; tol=1e-9)
end seconds=30

@case "Bose-Hubbard in momentum space" begin
M = N = 10
addr = BoseFS(M, M ÷ 2 => N)
ham = HubbardMom1D(addr; u=6.0)
dv = DVec(addr => 1.0)
eigsolve(ham, dv, 1, :SR; issymmetric=true, tol=1e-9)
dv = PDVec(addr => 1.0)
eigsolve(ham, dv, 1, :SR; tol=1e-9)
end seconds=40
end

@@ -28,7 +28,7 @@ const SUITE = @benchmarkset "Rimu" begin
M = 20
addr = BoseFS(M, M÷2 => 10)
ham = HubbardMom1D(addr; u=6.0)
dv1 = DVec(addr => 1.0)
dv1 = PDVec(addr => 1.0)
dv2 = zerovector(dv1)
mul!(dv2, ham, dv1)
mul!(dv1, ham, dv2)
@@ -42,7 +42,7 @@ const SUITE = @benchmarkset "Rimu" begin
M = 30
addr = FermiFS2C(M, M÷2-1 => 1, M => 1, M÷2 => -1, M÷2+1 => -1)
ham = Transcorrelated1D(addr)
dv1 = DVec(addr => 1.0)
dv1 = PDVec(addr => 1.0)
dv2 = zerovector(dv1)
mul!(dv2, ham, dv1)
mul!(dv1, ham, dv2)
@@ -56,7 +56,7 @@ const SUITE = @benchmarkset "Rimu" begin
@case "(10, 20) Mom space with projected energy and initiator" begin
addr = BoseFS(20, 10 => 10)
ham = HubbardMom1D(addr, u=1.0)
dv = InitiatorDVec(addr => 1.0; style=IsDynamicSemistochastic())
dv = PDVec(addr => 1.0; style=IsDynamicSemistochastic(), initiator=true)
post_step = ProjectedEnergy(ham, dv)
s_strat = DoubleLogUpdate(targetwalkers=40_000)

@@ -66,7 +66,7 @@ const SUITE = @benchmarkset "Rimu" begin
@case "(4+1, 11) 2C Mom space with G2Correlators" begin
addr = BoseFS2C(ntuple(i -> ifelse(i == 5, 4, 0), 11), ntuple(==(5), 11))
ham = BoseHubbardMom1D2C(addr, v=0.1)
dv = DVec(addr => 1.0f0; style=IsDynamicSemistochastic{Float32}())
dv = PDVec(addr => 1.0f0; style=IsDynamicSemistochastic{Float32}())
s_strat = DoubleLogUpdate(targetwalkers=10_000)
replica = AllOverlaps(2, ntuple(i -> G2Correlator(i - 1), 7))

@@ -76,7 +76,7 @@ const SUITE = @benchmarkset "Rimu" begin
@case "(50, 50) Real space" begin
addr = near_uniform(BoseFS{50,50})
ham = HubbardReal1D(addr, u=6.0)
dv = DVec(addr => 1.0; style=IsDynamicSemistochastic())
dv = PDVec(addr => 1.0; style=IsDynamicSemistochastic())
s_strat = DoubleLogUpdate(targetwalkers=50_000)

lomc!(ham, dv; s_strat, dτ=1e-4, laststep=1000)
1 change: 1 addition & 0 deletions docs/make.jl
@@ -50,6 +50,7 @@ makedocs(;
"Examples" => EXAMPLES_PAIRS[sortperm(EXAMPLES_NUMS)],
"User documentation" => [
"StatsTools" => "statstools.md",
"Using MPI" => "mpi.md",
],
"Developer documentation" => [
"Interfaces" => "interfaces.md",
19 changes: 19 additions & 0 deletions docs/src/dictvectors.md
@@ -10,6 +10,7 @@ AbstractDVec
```@docs
DVec
InitiatorDVec
PDVec
```

## Interface functions
@@ -52,6 +53,8 @@ NormProjector
Norm2Projector
UniformProjector
Norm1ProjectorPPop
Rimu.DictVectors.FrozenDVec
Rimu.DictVectors.FrozenPDVec
```

## Initiator rules
@@ -70,6 +73,22 @@ Rimu.DictVectors.NonInitiator
Rimu.DictVectors.NonInitiatorValue
```

## `PDVec` internals

### Working memory

```@autodocs
Modules = [DictVectors]
Pages = ["pdworkingmemory.jl"]
```

### Communicators

```@autodocs
Modules = [DictVectors]
Pages = ["communicators.jl"]
```

## Index
```@index
Pages = ["dictvectors.md"]
11 changes: 6 additions & 5 deletions docs/src/index.md
@@ -25,7 +25,7 @@ The code supports parallelisation with MPI (harnessing [`MPI.jl`](https://github

**Concept:** Joachim Brand and Elke Pahl.

**Contributors:** Joachim Brand, Elke Pahl, Mingrui Yang, Matija Cufar, Chris Bradly.
**Contributors:** Joachim Brand, Elke Pahl, Mingrui Yang, Matija Čufar, Chris Bradly.

Discussions, help, and additional contributions are acknowledged by Ali Alavi,
Didier Adrien, Chris Scott (NeSI), Alexander Pletzer (NeSI).
@@ -110,10 +110,11 @@ on the same hardware makes use of 4 cores and the main part completes in 1.04
seconds, a speedup factor of 2.6. This seems reasonable, given that extra work
needs to be done for communicating between different processes.

Using MPI parallelism with `Rimu` is easy. Enabling MPI for use in [`lomc!()`](@ref)
is done by wrapping the primary data structures as [`MPIData`](@ref Main.Rimu.RMPI.MPIData). As a consequence, data will
be stored in a distributed fashion among the MPI ranks and only communicated between ranks when
necessary. The relevant functionality is provided by the module `Rimu.RMPI`.
Using MPI parallelism with `Rimu` is easy. MPI support in [`lomc!()`](@ref) is enabled
automatically if [`PDVec`](@ref) is used to store a vector. In that case, data will
be stored in a distributed fashion among the MPI ranks and only communicated between ranks
when necessary. Additional MPI-related functionality is provided by the module [`RMPI`](@ref
Rimu.RMPI).

## References
The code implements the FCIQMC algorithm originally described in
92 changes: 92 additions & 0 deletions docs/src/mpi.md
@@ -0,0 +1,92 @@
# Working with MPI

If you are using [`PDVec`](@ref Main.DictVectors.PDVec)s to store your vectors, working with
MPI should be fairly straightforward. Generally, [`PDVec`](@ref Main.DictVectors.PDVec) will
work with MPI automatically, as long as MPI is set up correctly and a few common pitfalls
are avoided.

Rimu includes an unexported module [`RMPI`](@ref Main.Rimu.RMPI), which must be imported to access
additional MPI-related functionality.

## Configuring MPI

When running on a cluster, ensure that MPI.jl is using the system binary. See [the MPI.jl
documentation](https://juliaparallel.org/MPI.jl/latest/configuration/) for more information.

It is always a good idea to start your script with a quick test that ensures MPI is set up correctly. One way to do this is to open the script with

```julia
using Rimu.RMPI
mpi_allprintln("hello")
```

which should print something like

```
[ rank 0: hello
[ rank 1: hello
[ rank 2: hello
[ rank 3: hello
```

If it prints `rank 0` several times, the code will run, but ranks will not communicate.

## Using Slurm

When using [`PDVec`](@ref Main.DictVectors.PDVec), the recommended setup is to use threads to parallelise the
computation within a node, and to only use MPI for inter-node communication. In a slurm
script, this is done as follows:

```bash
...
#SBATCH --ntasks-per-node=1
#SBATCH --nodes=4 # replace 4 with the desired number of nodes
#SBATCH --cpus-per-task=28 # replace 28 with the number of cores available in a node
#SBATCH --hint=nomultithread # don't use hyperthreading
...

srun julia --project -tauto script.jl
```

On some clusters, additional settings must be used with `srun`, for example the CTCP cluster
requires the following.

```bash
srun --mpi=pmi2 julia --project -tauto script.jl
```

## Common pitfalls with reducing functions

### Using `@mpi_root`

Take care not to use reducing functions (such as `length`, `sum`, `norm`, ...) inside
[`@mpi_root`](@ref Main.Rimu.RMPI.@mpi_root) blocks. Doing so initiates the distributed
reduction on a single rank only, which causes the code to go out of sync and freeze. As an
example, to report the current length of a vector, calculate the length before the
[`@mpi_root`](@ref Main.Rimu.RMPI.@mpi_root) block:

```julia
len = length(pdvec)
@mpi_root println("vector length is $len")
```

## Threaded operations and reductions

When using functions that take anonymous functions, such as `map(!)`, `sum`, or `mapreduce`, it is important that the anonymous functions passed to them do not perform any MPI-reducing operations (`length`, `norm`, `sum`, etc.). These anonymous functions are executed on multiple threads, and initiating MPI communication from multiple threads may cause issues.

As an example, suppose we want to scale a vector by its length using `map!`. The correct way to write this code is:

```julia
len = length(pdvec)
map!(values(pdvec)) do x
x / len
end
```

Similar to the previous example, `len` is calculated first and not within the body of
`map!`. In this specific case, an even better option is to use the `scale!` function from
[VectorInterface.jl](https://github.com/Jutho/VectorInterface.jl):

```julia
scale!(pdvec, 1 / length(pdvec))
```
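Putting the pieces of this page together, a minimal end-to-end MPI script might look like the following sketch. The Hamiltonian, starting address, and parameters are illustrative choices, not prescriptions:

```julia
using Rimu
using Rimu.RMPI  # unexported module; provides mpi_allprintln, @mpi_root, mpi_size

# Sanity check: every rank should print its own line.
mpi_allprintln("hello")

# A PDVec is distributed across the MPI ranks automatically.
address = BoseFS(10, 5 => 10)
hamiltonian = HubbardMom1D(address; u=6.0)
dvec = PDVec(address => 1.0; style=IsDynamicSemistochastic())

df, state = lomc!(hamiltonian, dvec; laststep=1000)

# Compute reductions on all ranks, before the root-only block.
len = length(dvec)
@mpi_root println("final vector length is $len")
```

Launch such a script with `mpirun` or `srun`, e.g. `mpirun -np 4 julia --project -tauto script.jl`.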
49 changes: 49 additions & 0 deletions ext/KrylovKitExt.jl
@@ -0,0 +1,49 @@
module KrylovKitExt

using Rimu
using KrylovKit
using LinearAlgebra

const U = Union{Symbol,EigSorter}

"""
OperatorMultiplier
A struct that holds the working memory for repeatedly multiplying vectors with an operator.
"""
struct OperatorMultiplier{H,W<:PDWorkingMemory}
hamiltonian::H
working_memory::W
end
function OperatorMultiplier(hamiltonian, vector::PDVec)
return OperatorMultiplier(hamiltonian, PDWorkingMemory(vector; style=IsDeterministic()))
end

function (o::OperatorMultiplier)(v)
result = zerovector(v)
return mul!(result, o.hamiltonian, v, o.working_memory)
end

function KrylovKit.eigsolve(
ham::AbstractHamiltonian, vec::PDVec, howmany::Int=1, which::U=:LR; kwargs...
)
# Change the type of `vec` to float, if needed.
v = scale!!(vec, 1.0)
prop = OperatorMultiplier(ham, v)
return eigsolve(
prop, v, howmany, which;
ishermitian=ishermitian(ham), issymmetric=issymmetric(ham), kwargs...
)
end

# This method only exists to detect whether a Hamiltonian is Hermitian or not.
function KrylovKit.eigsolve(
ham::AbstractHamiltonian, vec::AbstractDVec, howmany::Int=1, which::U=:LR; kwargs...
)
return @invoke eigsolve(
ham::Any, vec::Any, howmany, which;
ishermitian=ishermitian(ham), issymmetric=issymmetric(ham), kwargs...
)
end

end
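For orientation, the extension defined above could be exercised roughly as in the following sketch. The model setup mirrors the benchmark file earlier in this diff; on Julia 1.9+, loading Rimu and KrylovKit together activates the extension:

```julia
using Rimu
using KrylovKit  # loading both packages activates KrylovKitExt

M = N = 10
addr = BoseFS(M, M ÷ 2 => N)
ham = HubbardMom1D(addr; u=6.0)
dv = PDVec(addr => 1.0)

# Dispatches to the PDVec method defined in the extension, which reuses
# a PDWorkingMemory across repeated applications of the Hamiltonian.
vals, vecs, info = eigsolve(ham, dv, 1, :SR; tol=1e-9)
```

The design point here is that `OperatorMultiplier` allocates the working memory once, rather than on every matrix–vector product inside the Krylov iteration.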
22 changes: 11 additions & 11 deletions scripts/BHM-example-mpi.jl
@@ -1,8 +1,8 @@
# # Example 2: Rimu with MPI

# In this example, we will demonstrate using Rimu with MPI.

# A runnable script for this example is located
# [here](https://github.com/joachimbrand/Rimu.jl/blob/develop/scripts/BHM-example-mpi.jl).
# Run it with `mpirun julia BHM-example-mpi.jl`.

Expand All @@ -16,17 +16,17 @@ using Rimu.RMPI

# First, we define the Hamiltonian. We want to start from an address with zero momentum.

address = BoseFS((0, 0, 0, 0, 10, 0, 0, 0, 0, 0))
address = BoseFS(10, 5 => 10)

# We will set the interaction strength `u` to `6`. The hopping strength `t` defaults to `1.0`.

hamiltonian = HubbardMom1D(address; u=6.0)

# Next, we construct the starting vector. Wrap a vector in `MPIData` to make it MPI
# distributed. We set the vector's style to [`IsDynamicSemistochastic`](@ref), which
# improves statistics and reduces the sign problem.
# Next, we construct the starting vector. We use a `PDVec`, which is automatically MPI
# distributed if MPI is available. We set the vector's stochastic style to
# [`IsDynamicSemistochastic`](@ref), which improves statistics and reduces the sign problem.

dvec = MPIData(DVec(address => 1.0; style=IsDynamicSemistochastic()))
dvec = PDVec(address => 1.0; style=IsDynamicSemistochastic())

# We set a reporting strategy. We will use [`ReportToFile`](@ref), which writes the reports
# directly to a file. This is useful for reducing memory use in long-running jobs, as we
Expand All @@ -43,7 +43,7 @@ s_strat = DoubleLogUpdate(targetwalkers=10_000)
post_step = ProjectedEnergy(hamiltonian, dvec)

# The `@mpi_root` macro performs an action on the root rank only, which is useful for printing.

@mpi_root println("Running FCIQMC with ", mpi_size(), " rank(s).")

# Finally, we can run the computation.
Expand All @@ -54,6 +54,6 @@ using Test #hide
@test isfile("result.arrow") #hide
dfr = load_df("result.arrow") #hide
qmcdata = last(dfr, 5000) #hide
(qmcShift,qmcShiftErr) = mean_and_se(qmcdata.shift) #hide
@test qmcShift ≈ -6.5 atol=0.5 #hide
rm("result.arrow", force=true) #hide
qmc_shift, _ = mean_and_se(qmcdata.shift) #hide
@test qmc_shift ≈ -6.5 atol=0.5 #hide
rm("result.arrow", force=true) #hide

2 comments on commit 2825da8

@joachimbrand (Collaborator)

@JuliaRegistrator

Registration pull request created: JuliaRegistries/General/95162

Tip: Release Notes

Did you know you can add release notes too? Just add markdown formatted text underneath the comment after the text
"Release notes:" and it will be added to the registry PR, and if TagBot is installed it will also be added to the
release that TagBot creates. i.e.

@JuliaRegistrator register

Release notes:

## Breaking changes

- blah

To add them here just re-invoke and the PR will be updated.

Tagging

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via:

git tag -a v0.10.0 -m "<description of version>" 2825da848bdd3c088af561d892322a6deb86bf41
git push origin v0.10.0
