Make major improvements to documentation

Add more sections, design guide, detailed tutorials.
michellab · Jun 16, 2024 · fd62994
1 parent 6543b70
commit fd62994

Showing 26 changed files with 405 additions and 1,142 deletions.
diff --git a/a3fe/run/_simulation_runner.py b/a3fe/run/_simulation_runner.py
@@ -745,8 +745,8 @@ def _refresh_logging(self) -> None:
             sub_sim_runner._refresh_logging()
 
     def recursively_get_attr(self, attr: str) -> _Dict[SimulationRunner, _Any]:
-        f"""
-        Get the values of the attribute for the {self.__class__.__name__} and any sub-simulation runners.
+        """
+        Get the values of the attribute for the simulation runner and any sub-simulation runners.
         If the attribute is not present for a sub-simulation runner, None is returned.
 
         Parameters
@@ -757,7 +757,7 @@ def recursively_get_attr(self, attr: str) -> _Dict[SimulationRunner, _Any]:
         Returns
         -------
         attr_values : Dict[SimulationRunner, Any]
-            A dictionary of the attribute values for the {self.__class__.__name__} and any sub-simulation runners.
+            A dictionary of the attribute values for the simulation runner and any sub-simulation runners.
         """
         attrs_dict = {}
         attrs_dict[attr] = getattr(self, attr, None)
@@ -773,8 +773,8 @@ def recursively_get_attr(self, attr: str) -> _Dict[SimulationRunner, _Any]:
     def recursively_set_attr(
         self, attr: str, value: _Any, force: bool = False, silent: bool = False
     ) -> None:
-        f"""
-        Set the attribute to the value for the {self.__class__.__name__} and any sub-simulation runners.
+        """
+        Set the attribute to the value for the simulation runner and any sub-simulation runners.
 
         Parameters
         ----------
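
Note on the docstring change above: Python only stores a plain string literal as a docstring. An f-string in the same position is an ordinary expression that is evaluated and discarded, so `__doc__` stays `None` and tools that read `__doc__` (such as `help()`, and presumably the autosummary pages listed in docs/api.rst) see no documentation. The sketch below illustrates this with two hypothetical stand-in classes, `WithFString` and `WithPlainString`; neither is part of a3fe.

```python
class WithFString:
    """Hypothetical stand-in for the code before this commit."""

    def recursively_get_attr(self, attr):
        # An f-string is an expression, not a string literal, so it is
        # evaluated at call time and thrown away; it never becomes __doc__.
        f"""Get the attribute for the {self.__class__.__name__}."""
        return getattr(self, attr, None)


class WithPlainString:
    """Hypothetical stand-in for the code after this commit."""

    def recursively_get_attr(self, attr):
        """Get the attribute for the simulation runner."""
        return getattr(self, attr, None)


fstring_doc = WithFString.recursively_get_attr.__doc__
plain_doc = WithPlainString.recursively_get_attr.__doc__

print(fstring_doc)  # no docstring was stored
print(plain_doc)    # the plain literal was stored
```

The cost of the fix is that the docstring can no longer name the concrete subclass, which is why the text now says "the simulation runner".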

diff --git a/docs/a3fe_design.rst b/docs/a3fe_design.rst
@@ -0,0 +1,92 @@
+a3fe Design
+============
+
+Software Design
+****************
+
+a3fe stores and manipulates simulations based on a hierarchy of "simulation runners". Each simulation runner
+is responsible for manipulating a set of "sub simulation runners". For example, :class:`a3fe.Calculation` instances hold and
+manipulate two :class:`a3fe.Leg` instances (bound and free), and :class:`a3fe.Leg` objects hold and manipulate :class:`a3fe.Stage` instances.
+The :class:`a3fe.Stage` objects for the first leg of a Calculation instance ``calc`` can be accessed with ``calc.legs[0].stages`` (whether the leg is bound
+or free can be queried with ``calc.legs[0].leg_type``). All simulation runners are derived from the abstract base class :class:`a3fe.run._simulation_runner.SimulationRunner`,
+where as much of the functionality as possible is defined. As a result, all simulation runners have similar interfaces, e.g. they all share the ``run()`` and ``kill()`` methods.
+The hierarchy of simulation runners, from highest to lowest level, is:
+
+- :class:`a3fe.CalcSet`
+- :class:`a3fe.Calculation`
+- :class:`a3fe.Leg`
+- :class:`a3fe.Stage`
+- :class:`a3fe.LamWindow`
+- :class:`a3fe.Simulation`
+
+Calling ``calc.kill()``, for example, recursively kills all sub simulation runners and their sub simulation runners. You can recursively set
+or get attributes for all sub simulation runners in the hierarchy with e.g. ``calc.recursively_set_attr("_equilibrated", True)`` or
+``calc.recursively_get_attr("_equilibrated")``.
+
+Each simulation runner logs to a file in its base directory named according to the class name (e.g. ``Calculation.log``). In addition,
+each simulation runner saves a pickled version of itself to a file named in the same way (e.g. ``Calculation.pkl``), which
+allows it to be restarted at any point. The pickle files are automatically detected and used to load the simulation
+runners when they are present in the base directory. For example, running ``calc = a3.Calculation()`` in the base directory of
+a pre-prepared calculation will load the previous calculation, overriding any arguments supplied to ``Calculation()``.
+The current state of a simulation runner can be written to the pickle file with the ``save()`` method, e.g. ``calc.save()``.
+
+Algorithms
+***********
+
+a3fe aims to run ABFE as efficiently as possible, while generating robust estimates of uncertainty. To this end, it runs a
+user-specified number of replicate simulations, which is set when the simulation runner is created, e.g.
+``calc = a3.Calculation(ensemble_size=5)`` (the default is 5). Running several replicates in parallel allows:
+
+- A reasonably robust estimate of the uncertainty from the inter-run differences
+- More robust ensemble-based adaptive equilibration detection
+
+a3fe implements algorithms for:
+
+- Automatic determination of the optimal lambda spacing (e.g. ``calc.get_optimal_lam_vals()``)
+- Adaptive allocation of simulation time to minimise the inter-run uncertainty (e.g. ``calc.run(adaptive=True)``)
+- Adaptive equilibration detection (see :func:`a3fe.analyse.detect_equil.check_equil_multiwindow_paired_t`, used when ``calc.run(adaptive=True)`` is specified)
+
+For more details of the algorithms, please see ADD_PREPRINT_LINK.
+
+Some Notes on the Implementation
+*********************************
+
+a3fe is designed to be easily adaptable to any SLURM cluster. The SLURM submission settings can be tailored by modifying
+the header of ``run_somd.sh`` in the input directory.
+
+If the input is not parameterised, a3fe will parameterise your input with ff14SB, OFF 2.0.0, and TIP3P by default. See 
+:ref:`preparing input<preparing-input>`. By default, a3fe will solvate your system in a rhombic dodecahedral box with 150 mM NaCl
+and perform a standard minimisation, heating, and pre-equilibration routine.
+
+At present, a3fe uses GROMACS to run all set-up jobs, so please ensure that you have loaded the required CUDA and
+GROMACS modules, or sourced GMXRC. These GROMACS jobs are also submitted through SLURM, and a unique 5 ns "ensemble
+equilibration" simulation is run for each of the ``ensemble_size`` repeats. For the bound leg, these are used to extract
+different Boresch restraints for each replicate simulation using the in-built BioSimSpace algorithm (see
+`the BioSimSpace restraint selection code <https://github.com/fjclark/BioSimSpace/blob/01dba53b01386a3851e277874f9080c316c4632e/python/BioSimSpace/Sandpit/Exscientia/FreeEnergy/_restraint_search.py#L902>`_).
+This fits force constants of the Boresch restraints according to the fluctuations observed during the fitting simulations, and scores candidate restraints according
+to how severely they restrict the configurational space accessible to the ligand (more restriction is better as it indicates that the restraints are mimicking a
+stronger native interaction).
+
+a3fe can use a default spacing of lambda windows which should work reasonably for most systems with the default SOMD
+settings. However, to optimise the lambda schedule by running short (100 ps default) simulations and generating a new spacing
+according to the integrated variance of the gradients, run ``calc.get_optimal_lam_vals()``.
+
+One weakness of a3fe is that the molecular dynamics engine used for production simulations (SOMD) does not support enhanced sampling; in particular, Hamiltonian
+replica exchange (HREX) is not available. However, this does mean that all individual SOMD simulations are independent and can be run in parallel.
+
+Units
+******
+
++-------------------+----------+
+| Quantity          | Unit     |
++===================+==========+
+| Simulation Time   | ns       |
++-------------------+----------+
+| Computer Time     | hr       |
++-------------------+----------+
+| Energy            | kcal/mol |
++-------------------+----------+
+
+Note that the run-time specified for a calculation is per window, per replicate. For example, if you specify
+``calc.run(adaptive=False, runtime=1)`` and ``calc.ensemble_size == 5``, then the total run-time for each window will be 5 ns. However,
+when you query the total simulation time with ``calc.tot_simtime``, this is the cumulative total for every simulation in the calculation.
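
Note on the run-time convention described at the end of docs/a3fe_design.rst: the arithmetic can be sketched as below. This is plain Python and does not call a3fe. The helper names and the `n_windows` argument (the total number of lambda windows across all legs and stages) are hypothetical, and the cumulative figure assumes a non-adaptive run with no simulation time accumulated beforehand.

```python
def per_window_simtime(runtime_ns: float, ensemble_size: int) -> float:
    """Simulation time for one lambda window, summed over its replicates (ns)."""
    return runtime_ns * ensemble_size


def cumulative_simtime(runtime_ns: float, ensemble_size: int, n_windows: int) -> float:
    """Simulation time summed over every replicate of every window (ns).

    Assumes each simulation runs for exactly runtime_ns (non-adaptive run).
    """
    return per_window_simtime(runtime_ns, ensemble_size) * n_windows


# The example from the text: runtime=1 ns with ensemble_size == 5.
window_total = per_window_simtime(1.0, 5)

# With a hypothetical 20 lambda windows in the whole calculation.
calc_total = cumulative_simtime(1.0, 5, n_windows=20)

print(window_total, calc_total)
```

Under these assumptions, a 1 ns request corresponds to 5 ns per window, and the cumulative total scales with however many windows the calculation actually contains.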
diff --git a/docs/api.rst b/docs/api.rst
@@ -4,11 +4,24 @@ API Documentation
 .. autosummary::
    :toctree: autosummary
 
-   a3fe
-   a3fe.Calculation
-   a3fe.Leg
-   a3fe.Stage
-   a3fe.LamWindow
-   a3fe.Simulation
-   a3fe.run._simulation_runner.SimulationRunner
-   a3fe.enums
+   a3fe.run
+   a3fe.run._simulation_runner
+   a3fe.run.CalcSet
+   a3fe.run.Calculation
+   a3fe.run.Leg
+   a3fe.run.Stage
+   a3fe.run.LamWindow
+   a3fe.run.Simulation
+   a3fe.run.enums
+   a3fe.run.system_prep
+   a3fe.run._virtual_queue
+   a3fe.run._utils
+
+   a3fe.analyse
+   a3fe.analyse.analyse_set
+   a3fe.analyse.compare
+   a3fe.analyse.detect_equil
+   a3fe.analyse.mbar
+   a3fe.analyse.plot
+   a3fe.analyse.rmsd
+   a3fe.analyse.waters
diff --git a/docs/autosummary/EnsEquil.Calculation.rst b/docs/autosummary/EnsEquil.Calculation.rst
diff --git a/docs/autosummary/EnsEquil.LamWindow.rst b/docs/autosummary/EnsEquil.LamWindow.rst
diff --git a/docs/autosummary/EnsEquil.Leg.rst b/docs/autosummary/EnsEquil.Leg.rst
diff --git a/docs/autosummary/EnsEquil.LegType.rst b/docs/autosummary/EnsEquil.LegType.rst