Automatic molecule labels from MoleculeFinder #392

Merged
ChiCheng45 merged 20 commits into protos from minimum-effort-atom-labels on Mar 28, 2024

Conversation

MBartkowiakSTFC
Collaborator

Description of work
Some analysis types (AreaPerMolecule, AngularCorrelation) require the molecules in the system to have labels in order to be selected. However, not every MD engine assigns labels to molecules.

Fixes

  1. MoleculeFinder now assigns InChI strings to molecules it detects.
  2. AngularCorrelation now automatically sets axis to be from the first atom of the molecule to the molecule centre of mass.
  3. AreaPerMolecule now selects unit cell vectors defining the wall surface area as one of the three possibilities: 'ab', 'bc' or 'ac'.
  4. New widgets have been added (MultipleCombosWidget, VectorWidget) to allow running OrderParameter in the future.
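
As an illustration of item 3, the wall area spanned by a pair of unit cell vectors is the magnitude of their cross product. The sketch below is not the MDANSE implementation: the function names, the dictionary layout of the cell, and the simple division by the number of molecules are assumptions made for this example.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def face_area(cell, axes):
    """Area of the cell face spanned by two cell vectors.

    cell -- dict mapping 'a', 'b', 'c' to 3-vectors (hypothetical layout)
    axes -- one of 'ab', 'bc', 'ac'
    """
    if axes not in ("ab", "bc", "ac"):
        raise ValueError(f"axes must be 'ab', 'bc' or 'ac', got {axes!r}")
    normal = cross(cell[axes[0]], cell[axes[1]])
    return math.sqrt(sum(x * x for x in normal))


def area_per_molecule(cell, axes, n_molecules):
    """Face area divided by the number of molecules (assumed definition)."""
    if n_molecules <= 0:
        raise ValueError("n_molecules must be positive")
    return face_area(cell, axes) / n_molecules


# Orthorhombic example cell with edge lengths 3, 4 and 5.
example_cell = {"a": (3.0, 0.0, 0.0), "b": (0.0, 4.0, 0.0), "c": (0.0, 0.0, 5.0)}
```

Because the cross product is used, the same function also handles non-orthogonal cells, where the face area is smaller than the product of the two edge lengths.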

To test

  1. Run AngularCorrelation and AreaPerMolecule on a DL_POLY trajectory with labeled molecules.
  2. Create labels in a trajectory from some other engine (e.g. CP2K) by running MoleculeFinder on it.
  3. Run AngularCorrelation and AreaPerMolecule on the new trajectory.

@ChiCheng45
Collaborator

Looks good, but I have a few points.

The angular correlation results can depend on whether the trajectory had previously been modified with MoleculeFinder. This appears to be because in some cases the atom coordinates of a molecule are not contiguous, so the results differ. Maybe the atom positions should be made contiguous during the calculation?

Would it be a good idea to set the axis in the angular correlation to something physical? At the moment the results depend on the atom ordering. Maybe we could change it to the molecule's axes of rotation, and we'd then have angular correlation results for the three molecular axes.
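
To make the atom-ordering concern concrete, here is a minimal sketch of an axis defined from the first atom to the centre of mass. It is an illustration only, not the code in this PR: the function names and the water-like coordinates are made up for the example.

```python
import math


def centre_of_mass(positions, masses):
    """Mass-weighted average position of a set of atoms."""
    total = sum(masses)
    return tuple(
        sum(m * p[i] for p, m in zip(positions, masses)) / total
        for i in range(3)
    )


def first_atom_axis(positions, masses):
    """Unit vector pointing from the first atom to the centre of mass."""
    com = centre_of_mass(positions, masses)
    vec = tuple(c - x for c, x in zip(com, positions[0]))
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        raise ValueError("first atom coincides with the centre of mass")
    return tuple(v / norm for v in vec)


# Water-like toy molecule: one heavy atom and two light atoms.
positions = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (-1.0, 1.0, 0.0)]
masses = [16.0, 1.0, 1.0]

# Heavy atom listed first: the axis lies along +y.
axis_heavy_first = first_atom_axis(positions, masses)

# Same molecule with a light atom listed first: a different axis.
reordered_positions = [positions[1], positions[0], positions[2]]
reordered_masses = [masses[1], masses[0], masses[2]]
axis_light_first = first_atom_axis(reordered_positions, reordered_masses)
```

The two axes differ although the molecule is identical, which is why an axis derived from the molecular geometry itself would be more robust.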

MoleculeFinder can crash when it is used with some bulk systems, such as the P1 CP2K system from MDANSE-Examples. I got the following error.

Traceback (most recent call last):
  File "C:\Users\xcb63893\PycharmProjects\MDANSE\MDANSE\Src\MDANSE\Framework\Jobs\IJob.py", line 313, in run
    self.initialize()
  File "C:\Users\xcb63893\PycharmProjects\MDANSE\MDANSE\Src\MDANSE\Framework\Jobs\MoleculeFinder.py", line 82, in initialize
    inchistring = moltester.identify_molecule()
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\xcb63893\PycharmProjects\MDANSE\MDANSE\Src\MDANSE\Chemistry\Structrures.py", line 55, in identify_molecule
    rdDetermineBonds.DetermineBonds(mol_object, charge=0)
Boost.Python.ArgumentError: Python argument types in
    rdkit.Chem.rdDetermineBonds.DetermineBonds(NoneType)
did not match C++ signature:
    DetermineBonds(class RDKit::ROMol {lvalue} mol, bool useHueckel=False, int charge=0, double covFactor=1.3, bool allowChargedFragments=True, bool embedChiral=True, bool useAtomMap=False, bool useVdw=False)  

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\xcb63893\AppData\Local\anaconda3\envs\MDANSE5\Lib\multiprocessing\process.py", line 314, in _bootstrap
    self.run()
  File "C:\Users\xcb63893\PycharmProjects\MDANSE\MDANSE_GUI\Src\MDANSE_GUI\Subprocess\Subprocess.py", line 46, in run
    self._job_instance.run(self._job_parameters)
  File "C:\Users\xcb63893\PycharmProjects\MDANSE\MDANSE\Src\MDANSE\Framework\Jobs\IJob.py", line 335, in run
    raise JobError(self, tb)
MDANSE.Framework.Jobs.IJob.JobError: Traceback (most recent call last):
  File "C:\Users\xcb63893\PycharmProjects\MDANSE\MDANSE\Src\MDANSE\Framework\Jobs\IJob.py", line 313, in run
    self.initialize()
  File "C:\Users\xcb63893\PycharmProjects\MDANSE\MDANSE\Src\MDANSE\Framework\Jobs\MoleculeFinder.py", line 82, in initialize
    inchistring = moltester.identify_molecule()
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\xcb63893\PycharmProjects\MDANSE\MDANSE\Src\MDANSE\Chemistry\Structrures.py", line 55, in identify_molecule
    rdDetermineBonds.DetermineBonds(mol_object, charge=0)
Boost.Python.ArgumentError: Python argument types in
    rdkit.Chem.rdDetermineBonds.DetermineBonds(NoneType)
did not match C++ signature:
    DetermineBonds(class RDKit::ROMol {lvalue} mol, bool useHueckel=False, int charge=0, double covFactor=1.3, bool allowChargedFragments=True, bool embedChiral=True, bool useAtomMap=False, bool useVdw=False)  

@MBartkowiakSTFC MBartkowiakSTFC marked this pull request as draft March 27, 2024 09:32
@MBartkowiakSTFC MBartkowiakSTFC marked this pull request as ready for review March 28, 2024 08:44
@MBartkowiakSTFC
Collaborator Author

I managed to address two of the points you raised:

  1. Extra checks have been added to stop MoleculeFinder from crashing on trickier trajectories. The CP2K P1 trajectory works now.
  2. AngularCorrelation now always uses contiguous coordinates. The current implementation is slow, but the results should now be independent of the way the trajectory was converted (i.e. with and without coordinate folding).
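
For readers unfamiliar with the idea in item 2, the following sketch shows one common way to make a molecule contiguous: shift every atom to its periodic image nearest to a reference atom. This is not the implementation in this PR. It assumes an orthorhombic box and a molecule smaller than half the box in every direction, and the function name is hypothetical.

```python
def make_contiguous(positions, box):
    """Return positions with every atom moved to the periodic image
    closest to the first atom (minimum image convention).

    positions -- list of (x, y, z) tuples for one molecule
    box       -- (Lx, Ly, Lz) edge lengths of an orthorhombic cell (assumed)
    """
    reference = positions[0]
    result = [tuple(reference)]
    for pos in positions[1:]:
        shifted = []
        for x, ref, length in zip(pos, reference, box):
            delta = x - ref
            # Remove whole box lengths so |delta| is at most length / 2.
            delta -= length * round(delta / length)
            shifted.append(ref + delta)
        result.append(tuple(shifted))
    return result


# Two bonded atoms split across the periodic boundary along x.
folded = [(9.5, 5.0, 5.0), (0.5, 5.0, 5.0)]
unfolded = make_contiguous(folded, (10.0, 10.0, 10.0))
```

After this step the second atom sits at x = 10.5, one unit away from the first, so quantities such as the centre of mass no longer depend on whether the trajectory was folded into the cell.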

Also, I improved the docstring of MoleculeFinder.

@ChiCheng45
Collaborator

Looks good, works on my end. I've opened a proposal for some enhancements to molecule-related jobs in #399, which includes the axis proposal from my comment above.

@ChiCheng45 ChiCheng45 merged commit 8b74b3e into protos Mar 28, 2024
84 checks passed
@MBartkowiakSTFC MBartkowiakSTFC deleted the minimum-effort-atom-labels branch July 9, 2024 12:51