Feature/hooks track operation #739

Merged
merged 82 commits into from
Dec 4, 2023
Changes from 78 commits
bf647a8
added the track_operations.py file
melodyyzh Feb 15, 2023
d330db1
added util.py file
melodyyzh Feb 15, 2023
e15f0be
modified track_operations.py
melodyyzh Feb 15, 2023
3345cb1
Merge branch 'master' into feature/hooks-TrackOperation
melodyyzh Mar 1, 2023
f18c07e
Updated code for modern signac-flow api.
klywang Mar 1, 2023
ef2751e
modified __init__.py, util.py, track_operations.py files
melodyyzh Mar 9, 2023
2787f2a
Merge branch 'feature/hooks-TrackOperation' of https://github.com/glo…
melodyyzh Mar 10, 2023
9c64821
Flake8 changes.
klywang Mar 29, 2023
c151c00
Added more tests to make sure metadata for on start matches with expe…
klywang Mar 29, 2023
a2f7e7f
Added test for success and exception hooks.
klywang Mar 29, 2023
f802f64
Merge branch 'main' into feature/hooks-TrackOperation
klywang Apr 5, 2023
bb2b8ad
Added git tracking files from execution-hooks branch.
klywang Apr 5, 2023
b36ca27
Fixed error and started work on git tests.
klywang Apr 5, 2023
d649690
Fixed git util file.
klywang Apr 13, 2023
90dcaee
Ignore tests that require git if git is not installed.
klywang Apr 13, 2023
edd8fde
Changed operation name in pytest for strict git false.
klywang Apr 13, 2023
e7fd3f8
Started strict git tests.
klywang Apr 13, 2023
f866571
Merge branch 'main' into feature/hooks-TrackOperation
klywang Apr 19, 2023
1328b5c
Updated to project.path.
klywang Apr 19, 2023
16cdcbd
Implemented git dirty tests.
klywang Apr 19, 2023
36dc228
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 19, 2023
ec374de
Updated changelog.
klywang Apr 19, 2023
f9c29a0
Merge branch 'feature/hooks-TrackOperation' of https://github.com/glo…
klywang Apr 19, 2023
362f1d8
Update changelog.txt
melodyyzh Jun 2, 2023
c602605
Update flow/hooks/git_util.py
melodyyzh Jun 2, 2023
e65215d
added doc strings and removed unused variables in test_project.py
melodyyzh Jun 19, 2023
4a084c9
Merge branch 'main' into feature/hooks-TrackOperation
melodyyzh Jun 19, 2023
fc617e0
update tests to avoid issue with python 3.8
melodyyzh Jun 22, 2023
8f87ffb
made additional edits on the docstrings
melodyyzh Jun 25, 2023
fc8d024
Update flow/hooks/git_util.py
melodyyzh Jul 5, 2023
acde532
Update flow/hooks/track_operations.py
melodyyzh Jul 5, 2023
58cc823
Fixed import issue.
klywang Jul 5, 2023
057e0e8
Initialize repository for git test.
klywang Jul 5, 2023
4883020
Test if git repository is dirty.
klywang Jul 5, 2023
252553c
Removed old unused file.
klywang Jul 5, 2023
29e1dd7
Use project instead of _project and removed unneeded old string.
klywang Jul 5, 2023
ac439fc
Removed __call__
klywang Jul 5, 2023
7658eed
Updated docstring and removed more instances of _project.
klywang Jul 5, 2023
08d89e9
Merge branch 'main' into feature/hooks-TrackOperation
klywang Jul 7, 2023
f26410f
Updated changelog.
klywang Jul 7, 2023
836dac0
Merge branch 'main' into feature/hooks-TrackOperation
b-butler Oct 11, 2023
63a08c1
test: Add GitPython to test dependencies
b-butler Oct 13, 2023
b68866c
doc: apply suggestions/corrections to documentation
b-butler Oct 18, 2023
957b295
doc: Add TrackOperations
b-butler Oct 18, 2023
af11f3f
feat: Add support for storing metadata in job document.
b-butler Oct 18, 2023
b459955
doc: Refactor documentation style for collect_metadata
b-butler Oct 18, 2023
1707bf0
fix: Prior feature to write to document.
b-butler Oct 19, 2023
cb55e29
test: Test new TrackOperations changes/features
b-butler Oct 19, 2023
9f3adfd
refactor: Simplify metadata collection logic
b-butler Oct 19, 2023
6d262f0
doc: Update collect_metadata docstring's possible todo
b-butler Oct 19, 2023
a74b05b
ci: Fix oldest dependency tests without git.
b-butler Oct 20, 2023
19962a4
test: Split TrackOperation project into git_strict and not
b-butler Oct 20, 2023
98b3261
test: Actually fix test errors without git.
b-butler Oct 23, 2023
551e456
refactor: Store git info if possible when strict_git=False
b-butler Oct 26, 2023
19d318b
refactor: change handling of erroring on dirty git
b-butler Oct 26, 2023
1843d91
test: Refactor TestHooksTrackOperationsNotStrict with more options
b-butler Oct 26, 2023
2ef3f04
test: Fix condition test (make more strict).
b-butler Oct 27, 2023
a8efc65
test: Use == on project path and remove comment
b-butler Oct 27, 2023
8888e47
doc: Correct TrackOperations docstring.
b-butler Oct 27, 2023
1d9498b
test: Fix path test for CI with detailed comment on necessity
b-butler Oct 27, 2023
f97ee59
Merge branch 'main' into feature/hooks-TrackOperation
b-butler Nov 2, 2023
1923a4c
doc: Apply Bradley's fixes
b-butler Nov 7, 2023
bad7e30
refactor: metadata schema
b-butler Nov 7, 2023
d961499
test: Require GitPython
b-butler Nov 7, 2023
31d5f8c
test: Skip check for GitHub Actions MacOS runners.
b-butler Nov 7, 2023
0e458cc
doc: Correct incomplete sentence.
b-butler Nov 7, 2023
0535e25
refactor: Switch back to filebased logging
b-butler Nov 7, 2023
4c2207c
doc: Document the schema keys in the hook.
b-butler Nov 7, 2023
6c84699
doc: Fix documentation for current code.
b-butler Nov 7, 2023
60007ad
ci: Update oldest requirements.
b-butler Nov 7, 2023
4d5aa94
doc: Fix documentation rendering.
b-butler Nov 7, 2023
8761ce0
doc: Fix nested list formatting.
b-butler Nov 7, 2023
514eda6
doc: Fix method reference.
b-butler Nov 7, 2023
d83d0e5
ci: Try to get working version of GitPython
b-butler Nov 7, 2023
387ff7b
ci (WIP): Test if latest GitPython works.
b-butler Nov 7, 2023
d8876fe
Merge branch 'main' into feature/hooks-TrackOperation
b-butler Nov 8, 2023
fc08084
Merge branch 'main' into feature/hooks-TrackOperation
b-butler Nov 13, 2023
a7bbb4a
doc: Fix typos
b-butler Nov 14, 2023
6411bd5
refactor: Change default name of file
b-butler Dec 4, 2023
815f07e
Merge branch 'main' into feature/hooks-TrackOperation
b-butler Dec 4, 2023
c759957
misc: Formatting to make pre-commit pass.
b-butler Dec 4, 2023
d35908d
test: Refactor conditional to not use pass
b-butler Dec 4, 2023
1 change: 1 addition & 0 deletions .github/workflows/ci-oldest-reqs.txt
@@ -8,3 +8,4 @@ pytest-cov==3.0.0
pytest==7.0.1
ruamel.yaml==0.17.21
tqdm==4.60.0
GitPython==3.1.37
2 changes: 2 additions & 0 deletions changelog.txt
@@ -48,6 +48,7 @@ Added
- ``test-workflow`` CLI option for testing template environments/submission scripts (#747).
- Frontier environment and template (#743).
- Added ``-o`` / ``--operation`` flag to report project status information for specific operations (#725).
- Added builtin `TrackOperations` execution hooks (#739).

Changed
+++++++
@@ -89,6 +90,7 @@ Added
+++++

- Added the OLCF Crusher environment (#708).
- Added `TrackOperations` as a built in hook (#739).

Changed
+++++++
6 changes: 6 additions & 0 deletions doc/api.rst
@@ -160,6 +160,12 @@ Aggregation

.. autofunction:: flow.get_aggregate_id

Hooks
-----

.. autoclass:: flow.hooks.TrackOperations
    :members:

Compute Environments
--------------------

5 changes: 3 additions & 2 deletions flow/hooks/__init__.py
@@ -1,7 +1,8 @@
-# Copyright (c) 2018 The Regents of the University of Michigan
+# Copyright (c) 2023 The Regents of the University of Michigan
 # All rights reserved.
 # This software is licensed under the BSD 3-Clause License.
 """Operation hooks."""
 from .hooks import _Hooks
+from .track_operations import TrackOperations
 
-__all__ = ["_Hooks"]
+__all__ = ["_Hooks", "TrackOperations"]
15 changes: 15 additions & 0 deletions flow/hooks/git_util.py
@@ -0,0 +1,15 @@
# Copyright (c) 2023 The Regents of the University of Michigan
# All rights reserved.
# This software is licensed under the BSD 3-Clause License.
"""Define a function to collect metadata with git."""
import git


def collect_git_metadata(job):
    """Collect git metadata for a given workspace.

    The information includes the commit ID and a flag indicating if the
    repository is dirty (has uncommitted changes).
    """
    repo = git.Repo(job.project.path)
    return {"commit_id": str(repo.commit()), "dirty": repo.is_dirty()}
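The dictionary returned by `collect_git_metadata` carries a `dirty` flag, and the `TrackOperations` hook in this PR refuses to log when `strict_git=True` and that flag is set. The sketch below illustrates that gating on plain dictionaries, without importing GitPython. The helper name `require_clean` and the sample commit ID are hypothetical and are not part of the PR.

```python
def require_clean(git_metadata, strict_git=True):
    """Raise if strict_git is set and the repository has uncommitted changes.

    `require_clean` is a hypothetical helper used only for illustration.
    """
    if strict_git and git_metadata["dirty"]:
        raise RuntimeError(
            "Repository is dirty at commit {}; commit your changes or "
            "pass strict_git=False.".format(git_metadata["commit_id"])
        )
    return git_metadata


# Hand-written stand-ins for what collect_git_metadata might return.
clean = {"commit_id": "0123abc", "dirty": False}
dirty = {"commit_id": "0123abc", "dirty": True}

require_clean(clean)                    # passes: nothing uncommitted
require_clean(dirty, strict_git=False)  # passes: strictness disabled

try:
    require_clean(dirty)                # strict and dirty: raises
except RuntimeError as error:
    print(error)
```

Keeping the check separate from the collection step means the metadata can still be recorded when strictness is turned off.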
205 changes: 205 additions & 0 deletions flow/hooks/track_operations.py
@@ -0,0 +1,205 @@
# Copyright (c) 2023 The Regents of the University of Michigan
# All rights reserved.
# This software is licensed under the BSD 3-Clause License.
"""Built in execution hook for basic tracking."""
import json

from .util import collect_metadata

try:
    from .git_util import collect_git_metadata
except ImportError:
    GIT = False
else:
    GIT = True


_DEFAULT_FILENAME = "signac-execution-history.log"


class TrackOperations:
    """:class:`~.TrackOperations` tracks information about the execution of operations to a logfile.

    This hook provides information on the start, successful completion, and/or error of
    one or more operations in a :class:`~.FlowProject` instance. The logs are stored in the file
    given by ``log_filename`` within the job's path. The file will be appended to if it already
    exists.

    The hook stores metadata regarding the execution of the operation and the state of the
    project at the time of execution, error, and/or completion. The data will also include
    information about the git status if the project is detected as a git repo and
    ``GitPython`` is installed in the environment.

    Each call to the hook adds a single JSON line to the log file. These can be
    read using the ``json`` standard library module or :meth:`~.TrackOperations.read_log`.

    The current schema has the following structure:

    - ``time``: The time of querying the metadata.
    - ``stage``: Whether the hook was executed "prior" to or "after" the associated
      operation's execution.
    - ``error``: The error message from executing the operation, if any.
    - ``project``

      - ``path``: Filepath to the project.
      - ``schema_version``: The project's schema version.
      - ``git``: Present only when git metadata was collected.

        - ``commit_id``: The current commit of the project's git repo.
        - ``dirty``: Whether the project's repo has uncommitted changes or not.

    - ``operation``: The operation name.
    - ``job_id``: The job id.
    - ``_schema_version``: The metadata storage's schema version. Schema is currently in version 1.


    Warning
    -------
    This class will raise an exception when ``strict_git`` is set to ``True`` and either
    ``GitPython`` is not available or the repository contains uncommitted changes (i.e. is "dirty").

    Examples
    --------
    The following example will install :class:`~.TrackOperations` at the operation level.

    .. code-block:: python

        from flow import FlowProject
        from flow.hooks import TrackOperations


        class Project(FlowProject):
            pass


        track = TrackOperations()


        @track.install_operation_hooks(Project)
        @Project.operation
        def foo(job):
            pass


    The code block below provides an example of how to install :class:`~.TrackOperations` on an
    instance of :class:`~.FlowProject`.

    .. code-block:: python

        from flow import FlowProject
        from flow.hooks import TrackOperations


        class Project(FlowProject):
            pass


        if __name__ == "__main__":
            project = Project()
            project = TrackOperations().install_project_hooks(project)
            project.main()


    Parameters
    ----------
    log_filename : str, optional
        The name of the log file in the job workspace. Defaults to "signac-execution-history.log".
    strict_git : bool, optional
        Whether to fail if ``GitPython`` cannot be imported or if there are uncommitted changes
        to the project's git repo. Defaults to ``True``.
    """

    def __init__(self, log_filename=_DEFAULT_FILENAME, strict_git=True):
        self.log_filename = log_filename
        if strict_git and not GIT:
            raise RuntimeError(
                "Unable to collect git metadata from the repository, "
                "because the GitPython package is not installed.\n\n"
                "You can use '{}(strict_git=False)' to ignore this "
                "error.".format(type(self).__name__)
            )
        self.strict_git = strict_git

    def _write_metadata(self, job, metadata):
        with open(job.fn(self.log_filename), "a") as logfile:
            logfile.write(json.dumps(metadata) + "\n")

    def _get_metadata(self, operation, job, stage, error=None):
        """Collect metadata on the operation, the job, and (if available) the git repository."""
        # Add execution-related information to metadata.
        metadata = {"stage": stage, "error": None if error is None else str(error)}
        metadata.update(collect_metadata(operation, job))
        if GIT:
            git_metadata = collect_git_metadata(job)
            if self.strict_git and git_metadata["dirty"]:
                raise RuntimeError(
                    "Unable to reliably log operation, because the git repository in "
                    "the project root directory is dirty.\n\nMake sure to commit all "
                    "changes or ignore this warning by setting '{}(strict_git=False)'.".format(
                        type(self).__name__
                    )
                )
            metadata["project"]["git"] = git_metadata
        return metadata

    def on_start(self, operation, job):
        """Track the start of execution of an operation on a job."""
        self._write_metadata(job, self._get_metadata(operation, job, stage="prior"))

    def on_success(self, operation, job):
        """Track the successful completion of an operation on a job."""
        self._write_metadata(job, self._get_metadata(operation, job, stage="after"))

    def on_exception(self, operation, error, job):
        """Log errors raised during the execution of an operation on a job."""
        self._write_metadata(
            job, self._get_metadata(operation, job, stage="after", error=error)
        )

    def install_operation_hooks(self, op, project_cls=None):
        """Decorate operation to track execution.

        Parameters
        ----------
        op : function or type
            An operation function to log or a subclass of :class:`~.FlowProject` if
            ``project_cls`` is ``None``.
        project_cls : type
            A subclass of :class:`~.FlowProject`.
        """
        if project_cls is None:
            return lambda func: self.install_operation_hooks(func, op)
        project_cls.operation_hooks.on_start(self.on_start)(op)
        project_cls.operation_hooks.on_success(self.on_success)(op)
        project_cls.operation_hooks.on_exception(self.on_exception)(op)
        return op

    def install_project_hooks(self, project):
        """Install hooks to track all operations in a `flow.FlowProject`.

        Parameters
        ----------
        project : flow.FlowProject
            The project to install hooks on.
        """
        project.project_hooks.on_start.append(self.on_start)
        project.project_hooks.on_success.append(self.on_success)
        project.project_hooks.on_exception.append(self.on_exception)
        return project

    @classmethod
    def read_log(cls, job, log_filename=_DEFAULT_FILENAME):
        """Return the execution log data as a list of dictionaries.

        Parameters
        ----------
        job : signac.job.Job
            The job to read the execution history of.
        log_filename : str, optional
            The name of the log file in the job workspace. Defaults to
            "signac-execution-history.log".

        Returns
        -------
        log : list[dict[str, Any]]
            Returns the job's current execution history for logged operations.
        """
        with open(job.fn(log_filename)) as fh:
            # Lines keep their trailing newline, so strip before testing for emptiness.
            return [json.loads(line) for line in fh if line.strip()]
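Because the hook writes one JSON document per line, the log can also be inspected without signac-flow. The following is a minimal standard-library sketch, assuming entries that follow the schema described in the class docstring. The helper names `read_json_lines` and `failed_operations` and the sample entries are hypothetical, and the temporary directory merely stands in for a job workspace.

```python
import json
import os
import tempfile


def read_json_lines(path):
    """Return one dictionary per non-blank line of a JSON-lines file."""
    with open(path) as fh:
        return [json.loads(line) for line in fh if line.strip()]


def failed_operations(entries):
    """Return the names of operations whose log entry recorded an error."""
    return [entry["operation"] for entry in entries if entry["error"] is not None]


# Hand-written sample entries that follow the documented schema (subset of keys).
sample = [
    {"stage": "prior", "error": None, "operation": "base", "job_id": "abc"},
    {"stage": "after", "error": "boom", "operation": "base", "job_id": "abc"},
    {"stage": "prior", "error": None, "operation": "cmd", "job_id": "abc"},
    {"stage": "after", "error": None, "operation": "cmd", "job_id": "abc"},
]

with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "signac-execution-history.log")
    # Append mode mirrors how the hook adds to an existing log.
    with open(log_path, "a") as logfile:
        for entry in sample:
            logfile.write(json.dumps(entry) + "\n")
        logfile.write("\n")  # a stray blank line should be skipped on read
    entries = read_json_lines(log_path)

print(failed_operations(entries))
```

Filtering on `error is not None` relies on the hook storing `None` for successful stages, as `_get_metadata` does in this diff.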
25 changes: 25 additions & 0 deletions flow/hooks/util.py
@@ -0,0 +1,25 @@
# Copyright (c) 2023 The Regents of the University of Michigan
# All rights reserved.
# This software is licensed under the BSD 3-Clause License.
"""Define a function to collect metadata on the operation and job."""
from datetime import datetime, timezone


def collect_metadata(operation, job):
    """Collect metadata related to the operation and job.

    Returns a dictionary including schema version, time, project, operation, and job id.

    """
    return {
        # the metadata schema version:
        "_schema_version": "1",
        "time": datetime.now(timezone.utc).isoformat(),
        "project": {
            "path": job.project.path,
            # the project schema version:
            "schema_version": job.project.config.get("schema_version"),
        },
        "operation": operation,
        "job_id": job.id,
    }
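Since `time` is stored as a timezone-aware ISO 8601 string, a "prior" record and its matching "after" record can be compared to estimate how long an operation ran. The sketch below is only an illustration under that assumption. `make_record` and `elapsed_seconds` are hypothetical helpers, the records contain only a subset of the logged keys, and the timestamps are made up.

```python
import json
from datetime import datetime, timedelta, timezone


def make_record(operation, job_id, stage, when):
    """Build a record with a subset of the keys found in a logged entry."""
    return {
        "_schema_version": "1",
        "time": when.isoformat(),
        "stage": stage,
        "operation": operation,
        "job_id": job_id,
    }


def elapsed_seconds(prior, after):
    """Return the seconds between a 'prior' record and its 'after' record."""
    start = datetime.fromisoformat(prior["time"])
    end = datetime.fromisoformat(after["time"])
    return (end - start).total_seconds()


# Made-up timestamps; the real hook uses datetime.now(timezone.utc).
started = datetime(2023, 12, 4, 12, 0, 0, tzinfo=timezone.utc)
prior = make_record("base", "abc123", "prior", started)
after = make_record("base", "abc123", "after", started + timedelta(seconds=90))

# Each record survives a round trip through a single JSON line.
line = json.dumps(prior)
restored = json.loads(line)

print(elapsed_seconds(prior, after))
```

Storing the time in UTC keeps such differences meaningful even when operations run on machines in different time zones.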
1 change: 1 addition & 0 deletions requirements/requirements-test.txt
@@ -1,5 +1,6 @@
click==8.1.7
coverage==7.3.2
GitPython==3.1.37
pytest-cov==4.1.0
pytest==7.4.3
ruamel.yaml==0.18.3
34 changes: 34 additions & 0 deletions tests/define_hooks_track_operations_project.py
@@ -0,0 +1,34 @@
from define_hooks_test_project import HOOKS_ERROR_MESSAGE

from flow import FlowProject
from flow.hooks import TrackOperations


class _HooksTrackOperations(FlowProject):
    pass


LOG_FILENAME = "signac-execution-history.log"


track_operations = TrackOperations(strict_git=False)


@track_operations.install_operation_hooks(_HooksTrackOperations)
@_HooksTrackOperations.operation
def base(job):
    if job.sp.raise_exception:
        raise RuntimeError(HOOKS_ERROR_MESSAGE)


@track_operations.install_operation_hooks(_HooksTrackOperations)
@_HooksTrackOperations.operation(cmd=True, with_job=True)
def cmd(job):
    if job.sp.raise_exception:
        return "exit 42"
    else:
        return "touch base_cmd.txt"


if __name__ == "__main__":
    _HooksTrackOperations().main()
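The decorator form used above, `@track_operations.install_operation_hooks(_HooksTrackOperations)`, works because the method returns a decorator when it receives only the project class. The toy below sketches that two-step dispatch in isolation. `ToyHooks`, its `install` method, and the list-based `registry` are hypothetical stand-ins and do not reproduce signac-flow's hook registration.

```python
class ToyHooks:
    """Hypothetical stand-in for a hook object with a two-step install method."""

    def __init__(self):
        self.installed = []

    def install(self, op, registry=None):
        # Called with one argument: `op` is really the registry, so return a
        # decorator that calls back with both arguments filled in.
        if registry is None:
            return lambda func: self.install(func, op)
        registry.append(op.__name__)
        self.installed.append(op.__name__)
        # Return the function unchanged so decorators can be stacked.
        return op


registry = []
hooks = ToyHooks()


@hooks.install(registry)
def foo():
    return "foo ran"


def bar():
    return "bar ran"


# The direct two-argument form is equivalent to the decorator form.
hooks.install(bar, registry)

print(registry)
```

Returning the original function unchanged is what lets the hook decorator sit on top of the operation decorator without altering the operation itself.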
34 changes: 34 additions & 0 deletions tests/define_hooks_track_operations_strict_project.py
@@ -0,0 +1,34 @@
from define_hooks_test_project import HOOKS_ERROR_MESSAGE

from flow import FlowProject
from flow.hooks import TrackOperations


class _HooksTrackOperations(FlowProject):
    pass


LOG_FILENAME = "operations.log"


track_operations = TrackOperations()


@track_operations.install_operation_hooks(_HooksTrackOperations)
@_HooksTrackOperations.operation
def base(job):
    if job.sp.raise_exception:
        raise RuntimeError(HOOKS_ERROR_MESSAGE)


@track_operations.install_operation_hooks(_HooksTrackOperations)
@_HooksTrackOperations.operation(cmd=True, with_job=True)
def cmd(job):
    if job.sp.raise_exception:
        return "exit 42"
    else:
        return "touch base_cmd.txt"


if __name__ == "__main__":
    _HooksTrackOperations().main()