feat: Add MetricWriter class #107

Merged
merged 1 commit into from
Oct 17, 2024

Conversation

msto
Contributor

@msto msto commented May 5, 2024

Closes #88.

I've adapted the DataclassWriter from https://github.com/msto/dataclass_io/ to work with Metrics.
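
To illustrate the idea of streamed writing described above, here is a minimal standard-library sketch. It is not the code from this PR: `RecordWriter` and `ExampleMetric` are hypothetical names, plain dataclasses stand in for Metrics, and the tab delimiter simply mirrors the default visible in the diff.

```python
import csv
from contextlib import AbstractContextManager
from dataclasses import asdict, dataclass, fields
from pathlib import Path
from typing import Any


@dataclass
class ExampleMetric:
    """Hypothetical record type used only for this illustration."""

    name: str
    count: int


class RecordWriter(AbstractContextManager):
    """Hypothetical writer that streams dataclass records to a delimited file."""

    def __init__(self, path: Path, record_class: type, delimiter: str = "\t") -> None:
        # newline="" lets the csv module control line endings itself.
        self._fout = path.open("w", newline="")
        self._writer = csv.DictWriter(
            self._fout,
            fieldnames=[f.name for f in fields(record_class)],
            delimiter=delimiter,
            lineterminator="\n",
        )
        # The header is written once, when the writer is constructed.
        self._writer.writeheader()

    def __exit__(self, exc_type: Any, exc_value: Any, traceback: Any) -> None:
        self.close()

    def close(self) -> None:
        """Close the underlying file handle."""
        self._fout.close()

    def write(self, record: Any) -> None:
        """Write a single record as one delimited row."""
        self._writer.writerow(asdict(record))
```

Used as `with RecordWriter(path, ExampleMetric) as writer:`, each `writer.write(...)` call emits one row, so records never need to be collected in memory before writing.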

@msto msto requested review from nh13, tfenne and TedBrookings May 5, 2024 21:00
@msto msto mentioned this pull request May 6, 2024
Member

@clintval clintval left a comment

I like the direction this is taking! Could you add some usage examples to the __doc__, and check that the docs compile correctly and look good too?

fgpyo/util/metric.py (outdated)
"""

delimiter: str = "\t"
comment: str = "#"
Member

Should we anticipate multiple comment prefixes?

Suggested change:
- comment: str = "#"
+ comment_prefixes: set[str] = {"#"}

Member

👍

Contributor Author

I think this is unlikely to occur in practice and makes the interface clunkier.

Could you provide an example of a file format that would have multiple prefixes? I can't think of one off the top of my head.

Member

UCSC BED files come to mind, but they also lack a header with named fields entirely. This is one of those API decisions that's hard to go back on since it will be a breaking change in the future, but not now.

Member

If I'm understanding the API correctly, this would not only handle mixed comment prefixes within a single file, but would also allow setting different comment prefixes across files?
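
To make the trade-off in this thread concrete, here is a small sketch of what accepting a set of comment prefixes could look like on the reading side. It is only an illustration under assumptions: `skip_comment_lines` is a hypothetical helper, not part of this PR, and the `track` prefix in the usage note is loosely modeled on the BED case mentioned above.

```python
from typing import Iterable, Iterator, Set


def skip_comment_lines(lines: Iterable[str], comment_prefixes: Set[str]) -> Iterator[str]:
    """Yield only the lines that do not start with any of the given prefixes."""
    # str.startswith accepts a tuple of prefixes, so one call covers every prefix.
    prefixes = tuple(comment_prefixes)
    for line in lines:
        if not line.startswith(prefixes):
            yield line
```

With a single-element set such as `{"#"}` this behaves like the single `comment` string; passing `{"#", "track"}` would additionally drop lines that begin with `track`.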

Member

@nh13 nh13 left a comment

Minor comments

fgpyo/util/metric.py (outdated)
"""

delimiter: str = "\t"
comment: str = "#"
Member

👍

@msto msto marked this pull request as draft May 27, 2024 16:46
@msto msto changed the base branch from main to ms_asdict May 27, 2024 16:46
@msto msto changed the title Add MetricWriter feat: Add MetricWriter class Jun 4, 2024
Base automatically changed from ms_asdict to ms_metric-writer-feature-branch June 4, 2024 17:26
This was referenced Jun 4, 2024
@msto msto force-pushed the ms_metric-writer branch 2 times, most recently from 837ebba to e712dfe Compare June 5, 2024 14:48
@msto msto changed the base branch from ms_metric-writer-feature-branch to ms_get-header June 5, 2024 14:48
@msto msto force-pushed the ms_metric-writer branch from e712dfe to 9677b57 Compare June 5, 2024 14:56
@msto msto changed the base branch from ms_get-header to ms_fix-rw-typing June 5, 2024 14:56
@msto msto force-pushed the ms_fix-rw-typing branch from 270c09d to c1c83fd Compare June 6, 2024 00:12
Base automatically changed from ms_fix-rw-typing to ms_metric-writer-feature-branch June 6, 2024 00:13

codecov bot commented Jun 6, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Please upload report for BASE (ms_metric-writer-assertions@30447a9). Learn more about missing BASE report.

Current head 032388d differs from pull request most recent head 8a6db00

Please upload reports for the commit 8a6db00 to get more accurate results.

Additional details and impacted files
@@                      Coverage Diff                       @@
##             ms_metric-writer-assertions     #107   +/-   ##
==============================================================
  Coverage                               ?   88.56%           
==============================================================
  Files                                  ?       16           
  Lines                                  ?     1775           
  Branches                               ?      378           
==============================================================
  Hits                                   ?     1572           
  Misses                                 ?      134           
  Partials                               ?       69           


@msto msto force-pushed the ms_metric-writer branch from 9677b57 to 5113f00 Compare June 6, 2024 00:15
@@ -466,3 +481,240 @@ def asdict(metric: Metric) -> Dict[str, Any]:
"The provided metric is not an instance of a `dataclass` or `attr.s`-decorated Metric "
f"class: {metric.__class__}"
)


class MetricWriter(Generic[MetricType], AbstractContextManager):
Contributor Author

@msto msto Jun 6, 2024

TODO docs

(Possibly in one more PR after the API is finalized. These are all landing in the feature branch #123; updates, including docs, can be added to that branch before a final merge into main.)

@msto msto force-pushed the ms_metric-writer branch from 007fb12 to 3a4ce99 Compare June 6, 2024 10:57
@msto msto force-pushed the ms_metric-writer branch 3 times, most recently from c6615d9 to 43f2f3e Compare June 6, 2024 13:34
@msto msto changed the base branch from ms_metric-writer-feature-branch to ms_metric-writer-assertions June 6, 2024 13:35
@msto msto force-pushed the ms_metric-writer branch from 43f2f3e to 8a6db00 Compare June 6, 2024 13:35
@msto msto marked this pull request as ready for review June 6, 2024 13:46

  Raises:
  ValueError: If the provided file does not include a header.
- ValueError: If the header of the provided file does not match the provided Metric.
+ ValueError: If the header of the provided file does not match the provided Metric (or list
+ of ordered fieldnames, if provided).
Member

  1. Change "or" to "and".
  2. Where does ordered_fieldnames get checked?
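
As a point of reference for the two `ValueError` cases in the docstring above, here is a minimal sketch of a header check. It is an assumption-laden illustration, not the PR's code: `assert_header_matches` is a hypothetical function that takes the first line of a file and the expected field names, which the caller would derive either from the Metric or from an explicit ordered list.

```python
from typing import List, Optional


def assert_header_matches(
    first_line: Optional[str],
    expected_fieldnames: List[str],
    delimiter: str = "\t",
) -> None:
    """Raise ValueError if the header is missing or differs from the expected fields."""
    # Case 1: the file is empty or its first line is blank, so there is no header.
    if first_line is None or first_line.strip() == "":
        raise ValueError("The provided file does not include a header.")

    # Case 2: the header exists but its names or their order differ.
    header = first_line.rstrip("\r\n").split(delimiter)
    if header != expected_fieldnames:
        raise ValueError(
            f"Header {header} does not match the expected fields {expected_fieldnames}."
        )
```

Comparing lists rather than sets means a header with the right names in the wrong order is also rejected, which is the behavior an explicit ordered list of field names would need.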

* All fieldnames specified in `include_fields` must be fields on `metric_class`. If this
argument is specified, fields will be returned in the order they appear in the list.
* All fieldnames specified in `exclude_fields` must be fields on `metric_class`. (This is
technically unnecessary, but is a safeguard against passing an incorrect list.)
Member

@nh13 nh13 Jun 6, 2024

how strongly do we feel about this safeguard?

Member

bump
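
For context on the safeguard being questioned in this thread, the rules in the quoted docstring can be sketched as follows. This is a hypothetical illustration with plain dataclasses: `select_fieldnames` and `ExampleMetric` are invented names, and treating the two arguments as mutually exclusive is an assumption of the sketch, not something stated in the quoted text.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ExampleMetric:
    """Hypothetical record type used only for this illustration."""

    name: str
    count: int
    mean: float


def select_fieldnames(
    record_class: type,
    include_fields: Optional[List[str]] = None,
    exclude_fields: Optional[List[str]] = None,
) -> List[str]:
    """Return the field names to write, validating both lists against the class."""
    all_fields = [f.name for f in fields(record_class)]

    # Assumption of this sketch: only one of the two arguments may be given.
    if include_fields is not None and exclude_fields is not None:
        raise ValueError("Specify only one of include_fields and exclude_fields.")

    # The safeguard: every requested name must be a real field, even for
    # exclude_fields, where an unknown name would otherwise be silently ignored.
    for fieldname in (include_fields or []) + (exclude_fields or []):
        if fieldname not in all_fields:
            raise ValueError(f"{fieldname!r} is not a field on {record_class.__name__}.")

    if include_fields is not None:
        # Fields are returned in the order they appear in the list.
        return list(include_fields)
    if exclude_fields is not None:
        return [name for name in all_fields if name not in exclude_fields]
    return all_fields
```

Without the check on `exclude_fields`, a misspelled name would leave the intended field in the output with no error, which is the failure mode the safeguard is meant to catch.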

@msto msto force-pushed the ms_metric-writer-assertions branch 2 times, most recently from 0fd747b to beee947 Compare June 6, 2024 16:00
Member

@nh13 nh13 left a comment

A few outstanding comments. I also think we want `write` to accept one or more metrics.

"""Close the underlying file handle."""
self._fout.close()

def write(self, metric: Metric) -> None:
Member

Suggested change:
- def write(self, metric: Metric) -> None:
+ def write(self, *metric: Metric) -> None:

why not make this varargs? Can the return be a MetricType? For both, see:

def write(cls, path: Path, *values: MetricType) -> None:


self._writer.writerow(row)

def writeall(self, metrics: Iterable[Metric]) -> None:
Member

Is this needed if we change write?
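
The question of whether a varargs `write` makes `writeall` redundant can be explored with a small sketch. The class below is hypothetical (`VarargsWriter` and `ExampleMetric` are invented for illustration, using plain dataclasses and the standard `csv` module) and is not the implementation under review.

```python
import csv
import io
from dataclasses import asdict, dataclass
from typing import Any, List, TextIO


@dataclass
class ExampleMetric:
    """Hypothetical record type used only for this illustration."""

    name: str
    count: int


class VarargsWriter:
    """Hypothetical writer whose write method accepts one or more records."""

    def __init__(self, fout: TextIO, fieldnames: List[str]) -> None:
        self._writer = csv.DictWriter(
            fout, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
        )
        self._writer.writeheader()

    def write(self, *records: Any) -> None:
        """Write each positional argument as one row."""
        for record in records:
            self._writer.writerow(asdict(record))


# One record, several records, or an unpacked list all go through the same method.
buffer = io.StringIO()
writer = VarargsWriter(buffer, ["name", "count"])
writer.write(ExampleMetric("a", 1))
writer.write(ExampleMetric("b", 2), ExampleMetric("c", 3))
writer.write(*[ExampleMetric("d", 4)])
```

One consideration when weighing the two designs: `write(*metrics)` builds the full argument tuple before the call, so passing a generator this way consumes it entirely before any row is written, whereas a method that takes an `Iterable` can write rows as they are produced.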

* All fieldnames specified in `include_fields` must be fields on `metric_class`. If this
argument is specified, fields will be returned in the order they appear in the list.
* All fieldnames specified in `exclude_fields` must be fields on `metric_class`. (This is
technically unnecessary, but is a safeguard against passing an incorrect list.)
Member

bump

@msto msto self-assigned this Sep 12, 2024
msto added a commit that referenced this pull request Oct 17, 2024
This PR introduces several of the assertion methods used when
constructing the `MetricWriter`. (See #107 for how they are used in
practice).
Base automatically changed from ms_metric-writer-assertions to ms_metric-writer-feature-branch October 17, 2024 16:48
wip

wip

wip

refactor: reorder

refactor: types

fix: PR suggestions

fix: typeerror

fix: format values

chore: typeguard

refactor: make assertions private

fix: Union and Optional to support 3.8,3.9

doc: update dataclass/Metric references

fix: use Type to support 3.8

fix: use List to support 3.8
@msto msto force-pushed the ms_metric-writer branch from 8a6db00 to 5ad5dfa Compare October 17, 2024 16:51
@msto msto merged commit 1077e14 into ms_metric-writer-feature-branch Oct 17, 2024
6 of 7 checks passed
@msto msto deleted the ms_metric-writer branch October 17, 2024 16:51
msto added a commit that referenced this pull request Oct 17, 2024
Closes #88.

I've adapted the `DataclassWriter` from
https://github.com/msto/dataclass_io/ to work with Metrics.
Development

Successfully merging this pull request may close these issues.

Add MetricWriter to permit streamed writing of metrics
3 participants