Add transforms module with scale function #384
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #384   +/-   ##
=======================================
  Coverage   99.80%   99.80%
=======================================
  Files          14       15    +1
  Lines        1025     1048   +23
=======================================
+ Hits         1023     1046   +23
  Misses          2        2
Thanks for your first `movement` contribution, @stellaprins!
This is really well done and thoroughly tested. I do have an alternative suggestion for the implementation, though:
- I think the `scale` function (and any future linear transforms of this kind) should only work on data arrays with a `space` dimension (with Cartesian coordinates), and broadcasting should happen only along that dimension. Please see my specific comments for details.
- I believe the new attribute should be called `space_unit`, mirroring the existing `time_unit` attribute we populate when loading a dataset. In the future, I’m inclined to merge these two into an attribute named `units` that accepts a dictionary mapping dimension names to units (as you mentioned in your PR description). However, that should be handled in a separate issue/PR, possibly in conjunction with the `pint-xarray` issue. For now, renaming `unit` to `space_unit` is perfectly fine.
On the same topic, since we are introducing a new attribute, I wonder if it would be worth populating it directly when a dataset is loaded from a file, as we do for `time_unit`. For most of our supported formats, `space_unit` would be `"pixels"`, with the possible exception of `"Anipose"` (I need to double-check). I have opened an issue to keep track of this idea.
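To make the first suggestion concrete, here is a minimal sketch of scaling that broadcasts only along the `space` dimension and records a `space_unit` attribute. It is not the code from this PR: the function name `scale_along_space`, its error messages, and the example data are assumptions made for illustration only.

```python
import numpy as np
import xarray as xr


def scale_along_space(data, factor=1.0, space_unit=None):
    """Hypothetical sketch: scale `data` along its `space` dimension only."""
    if "space" not in data.dims:
        raise ValueError("Input data must have a 'space' dimension.")

    n_space = data.sizes["space"]
    # Accept either a single number or one factor per spatial coordinate.
    # np.broadcast_to raises ValueError for any other shape.
    factor_per_axis = np.broadcast_to(
        np.asarray(factor, dtype=float), (n_space,)
    )
    # Naming the dimension makes xarray broadcast along `space` only,
    # regardless of where `space` sits in `data.dims`.
    scaled = data * xr.DataArray(factor_per_axis, dims="space")

    if space_unit is not None:
        scaled.attrs["space_unit"] = space_unit
    else:
        # No unit given: make sure no stale unit is left on the result.
        scaled.attrs.pop("space_unit", None)
    return scaled


# Made-up 3D example data: 2 time points, coordinates (x, y, z).
data = xr.DataArray(
    np.ones((2, 3)),
    dims=("time", "space"),
    coords={"space": ["x", "y", "z"]},
)
scaled = scale_along_space(data, factor=[1.0, 2.0, 0.5], space_unit="cm")
uniform = scale_along_space(scaled, factor=2.0)
```

With this approach a factor of length 3 is matched to `x`, `y`, `z` by dimension name rather than by array position, so the same call works whether `space` is the last dimension or not.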
… factor shape (instead of length)
…f timepoints, individuals, and keypoints
I only have two remaining minor comments.
I think we can get rid of a paragraph in the docstring, and I think we should add a simple test to confirm that scaling data arrays with 3D space (x, y, z) works (it indeed does).
I'm pre-approving this, so feel free to merge once you've dealt with these comments.
…ins/movement into sp/366-transforms-scale
This reverts commit 7d5423d.
Quality Gate passed
@stellaprins It looks like there's a GH-wide issue with pull requests, which is why this has been kicked back out of the merge queue (even though it passes the tests!). Once the status clears (likely not till tomorrow) try hitting the button again!
41942e1
Description
What is this PR
Why is this PR needed?
Enable users to convert data arrays expressed in pixels to SI units (like meters). Users can scale data to a known reference size (e.g. the size of a cage) and add appropriate units. This allows distances to be represented in standard units instead of just the number of pixels.
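As a rough illustration of the conversion, the scaling factor can be derived from a reference object of known size: divide its real-world length by its length in pixels. The helper name `factor_from_reference`, the pixel coordinates, and the cage size below are all made up for this example.

```python
import math


def factor_from_reference(point_a, point_b, known_length):
    """Return real-world units per pixel from a reference of known length.

    `point_a` and `point_b` are the pixel coordinates of the two ends of
    the reference object; `known_length` is its real-world length.
    """
    pixel_length = math.dist(point_a, point_b)
    if pixel_length == 0:
        raise ValueError("Reference points must not coincide.")
    return known_length / pixel_length


# Hypothetical cage edge spanning 500 pixels and known to be 0.5 m long.
metres_per_pixel = factor_from_reference((100, 50), (500, 350), 0.5)

# A distance of 250 pixels then corresponds to 250 * metres_per_pixel metres.
distance_m = 250 * metres_per_pixel
```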
What does this PR do?
Adds a `transforms` module with a `scale` function. The `scale` function scales data (`xarray.DataArray`) by a given factor with an optional unit (`str | None`). Units can be any strings (e.g. "elephants") and will be added to `xarray.DataArray.attrs["unit"]`. Passing `None` as a unit (explicitly or by default) "dequantifies" the data (i.e. drops `.attrs["unit"]`).

I've looked at `pint-xarray`, but as it stands units are simply strings. Another option using `pint-xarray`, as an alternative to what @niksirbi mentioned in #141 (i.e. something like `data.pint.quantify({'distance': 'metres', 'time': 'seconds'})`), could be to use the `.attrs['units']` entry for each data variable (see below).

Unit-aware arithmetic in Xarray, via pint
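For illustration, the unit handling described above might look roughly like the sketch below. This is not the code added in this PR: `scale_sketch` is a hypothetical name, and the body is an assumption based only on the description (multiply by a factor, store the unit string in `attrs["unit"]`, drop it when the unit is `None`).

```python
import numpy as np
import xarray as xr


def scale_sketch(data, factor=1.0, unit=None):
    """Hypothetical stand-in for the `scale` function described above."""
    scaled = data * factor
    if unit is not None:
        # Units are plain strings; nothing validates them.
        scaled.attrs["unit"] = unit
    else:
        # "Dequantify": pop with a default works whether or not xarray
        # carried the attribute over from the input.
        scaled.attrs.pop("unit", None)
    return scaled


# Made-up positions in pixels: 2 time points, coordinates (x, y).
positions_px = xr.DataArray(
    np.array([[100.0, 200.0], [150.0, 250.0]]),
    dims=("time", "space"),
    coords={"space": ["x", "y"]},
)

positions_m = scale_sketch(positions_px, factor=0.001, unit="metres")
dequantified = scale_sketch(positions_m, factor=2.0)
```

Because the unit is only a string attribute, nothing prevents scaling data twice or attaching a unit that does not match the factor; unit-aware arithmetic would need something like `pint-xarray`.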
References
part of #366.
How has this PR been tested?
The scale function has been tested with various unit tests to ensure it works correctly. These tests include:
Is this a breaking change?
No.
Does this PR require an update to the documentation?
Docstrings have been added to the module and all functions. No further documentation needed.
Checklist: