Drop the .move accessor (#322)
* replace accessor with MovementDataset dataclass

* moved pre-save validations inside save_poses module

* deleted accessor code and associated tests

* define dataset structure in modular classes

* updated stale docstring for _validate_dataset()

* remove mentions of the accessor from the getting started guide

* dropped accessor use in examples

* ignore linkcheck for opensource licenses

* Revert "ignore linkcheck for opensource licenses"

This reverts commit c8f3498.

* use ds.sizes instead of ds.dims to suppress warning

* Add references

* remove movement_dataset.py module

---------

Co-authored-by: lochhh <[email protected]>
niksirbi and lochhh authored Oct 24, 2024
1 parent 6c32608 commit f7f3b48
Showing 18 changed files with 168 additions and 690 deletions.
58 changes: 24 additions & 34 deletions docs/source/getting_started/movement_dataset.md
@@ -195,7 +195,7 @@ For example, you can:
[data aggregation and broadcasting](xarray:user-guide/computation.html), and
- use `xarray`'s built-in [plotting methods](xarray:user-guide/plotting.html).

As an example, here's how you can use the `sel` method to select subsets of
As an example, here's how you can use {meth}`xarray.Dataset.sel` to select subsets of
data:

```python
@@ -223,55 +223,45 @@ position = ds.position.sel(
) # the output is a data array
```

### Accessing movement-specific functionality
### Modifying movement datasets

`movement` extends `xarray`'s functionality with a number of convenience
methods that are specific to `movement` datasets. These `movement`-specific methods are accessed using the
`move` keyword.
Datasets can be modified by adding new **data variables** and **attributes**,
or updating existing ones.

For example, to compute the velocity and acceleration vectors for all individuals and keypoints across time, we provide the `move.compute_velocity` and `move.compute_acceleration` methods:
Let's imagine we want to compute the instantaneous velocity of all tracked
points and store the results within the same dataset, for convenience.

```python
velocity = ds.move.compute_velocity()
acceleration = ds.move.compute_acceleration()
```
from movement.analysis.kinematics import compute_velocity

The `movement`-specific functionalities are implemented in the
{class}`movement.move_accessor.MovementDataset` class, which is an [accessor](https://docs.xarray.dev/en/stable/internals/extending-xarray.html) to the
underlying {class}`xarray.Dataset` object. Defining a custom accessor is convenient
to avoid conflicts with `xarray`'s built-in methods.
# compute velocity from position
velocity = compute_velocity(ds.position)
# add it to the dataset as a new data variable
ds["velocity"] = velocity

### Modifying movement datasets
# we could have also done both steps in a single line
ds["velocity"] = compute_velocity(ds.position)

The `velocity` and `acceleration` produced in the above example are {class}`xarray.DataArray` objects, with the same **dimensions** as the
original `position` **data variable**.
# we can now access velocity like any other data variable
ds.velocity
```

In some cases, you may wish to
add these or other new **data variables** to the `movement` dataset for
convenience. This can be done by simply assigning them to the dataset
with an appropriate name:
The output of {func}`movement.analysis.kinematics.compute_velocity` is an {class}`xarray.DataArray` object,
with the same **dimensions** as the original `position` **data variable**,
so adding it to the existing `ds` makes sense and works seamlessly.

```python
ds["velocity"] = velocity
ds["acceleration"] = acceleration
We can also update existing **data variables** in-place, using {meth}`xarray.Dataset.update`. For example, if we wanted to update the `position`
and `velocity` arrays in our dataset, we could do:

# we can now access these using dot notation on the dataset
ds.velocity
ds.acceleration
```python
ds.update({"position": position_filtered, "velocity": velocity_filtered})
```

Custom **attributes** can also be added to the dataset:
Custom **attributes** can be added to the dataset with:

```python
ds.attrs["my_custom_attribute"] = "my_custom_value"

# we can now access this value using dot notation on the dataset
ds.my_custom_attribute
```

We can also update existing **data variables** in-place, using the `update()` method. For example, if we wanted to update the `position`
and `velocity` arrays in our dataset, we could do:

```python
ds.update({"position": position_filtered, "velocity": velocity_filtered})
```
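
The dataset-modification pattern documented in this file can be tried without `movement` installed. The following is a minimal sketch on a synthetic `xarray.Dataset`: the dimension names, array shapes, and the use of `DataArray.differentiate` as a stand-in for `compute_velocity` are assumptions made for illustration, not a statement of how the library computes velocity.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a movement dataset (names and shapes are assumptions)
rng = np.random.default_rng(0)
dims = ("time", "individuals", "keypoints", "space")
ds = xr.Dataset(
    data_vars={"position": (dims, rng.random((5, 2, 3, 2)))},
    coords={"time": np.arange(5) / 10.0, "space": ["x", "y"]},
)

# Stand-in for compute_velocity: differentiate position with respect to time
velocity = ds.position.differentiate("time")

# Add the result to the dataset as a new data variable
ds["velocity"] = velocity

# Update an existing data variable in-place with Dataset.update
ds.update({"position": ds.position * 2})

# Add a custom attribute
ds.attrs["my_custom_attribute"] = "my_custom_value"

# ds.sizes maps dimension names to lengths
print(ds.sizes["time"])
print(ds.velocity.dims == ds.position.dims)
```

Because the new array shares its dimensions with `position`, assigning it by name requires no reshaping or realignment.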
45 changes: 21 additions & 24 deletions examples/compute_kinematics.py
@@ -117,27 +117,22 @@
# %%
# Compute displacement
# ---------------------
# The :mod:`movement.analysis.kinematics` module provides functions to compute
# various kinematic quantities,
# such as displacement, velocity, and acceleration.
# We can start off by computing the distance travelled by the mice along
# their trajectories.
# For this, we can use the ``compute_displacement`` method of the
# ``move`` accessor.
displacement = ds.move.compute_displacement()
# their trajectories:

# %%
# This method will return a data array equivalent to the ``position`` one,
# but holding displacement data along the ``space`` axis, rather than
# position data.

# %%
# Notice that we could also compute the displacement (and all the other
# kinematic variables) using the :mod:`movement.analysis.kinematics` module:

# %%
import movement.analysis.kinematics as kin

displacement_kin = kin.compute_displacement(position)
displacement = kin.compute_displacement(position)

# %%
# The :func:`movement.analysis.kinematics.compute_displacement`
# function will return a data array equivalent to the ``position`` one,
# but holding displacement data along the ``space`` axis, rather than
# position data.
#
# The ``displacement`` data array holds, for a given individual and keypoint
# at timestep ``t``, the vector that goes from its previous position at time
# ``t-1`` to its current position at time ``t``.
@@ -271,13 +266,14 @@
# ----------------
# We can easily compute the velocity vectors for all individuals in our data
# array:
velocity = ds.move.compute_velocity()
velocity = kin.compute_velocity(position)

# %%
# The ``velocity`` method will return a data array equivalent to the
# ``position`` one, but holding velocity data along the ``space`` axis, rather
# than position data. Notice how ``xarray`` nicely deals with the different
# individuals and spatial dimensions for us! ✨
# The :func:`movement.analysis.kinematics.compute_velocity`
# function will return a data array equivalent to
# the ``position`` one, but holding velocity data along the ``space`` axis,
# rather than position data. Notice how ``xarray`` nicely deals with the
# different individuals and spatial dimensions for us! ✨

# %%
# We can plot the components of the velocity vector against time
@@ -350,8 +346,9 @@
# %%
# Compute acceleration
# ---------------------
# We can compute the acceleration of the data with an equivalent method:
accel = ds.move.compute_acceleration()
# Let's now compute the acceleration for all individuals in our data
# array:
accel = kin.compute_acceleration(position)

# %%
# and plot of the components of the acceleration vector ``ax``, ``ay`` per
@@ -375,8 +372,8 @@
fig.tight_layout()

# %%
# The can also represent the magnitude (norm) of the acceleration vector
# for each individual:
# We can also compute and visualise the magnitude (norm) of the
# acceleration vector for each individual:
fig, axes = plt.subplots(3, 1, sharex=True, sharey=True)
for mouse_name, ax in zip(accel.individuals.values, axes, strict=False):
    # compute magnitude of the acceleration vector for one mouse
102 changes: 33 additions & 69 deletions examples/filter_and_interpolate.py
@@ -9,6 +9,8 @@
# Imports
# -------
from movement import sample_data
from movement.analysis.kinematics import compute_velocity
from movement.filtering import filter_by_confidence, interpolate_over_time

# %%
# Load a sample dataset
@@ -73,35 +75,19 @@
# %%
# Filter out points with low confidence
# -------------------------------------
# Using the
# :meth:`filter_by_confidence()\
# <movement.move_accessor.MovementDataset.filtering_wrapper>`
# method of the ``move`` accessor,
# we can filter out points with confidence scores below a certain threshold.
# The default ``threshold=0.6`` will be used when ``threshold`` is not
# provided.
# This method will also report the number of NaN values in the dataset before
# and after the filtering operation by default (``print_report=True``).
# Using the :func:`movement.filtering.filter_by_confidence` function from the
# :mod:`movement.filtering` module, we can filter out points with confidence
# scores below a certain threshold. This function takes ``position`` and
# ``confidence`` as required arguments, and accepts an optional ``threshold``
# parameter, which defaults to ``threshold=0.6`` unless specified otherwise.
# The function will also report the number of NaN values in the dataset before
# and after the filtering operation by default, but you can disable this
# by passing ``print_report=False``.
#
# We will use :meth:`xarray.Dataset.update` to update ``ds`` in-place
# with the filtered ``position``.

ds.update({"position": ds.move.filter_by_confidence()})

# %%
# .. note::
# The ``move`` accessor :meth:`filter_by_confidence()\
# <movement.move_accessor.MovementDataset.filtering_wrapper>`
# method is a convenience method that applies
# :func:`movement.filtering.filter_by_confidence`,
# which takes ``position`` and ``confidence`` as arguments.
# The equivalent function call using the
# :mod:`movement.filtering` module would be:
#
# .. code-block:: python
#
# from movement.filtering import filter_by_confidence
#
# ds.update({"position": filter_by_confidence(position, confidence)})
ds.update({"position": filter_by_confidence(ds.position, ds.confidence)})

# %%
# We can see that the filtering operation has introduced NaN values in the
@@ -120,36 +106,16 @@
# %%
# Interpolate over missing values
# -------------------------------
# Using the
# :meth:`interpolate_over_time()\
# <movement.move_accessor.MovementDataset.filtering_wrapper>`
# method of the ``move`` accessor,
# we can interpolate over the gaps we've introduced in the pose tracks.
# Using the :func:`movement.filtering.interpolate_over_time` function from the
# :mod:`movement.filtering` module, we can interpolate over gaps
# we've introduced in the pose tracks.
# Here we use the default linear interpolation method (``method=linear``)
# and interpolate over gaps of 40 frames or less (``max_gap=40``).
# The default ``max_gap=None`` would interpolate over all gaps, regardless of
# their length, but this should be used with caution as it can introduce
# spurious data. The ``print_report`` argument acts as described above.

ds.update({"position": ds.move.interpolate_over_time(max_gap=40)})

# %%
# .. note::
# The ``move`` accessor :meth:`interpolate_over_time()\
# <movement.move_accessor.MovementDataset.filtering_wrapper>`
# is also a convenience method that applies
# :func:`movement.filtering.interpolate_over_time`
# to the ``position`` data variable.
# The equivalent function call using the
# :mod:`movement.filtering` module would be:
#
# .. code-block:: python
#
# from movement.filtering import interpolate_over_time
#
# ds.update({"position": interpolate_over_time(
# position_filtered, max_gap=40
# )})
ds.update({"position": interpolate_over_time(ds.position, max_gap=40)})

# %%
# We see that all NaN values have disappeared, meaning that all gaps were
@@ -176,27 +142,25 @@
# %%
# Filtering multiple data variables
# ---------------------------------
# All :mod:`movement.filtering` functions are available via the
# ``move`` accessor. These ``move`` accessor methods operate on the
# ``position`` data variable in the dataset ``ds`` by default.
# There is also an additional argument ``data_vars`` that allows us to
# specify which data variables in ``ds`` to filter.
# When multiple data variable names are specified in ``data_vars``,
# the method will return a dictionary with the data variable names as keys
# and the filtered DataArrays as values, otherwise it will return a single
# DataArray that is the filtered data.
# This is useful when we want to apply the same filtering operation to
# We can also apply the same filtering operation to
# multiple data variables in ``ds`` at the same time.
#
# For instance, to filter both ``position`` and ``velocity`` data variables
# in ``ds``, based on the confidence scores, we can specify
# ``data_vars=["position", "velocity"]`` in the method call.
# As the filtered data variables are returned as a dictionary, we can once
# again use :meth:`xarray.Dataset.update` to update ``ds`` in-place
# in ``ds``, based on the confidence scores, we can specify a dictionary
# with the data variable names as keys and the corresponding filtered
# DataArrays as values. Then we can once again use
# :meth:`xarray.Dataset.update` to update ``ds`` in-place
# with the filtered data variables.

ds["velocity"] = ds.move.compute_velocity()
filtered_data_dict = ds.move.filter_by_confidence(
    data_vars=["position", "velocity"]
)
ds.update(filtered_data_dict)
# Add velocity data variable to the dataset
ds["velocity"] = compute_velocity(ds.position)

# Create a dictionary mapping data variable names to filtered DataArrays
# We disable report printing for brevity
update_dict = {
    var: filter_by_confidence(ds[var], ds.confidence, print_report=False)
    for var in ["position", "velocity"]
}

# Use the dictionary to update the dataset in-place
ds.update(update_dict)
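
The dictionary-based update at the end of this example can be reproduced on synthetic data using only `xarray`. In the sketch below, `mask_low_confidence` is a hypothetical helper standing in for `filter_by_confidence`: it sets values to NaN wherever confidence falls below the threshold, which is assumed behaviour rather than the library's confirmed implementation. The dimension names and values are likewise invented for illustration.

```python
import numpy as np
import xarray as xr


def mask_low_confidence(data, confidence, threshold=0.6):
    """Hypothetical helper: NaN-out values where confidence < threshold."""
    # xarray broadcasts `confidence` against `data` by dimension name
    return data.where(confidence >= threshold)


# Synthetic dataset: 4 time points, 2 keypoints, 2 spatial dimensions
ds = xr.Dataset(
    data_vars={
        "position": (("time", "keypoints", "space"), np.ones((4, 2, 2))),
        "confidence": (
            ("time", "keypoints"),
            np.array([[0.9, 0.2], [0.8, 0.7], [0.1, 0.95], [0.6, 0.5]]),
        ),
    },
    coords={"time": np.arange(4)},
)

# Stand-in for compute_velocity (an assumption, as noted above)
ds["velocity"] = ds.position.differentiate("time")

# Map data variable names to their filtered DataArrays
update_dict = {
    var: mask_low_confidence(ds[var], ds.confidence)
    for var in ["position", "velocity"]
}

# Update the dataset in-place
ds.update(update_dict)

print(int(ds.position.isnull().sum()))
print(int(ds.velocity.isnull().sum()))
```

Building the dictionary explicitly replaces the removed `data_vars` argument of the accessor: the caller now decides which variables to filter and passes each one to the filtering function directly.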