Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Objectify everything #19

Open
hdrake opened this issue Oct 13, 2021 · 1 comment
Open

Objectify everything #19

hdrake opened this issue Oct 13, 2021 · 1 comment
Assignees
Labels
enhancement New feature or request

Comments

@hdrake
Copy link
Collaborator

hdrake commented Oct 13, 2021

Most of the processing currently consists of functions that take in a xr.DataArray and output a processed xr.DataArray. This process is not very consistent: in some cases the input DataArray is itself is modified and then returned; in other cases a new DataArray is created, filled with modified data from the original DataArray, and then returned. This makes it really difficult to track what is going on at each step of the process.

I recommend we make a child class ctdArray(xr.DataArray) that inherits all of the nice properties of xr.DataArrayss but for which we can define additional methods. Then, each of the processing steps will be a method of that class, e.g. cleanup(self).

A co-benefit of this approach is that all of the meta-data can be directly stored in the data attributes and so the output will contain a record of the exact parameter values and function calls used to produce the output (basically a "log" within the data file itself).

I'll see if I can get some version of this working in the next few days. It will probably be a breaking change but one that is definitely worth it.

@hdrake hdrake self-assigned this Oct 13, 2021
@hdrake hdrake added the enhancement New feature or request label Oct 13, 2021
@gunnarvoet
Copy link
Owner

Citing @hdrake from #20 on why working with xarrays accessors is somewhat impractical:

Just a note that I initially had tried to implement this by defining CTDset as an extension (or "accessor") of xarray objects, but it was too unwieldy to make it worth it. Specifically, the accessors only had access to a shallow copy of the dataset ds via self._obj (i.e. in ds.ctd._obj). But when I tried to define methods that would modify the data, it only modified the attribute self._obj and not the dataset itself ds. I think this is a fundamental limitation of xarray accessors and not a mistake on my part, and would make this path prohibitive. This is unfortunate, because it would be really nice if CTDset had the functionality of a subclass of xr.Dataset and could define its own methods (see also #19).

That said, there are still a lot of benefits of xr.Datasets as a data structure to base the package on, even without being able to define our own class and methods (functions will do just fine).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants