RaggedArray from numerical #40

Open
selipot opened this issue Oct 13, 2022 · 3 comments

Comments

@selipot
Member

selipot commented Oct 13, 2022

I am following the example dataformat-numerical.ipynb to convert the output of an Ocean Parcels simulation to a ragged array and save it to a NetCDF file, but I do not understand how the time variable is handled or whether its units can be specified. The NetCDF file written by Parcels contains the variable time in units of seconds since a pivot date, but the NetCDF file written by clouddrift after conversion to a ragged array seems to be in minutes since the origin of the experiment. I dug through dataformat.py to understand but could not figure it out.

@milancurcic
Member

I'll play with it and let you know what I find.

@milancurcic milancurcic self-assigned this Oct 13, 2022
@selipot
Member Author

selipot commented Oct 19, 2022

The latest version of Ocean Parcels now writes its output in zarr format; see https://github.com/OceanParcels/parcels/releases/tag/v2.4.0. It is a priority to write a new recipe that converts such zarr output (still written as a sparse 2D array) into a RaggedArray. We should also add functionality to write the RaggedArray to zarr with RaggedArray.to_zarr().
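
To make the sparse-2D-to-ragged conversion concrete, here is a minimal sketch in plain Python. It is not the clouddrift recipe: the helper `pad_to_ragged`, the variable names, and the assumption that missing observations are marked with NaN are all illustrative.

```python
import math


def pad_to_ragged(padded):
    """Flatten a NaN-padded 2D list (trajectory x observation) into a ragged layout.

    Hypothetical helper for illustration only. Returns the valid values
    concatenated into one flat list, plus the number of valid values per row.
    """
    data = []
    rowsize = []
    for row in padded:
        # Keep only the observations that are not NaN padding.
        valid = [v for v in row if not math.isnan(v)]
        data.extend(valid)
        rowsize.append(len(valid))
    return data, rowsize


nan = float("nan")
# Three trajectories of different lengths, padded with NaN (assumed layout).
lon = [
    [10.0, 10.1, 10.2],
    [20.0, nan, nan],
    [30.0, 30.1, nan],
]

flat_lon, rowsize = pad_to_ragged(lon)
```

The flat list together with the per-trajectory lengths is enough to recover each trajectory by slicing, which is the basic idea behind a ragged array.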

@philippemiron
Contributor

philippemiron commented Oct 25, 2022

> I am following the example dataformat-numerical.ipynb to convert the output of an Ocean Parcels simulation to a ragged array and save it to a NetCDF file, but I do not understand how the time variable is handled or whether its units can be specified. The NetCDF file written by Parcels contains the variable time in units of seconds since a pivot date, but the NetCDF file written by clouddrift after conversion to a ragged array seems to be in minutes since the origin of the experiment. I dug through dataformat.py to understand but could not figure it out.

When you open the NetCDF file with decode_times=False, you get the array of raw "offsets" directly. In that example, I then set the time attributes to 'long_name': 'Time in days' and 'units': 'days since 2021-01-01'. The 'units' attribute is later recognized by the NetCDF library, which can convert the offsets back to dates if needed.
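
As a rough illustration of how a 'units' string of the form "<unit> since <origin>" relates raw offsets to dates, here is a standard-library sketch. The helpers `parse_units`, `decode`, and `reencode` are hypothetical and are not part of clouddrift, xarray, or the NetCDF library; the sample offsets are made up.

```python
from datetime import datetime, timedelta

# Seconds per supported time unit (only these four are handled in this sketch).
SECONDS_PER_UNIT = {"seconds": 1.0, "minutes": 60.0, "hours": 3600.0, "days": 86400.0}


def parse_units(units):
    """Split '<unit> since <ISO date>' into (seconds per unit, origin datetime)."""
    unit, _, origin = units.partition(" since ")
    return SECONDS_PER_UNIT[unit.strip()], datetime.fromisoformat(origin.strip())


def decode(offsets, units):
    """Convert raw offsets to datetimes using the units string."""
    scale, origin = parse_units(units)
    return [origin + timedelta(seconds=o * scale) for o in offsets]


def reencode(offsets, src_units, dst_units):
    """Express the same instants as offsets relative to a different units string."""
    dst_scale, dst_origin = parse_units(dst_units)
    return [
        (t - dst_origin).total_seconds() / dst_scale
        for t in decode(offsets, src_units)
    ]


# Made-up offsets: 0 h, 12 h, and 24 h after the origin, stored in seconds.
raw = [0.0, 43200.0, 86400.0]
src = "seconds since 2021-01-01"
dst = "days since 2021-01-01"

as_days = reencode(raw, src, dst)  # same instants, now in days
```

The point is that the offsets alone carry no meaning; changing the 'units' attribute without rescaling the values changes the dates they represent.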

For the data used in the example Notebook:
[Image: Screen Shot 2022-10-24 at 21 42 49]

But as I said at the top of the Notebook, the format is very close to the output format of Ocean Parcels and OpenDrift, so with a Parcels file the origin might be different. I don't remember whether the default is a constant origin or the start of the experiment.

@kevinsantana11 kevinsantana11 added the question Further information is requested label Apr 11, 2024