⭐️ IBTRACS dataset adapter #493
Associated with #493; fixes the deployed dataset in the S3 repo. The code changes update the dataset URL in the `gdp1h` function to version 2.01.1, so that the latest version of the dataset is used. The previous URL was "https://noaa-oar-hourly-gdp-pds.s3.amazonaws.com/latest/gdp-v2.01.zarr"; the new URL is "https://noaa-oar-hourly-gdp-pds.s3.amazonaws.com/latest/gdp-v2.01.1.zarr".
Is it normal that the tests take 40 minutes?
Not really, but the AOML servers sometimes get overloaded. When that happens the exponential backoff kicks in, and the tests can take a very long time if the server doesn't recover in time.
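To illustrate why an overloaded server can stretch a test run so much, here is a minimal sketch of retrying with exponential backoff. It is not the project's actual retry code: `retry_with_backoff`, `total_wait`, and all the delay parameters are hypothetical names and values chosen for the example.

```python
import time


def retry_with_backoff(fetch, max_attempts=5, base_delay=1.0, max_delay=60.0,
                       sleep=time.sleep):
    """Call fetch() until it succeeds, doubling the wait after each failure.

    All names and default values here are illustrative assumptions.
    """
    for attempt in range(max_attempts):
        try:
            return fetch()
        except OSError:
            # Give up after the last allowed attempt.
            if attempt == max_attempts - 1:
                raise
            # Wait 1, 2, 4, ... seconds, capped at max_delay.
            sleep(min(base_delay * 2 ** attempt, max_delay))


def total_wait(max_attempts, base_delay=1.0, max_delay=60.0):
    """Worst-case total sleep time if every attempt except the last fails."""
    return sum(min(base_delay * 2 ** i, max_delay)
               for i in range(max_attempts - 1))
```

With these assumed defaults, each failing download adds its own waiting time, and a test suite that downloads many files can accumulate a long total delay while the server is struggling.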
25ce09a to bc000cc
@selipot this PR is ready for review. I've also started on the example notebook repo for the dataset:
clouddrift/datasets.py
Outdated
xarray.Dataset
IBTRACS dataset as a ragged array.

Standard usage of the dataset.
Why this line? Should we add a simple example, `ds = ibtracs()`?
I am not able to generate version v03r09. I get the following error:
clouddrift/datasets.py
Outdated
Parameters
----------
version : "v03r09", "v04r00", "v04r01" (default)
Drop support for version 3.
clouddrift/datasets.py
Outdated
----------
version : "v03r09", "v04r00", "v04r01" (default)
Specify the dataset version to retrieve. Default to the latest version.
kind: "ACTIVE", "ALL", "EP", "NA", "NI", "SA", "SI", "SP", "WP", "SINCE_1980", "LAST_3_YEARS" (default)
Does using "ACTIVE" or "LAST_3_YEARS" re-generate the ragged array or not? I think it should. So maybe disable caching for this dataset?
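One way to act on this suggestion is to skip the cached file whenever the requested subset changes over time. The sketch below is only an illustration of that idea, not the adapter's code: `TIME_DEPENDENT_KINDS` and `should_reuse_cache` are hypothetical names, and the set contains only the two kinds named in the comment above.

```python
import os

# Hypothetical grouping: the two kinds singled out in the review comment.
# Other kinds may also change between releases; that is not handled here.
TIME_DEPENDENT_KINDS = {"ACTIVE", "LAST_3_YEARS"}


def should_reuse_cache(kind, cache_path, exists=os.path.exists):
    """Return True only if a cached file exists and the subset is static."""
    if kind in TIME_DEPENDENT_KINDS:
        # Always regenerate the ragged array for moving time windows.
        return False
    return exists(cache_path)
```

A caller would regenerate the ragged array whenever this returns `False`, which keeps caching for the basin subsets while avoiding stale results for the time-windowed ones.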
@kevinsantana11 what else do we need to do to merge?
Add docstring to the
8f1bfb1 to b814b06
Found some issues when comparing the ragged and un-ragged datasets. Still need to investigate further, but I've added a test that should pass once the issue is fixed.
On further investigation, it turns out there is no discrepancy between the ragged and original datasets. Last night I noticed that the data variables had different lengths, but later realized that the original dataset's data variables span the whole dataset's observation datetime range. When the variables are compared using the trimmed length of the original array, the test passes, since the ragged and original data arrays contain the same data. I added this validation to the tests, along with a check that the remainder of each variable contains only NaN values.
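The comparison described above can be sketched as follows. This is a simplified illustration rather than the test added in the PR: `check_row` is a hypothetical helper, and it assumes each row of the original variable is a 1-D float array padded with NaN after the valid observations.

```python
import numpy as np


def check_row(original_row, ragged_values):
    """Compare one padded row of the original data with its ragged counterpart.

    Hypothetical helper: assumes the original row is NaN-padded after the
    valid observations.
    """
    n = len(ragged_values)
    # The trimmed original row must hold the same data as the ragged values.
    head_matches = np.array_equal(original_row[:n], ragged_values, equal_nan=True)
    # Everything past the trimmed length must be padding only.
    tail_is_nan = bool(np.all(np.isnan(original_row[n:])))
    return head_matches and tail_is_nan
```

Checking the tail as well as the head guards against a ragged row that was trimmed too early, which a head-only comparison would miss.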