Ok, so I made a few changes in the sofar_nrt files.
Note: the Aqualink token can be ignored, as it should already be implemented in a previous merge.
Added a try/except in `main` that allows skipping a spotter when something is wrong.
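A minimal sketch of that pattern (the loop body and `main`/`process_wave_source_id` signatures here are hypothetical stand-ins, not the actual implementation):

```python
import logging

logger = logging.getLogger(__name__)

def process_wave_source_id(spotter):
    # Hypothetical stand-in for the real per-spotter processing.
    if spotter.get("broken"):
        raise ValueError("something is wrong with this spotter")
    return f"downloaded {spotter['id']}"

def main(spotters):
    results, skipped = [], {}
    for spotter in spotters:
        try:
            results.append(process_wave_source_id(spotter))
        except Exception as exc:
            # Skip this spotter but keep processing the rest.
            logger.error("Skipping %s: %s", spotter["id"], exc)
            skipped[spotter["id"]] = str(exc)
    return results, skipped
```

The point is simply that one failing spotter no longer aborts the whole run; the exception is logged and collected instead.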
Added a series of conditions in `process_wave_source_id` that skip a spotter if it is not at the right coordinates, if the deployment date is missing, if the API token is missing, if spotter IDs or locations are duplicated (to remove old deployments), ...
The main thing here is coordinates. The code checks for outliers (API coordinates 0.02-0.1 deg away from the metadata position) and errors (more than 0.1 deg away). Both thresholds can be changed.
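As a sketch of that classification, assuming a simple per-axis degree comparison (the actual code may compute a proper distance; the function name and threshold constants are hypothetical):

```python
# Thresholds matching the description: offsets of 0.02-0.1 deg are
# outliers, offsets above 0.1 deg are errors; both can be changed.
OUTLIER_DEG = 0.02
ERROR_DEG = 0.1

def classify_position(api_lat, api_lon, meta_lat, meta_lon,
                      outlier_deg=OUTLIER_DEG, error_deg=ERROR_DEG):
    """Return 'ok', 'outlier' or 'error' for one API coordinate pair."""
    offset = max(abs(api_lat - meta_lat), abs(api_lon - meta_lon))
    if offset > error_deg:
        return "error"
    if offset > outlier_deg:
        return "outlier"
    return "ok"
```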
If there is an outlier (of any duration) in the data but the spotter is back in position, the data is downloaded with a warning.
If there is an outlier at the end of the data, the data is downloaded up to the first outlier (and the download is coded to resume once the spotter is back in place).
If there is a coordinate error that is temporary (<7 days), the code assumes the spotter was moved to shore and they forgot to switch it off; no problem.
If the last data point is an error, it warns that this is the case and downloads data up until the error.
If the coordinate error persists for more than a week, it warns to check the coordinates with the facility.
(The choice of 1 week was due to a few spotters that were moved for 5 days, then placed back.)
In every case, the data up until the outlier or error is downloaded; the download will restart at a later date if the issue resolves, or a warning is raised to change something in the metadata if needed.
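The two mechanics behind the cases above can be sketched as helpers (these names and shapes are illustrative only, assuming each data point has already been classified as `"ok"`, `"outlier"` or `"error"`):

```python
from datetime import datetime, timedelta

PERSISTENCE = timedelta(days=7)  # coordinate errors shorter than this are tolerated

def last_good_index(statuses):
    """Index one past the last point before a trailing outlier/error run,
    i.e. how far the download can safely go."""
    cut = len(statuses)
    while cut > 0 and statuses[cut - 1] != "ok":
        cut -= 1
    return cut

def error_persists(first_error_time, now):
    """True when a coordinate error has lasted more than a week and the
    facility should be asked to check the coordinates."""
    return now - first_error_time > PERSISTENCE
```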
Messages: not only are the errors and warnings logged per spotter, the code also collects them all and gives a summary at the end of the log, which looks something like this:
2024-07-25 12:27:59.879313 :
Partial data download. The following buoys were skipped due to errors:
'SPOT-0411 ( Collaroy )': 'The location name is duplicated and data was not downloaded for either. Please remove old deployment from metadata.',
'SPOT-1462 ( Apollo Bay )': 'Last data point (2024-06-13 10:05:31) is an outlier (<10km). Data until 2024-06-13 07:05:31 has been downloaded, but download for later data, will only resume when buoy is back at location.',
'SPOT-1542 ( Mermaid Reef )': 'API error: missing token. Please note that institution names are case sensitive.'
The following warnings were raised:
'SPOT-1632 ( Lakes Entrance )': 'Deployment start date missing. Download of data skipped.',
'SPOT-0312 ( Boambee )': 'No data available BETWEEN 2024-01-01 00:00:00+00:00 AND 2024-02-01 00:00:00+00:00.',
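A summary like the one above could be assembled roughly as follows (a sketch only; the real collection structures and formatting in sofar_nrt may differ):

```python
from datetime import datetime

def build_summary(errors, warnings):
    """Assemble the end-of-log summary from per-spotter message dicts."""
    lines = [f"{datetime.now()} :"]
    if errors:
        lines.append("Partial data download. The following buoys were "
                     "skipped due to errors:")
        lines += [f"{buoy!r}: {msg!r}," for buoy, msg in errors.items()]
    if warnings:
        lines.append("The following warnings were raised:")
        lines += [f"{buoy!r}: {msg!r}," for buoy, msg in warnings.items()]
    return "\n".join(lines)
```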
Finally, the notification.
DevOps decided against setting up a notification system for this, so I went with logging a summary message at the end of each run. If at a later date someone wants to send that message as a notification (Slack, email or other), it can be done (but how to do that in prod needs to be discussed with them).
Oh yeah, and a unit test for duplicate spotter IDs. That requires the metadata in the test folder to be the one used.
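The duplicate-ID check and its test might look something like this (the helper name, metadata shape, and test data are hypothetical; the real test reads the metadata from the test folder):

```python
import unittest

def find_duplicate_spotter_ids(metadata):
    """Return spotter IDs that appear more than once in the metadata."""
    seen, dupes = set(), set()
    for entry in metadata:
        sid = entry["spotter_id"]
        if sid in seen:
            dupes.add(sid)
        seen.add(sid)
    return sorted(dupes)

class TestDuplicateSpotterIds(unittest.TestCase):
    def test_duplicate_ids_are_detected(self):
        # Illustrative metadata: an old deployment left in place
        # duplicates SPOT-0411.
        metadata = [
            {"spotter_id": "SPOT-0411", "location": "Collaroy"},
            {"spotter_id": "SPOT-0411", "location": "Collaroy (old)"},
            {"spotter_id": "SPOT-1462", "location": "Apollo Bay"},
        ]
        self.assertEqual(find_duplicate_spotter_ids(metadata), ["SPOT-0411"])
```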