Define necessary data parameter descriptors to be used in STAC definitions for data #7

Open
santilland opened this issue Jan 30, 2023 · 10 comments
Labels
enhancement New feature or request

Comments

@santilland
Collaborator

santilland commented Jan 30, 2023

WIP: STAC definition document:
https://docs.google.com/spreadsheets/d/1Rzygo7mt-d5Sb1OtvjTh270sKg-SgFkH-uI8GI2l2-M/edit?usp=sharing

Before deciding on any STAC structure or specific implementation, we want to identify and define all the parameters we require. The goal is to make sure we can represent the data as we currently do in the dashboard, plus additional information that is currently missing. As we see in our project as well as in others, there is a divide between catalog data (which is more bare-bones) and additional data, maintained separately, that describes how to present the data.
We want to try and define all of the parameters that are handled so that we can agree on how they can or should be included in the STAC descriptions.

This first entry will be edited to maintain a list of all the collected inputs provided as comments.

  • name
  • description
  • theme(s): the overall category this indicator belongs to; could also be considered a tag
  • story: reference to markdown file providing extensive information with media content
  • thumbnail: url to small image
  • preview: url to medium sized image
  • extent: area covered
  • country: when indicator specific to a country
  • city: when indicator specific to a city
  • siteName: when there are different locations within a city
  • data list: list of time entries extracted through various interfaces to define for which dates data is available
  • data endpoint:
    • object describing how to retrieve or visualize the indicator on the map
    • typically WMS/WMTS/Vector Tiles/COG/rendering services
  • citing: a way to describe how to cite when using the source
  • author: some way of describing all contributing authors / institutes / ....
  • license: which license applies to the data
  • disclaimer: any disclaimer related to the data
  • data sources: maybe list of references to same object being described here
  • accuracy: description of accuracy
  • resolution: the resolution of the data, when applicable
  • doi: for dataset
  • colorlegend: object describing list of colors and range to be used
  • satellite-mission: object describing whether the data comes directly from a satellite mission, including:
    • sensor description
    • mission group
    • ...
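
To make the list above more concrete, here is a minimal sketch of how some of these parameters might map onto core STAC Collection fields. This is only one possible mapping, not an agreed structure: the id, URLs, names and values are hypothetical, and the split between "core field" and "needs an extension or custom property" is an assumption to be checked against the STAC specification.

```python
import json

# Illustrative values only; ids, URLs and names below are hypothetical.
collection = {
    "type": "Collection",
    "stac_version": "1.0.0",
    "id": "example-indicator",
    "title": "Example indicator",                 # name
    "description": "What the indicator shows.",   # description
    "keywords": ["example-theme"],                # theme(s), usable as tags
    "license": "CC-BY-4.0",                       # license
    "providers": [                                # author / institutes
        {"name": "Example Institute", "roles": ["producer"]},
    ],
    "extent": {                                   # extent (area and time covered)
        "spatial": {"bbox": [[-180.0, -90.0, 180.0, 90.0]]},
        "temporal": {"interval": [["2020-01-01T00:00:00Z", None]]},
    },
    "assets": {
        "thumbnail": {                            # thumbnail
            "href": "https://example.com/thumbnail.png",
            "type": "image/png",
            "roles": ["thumbnail"],
        },
        "preview": {                              # preview
            "href": "https://example.com/preview.png",
            "type": "image/png",
            "roles": ["overview"],
        },
    },
    "links": [],
}

# Parameters from the list that appear to have no core Collection field and
# would probably need an extension or a custom property (assumption).
NEEDS_EXTENSION_OR_CUSTOM = ["story", "disclaimer", "colorlegend"]

# The structure is plain JSON, so it can be serialized directly.
serialized = json.dumps(collection, indent=2)
```

The remaining parameters (country, city, siteName, accuracy, resolution, satellite-mission, ...) are left out on purpose, since where they belong is exactly what this issue is trying to settle.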
@santilland
Collaborator Author

santilland commented Feb 3, 2023

As the second entry, we could try describing which levels of STAC we use and where each piece of information should be contained:

@santilland
Collaborator Author

santilland commented Feb 3, 2023

List of references

Catalogs:

Catalog visualization:

Additional data descriptors:

Related issues:

Technologies:

Others:
Possible options for describing tabular data, e.g. geodb:
use summaries to describe the table
and/or use the table STAC extension

OSC:

  • Consider use of OSC extension especially for themes

@j08lue

j08lue commented Feb 3, 2023

Great! Can we actually turn this into a table, where we can have some columns like

  1. Name
  2. Description
  3. Exists in VEDA STAC
  4. Exists in Planetary Computer STAC
  5. Requires STAC extension (name/new)
  6. Value for EO Dashboard project - on a scale from 1 (low) to 10 (high value)
  7. Value for VEDA project - on a scale from 1 (low) to 10 (high value)

After some iteration, we can put these properties into a 2x2 matrix to identify the ones that are easy to implement and high-value.
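
As a rough sketch of that 2x2 matrix: the snippet below sorts properties into quadrants by effort and value. Two things are assumptions here, not part of the proposal: "requires a new STAC extension" is used as a stand-in for implementation effort, and the value threshold of 5 is arbitrary. The property names and scores are placeholders, not real assessments.

```python
from dataclasses import dataclass


@dataclass
class Property:
    name: str
    needs_new_extension: bool  # stand-in for implementation effort (assumption)
    value: int                 # 1 (low) to 10 (high value)


def quadrant(prop: Property, value_threshold: int = 5) -> str:
    """Return the 2x2 quadrant label for a property."""
    effort = "hard" if prop.needs_new_extension else "easy"
    worth = "high-value" if prop.value > value_threshold else "low-value"
    return f"{effort} / {worth}"


# Placeholder rows for illustration only.
properties = [
    Property("property-a", needs_new_extension=False, value=8),
    Property("property-b", needs_new_extension=True, value=3),
]

# Properties in the "easy / high-value" quadrant are the ones to start with.
first_candidates = [
    p.name for p in properties if quadrant(p) == "easy / high-value"
]
```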

@j08lue

j08lue commented Feb 3, 2023

Scientific Citation, for example, already has a stable STAC extension: https://github.com/stac-extensions/scientific - easy to implement
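
A minimal sketch of what using that extension on a collection could look like. The `sci:doi` and `sci:citation` fields and the `cite-as` link relation are taken from the scientific extension as I understand it; the schema version, the DOI and the citation text are placeholders and should be checked against the extension README.

```python
# Schema URL of the scientific extension; the version is an assumption.
SCIENTIFIC_EXT = "https://stac-extensions.github.io/scientific/v1.0.0/schema.json"

# Hypothetical collection; the DOI and citation text are placeholders.
collection = {
    "type": "Collection",
    "id": "example-indicator",
    "stac_extensions": [SCIENTIFIC_EXT],
    "sci:doi": "10.5555/12345678",
    "sci:citation": "Example Institute (2023). Example dataset.",
    "links": [],
}


def add_cite_as_link(coll: dict) -> dict:
    """Add a 'cite-as' link that resolves the collection's DOI."""
    coll["links"].append(
        {"rel": "cite-as", "href": f"https://doi.org/{coll['sci:doi']}"}
    )
    return coll


add_cite_as_link(collection)
```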

@santilland
Collaborator Author

Hello @j08lue, as the table is getting a bit more complex than what I think makes sense to manage in GitHub comments, I created the following document:
https://docs.google.com/spreadsheets/d/1Rzygo7mt-d5Sb1OtvjTh270sKg-SgFkH-uI8GI2l2-M/edit?usp=sharing

Please feel free to request access permissions if you would like to provide input there. I think from our side we have a good starting point that we can reference for our initial implementation tests for catalog creation. I imagine that while working on the creation of the catalog we will find out which things work and which don't.
For now the only thing I am missing is a way to describe applied colormaps; I have not seen any extension for that.

@santilland
Collaborator Author

Hello @j08lue, I would like to give an update on some thoughts we are also going to discuss internally. I think the more important discussion (apart from what metadata we save) is the hierarchy we want to use so that the dashboards can be nicely populated.
I have tried to brainstorm the structure a little here and would love to hear your feedback on how this hierarchy would fit the concepts you use for your dashboard:

[Image: diagram of the proposed hierarchy]

@santilland
Collaborator Author

Initial implementation of the generation logic has been started as part of the new repository eodash-catalog.
Decisions on which properties to include and on any necessary hierarchy will be made iteratively while trying to integrate all necessary information for the available collections into the STAC catalog.

@santilland santilland transferred this issue from eurodatacube/eodash Jul 11, 2023
@j08lue

j08lue commented Jul 11, 2023

@santilland, can you briefly describe what the purpose of the eodash-catalog is? Is this some kind of translating / intake service that makes various (STAC) metadata providers compatible with the EO Dashboard?

Will eodash-catalog implement the "Indicator" level that you proposed previously?

We have been discussing different approaches to federated STAC search and connection to dashboards / visualization frontends and are planning on having a dedicated working group on this in the next few months. I hope there will be traction on this subject from the VEDA side then.

While I am sure that a "glue service" or auxiliary data injector will continue to be needed to harmonize various sources, I still have the dream of keeping that as slim as possible and moving more dashboard / visualization related information into STAC instead. Seems like this has been discussed before in the STAC community (re rendering hints, etc)... Would you be interested in us pursuing that direction (more dashboard-friendly STAC), too?

@santilland
Collaborator Author

@j08lue the idea is to move the description of (data) content away from the eodash client repository and to allow using the dashboard just by pointing it to a supported STAC catalog. This way the instantiation of the eodash client is greatly simplified and the data provided by the different instances can more easily be integrated into other clients.

Additionally it simplifies how user/expert contributed data can be integrated into the eodash client instances.
Initially it is a translating/intake service that also lets us include external catalogs that do not have the required information, but the hope is that we can integrate other sources directly as much as possible, without the need for translation.
The step is not only for translation but also for configuring which data should actually be included. For example, the generated catalog can point to external collections, but we still need a dedicated catalog because we usually do not want to integrate entire external catalogs, only specific collections, or we want to subset them in very specific ways.
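
To illustrate the configuration part, here is a minimal sketch of selecting and subsetting collections from an external catalog. It is not the actual eodash-catalog logic or configuration format: the `select_collections` helper, the shape of the `selection` mapping and all ids and URLs are hypothetical.

```python
def select_collections(available, selection):
    """Return only the selected collections, with an optional bbox override."""
    selected = []
    for coll in available:
        options = selection.get(coll["id"])
        if options is None:
            continue  # not configured: leave it out of the dashboard catalog
        entry = {"id": coll["id"], "href": coll["href"]}
        # Subset to the configured bbox, or keep the collection's own extent.
        entry["bbox"] = options.get("bbox", coll["bbox"])
        selected.append(entry)
    return selected


# Hypothetical collections offered by an external catalog.
external = [
    {"id": "collection-a", "href": "https://example.com/a.json",
     "bbox": [-180, -90, 180, 90]},
    {"id": "collection-b", "href": "https://example.com/b.json",
     "bbox": [-180, -90, 180, 90]},
]

# Only collection-a is wanted, restricted to a smaller area.
selection = {"collection-a": {"bbox": [5, 45, 15, 55]}}

result = select_collections(external, selection)
```

The dedicated catalog would then only link to the entries in `result` instead of mirroring the whole external catalog.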

For now our approach to define visualizations is to use the web-map-links extension.
We did some tests with data we include in our dashboard from the VEDA endpoint, for example the yearly NO2: by adding the web-map-link to the items, they can already be visualized by the standard STAC Browser:
https://radiantearth.github.io/stac-browser/#/external/eurodatacube.github.io/eodash-catalog/trilateral/catalog.json
You can navigate to individual items and get a visualization on the map.
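
For reference, a minimal sketch of adding such a link to an item. The `wms` relation and the `wms:layers` field follow the web-map-links extension as I understand it; the schema version, the `add_wms_link` helper, the endpoint URL and the layer name are assumptions for illustration and do not describe the actual VEDA setup.

```python
# Schema URL of the web-map-links extension; the version is an assumption.
WEB_MAP_LINKS_EXT = (
    "https://stac-extensions.github.io/web-map-links/v1.1.0/schema.json"
)


def add_wms_link(item: dict, href: str, layers: list) -> dict:
    """Attach a WMS web-map link to a STAC item (hypothetical helper)."""
    extensions = item.setdefault("stac_extensions", [])
    if WEB_MAP_LINKS_EXT not in extensions:
        extensions.append(WEB_MAP_LINKS_EXT)
    item.setdefault("links", []).append(
        {
            "rel": "wms",
            "href": href,
            "type": "image/png",
            "wms:layers": layers,
        }
    )
    return item


# Hypothetical item and endpoint.
item = {"type": "Feature", "id": "example-item", "links": []}
add_wms_link(item, "https://example.com/wms", ["example-layer"])
```

A client that understands the extension can then look for links with `rel` set to `wms` (or `wmts`, `xyz`, ...) and render them on the map.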

We are very much interested in pursuing that direction with you, which is why I am also trying to describe our process here :)

@santilland
Collaborator Author

As for the indicator level, this is something that is still under discussion. I see three approaches:

  1. It is a special configuration done on the client (for example, describing which collections should be grouped into indicators). We don't necessarily want this in the client configuration.
  2. It is a special type of collection. This makes things a bit complicated, as in STAC we expect the same "granularity" when navigating a level: if some special collections (indicators) reference other collections, they do not have the same granularity as "plain" collections that point to items.
  3. We add a new "level" where each indicator normally points to one collection, but more complex indicators reference multiple collections. In this case the catalog would point to indicators and not directly to collections.

This still leaves some headaches about which information should exist on which level. Does an indicator describe the data collections it references? Or do you need to crawl the referenced collections to gather the data that describes the indicator? And so on.
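
The third option above could look roughly like the sketch below, where each indicator is modelled as a STAC Catalog sitting between the root catalog and the collections. Modelling the indicator as a plain Catalog is an assumption made only for this illustration, nothing has been decided; the `make_catalog` helper and all ids and paths are hypothetical.

```python
def make_catalog(catalog_id, description, child_hrefs):
    """Build a minimal STAC Catalog that links to its children."""
    return {
        "type": "Catalog",
        "stac_version": "1.0.0",
        "id": catalog_id,
        "description": description,
        "links": [{"rel": "child", "href": href} for href in child_hrefs],
    }


# A simple indicator points to exactly one collection ...
simple_indicator = make_catalog(
    "indicator-simple",
    "Indicator backed by a single collection",
    ["./collection-a/collection.json"],
)

# ... while a more complex indicator references several collections.
complex_indicator = make_catalog(
    "indicator-complex",
    "Indicator combining two collections",
    ["./collection-b/collection.json", "./collection-c/collection.json"],
)

# The root catalog points to indicators, never directly to collections.
root = make_catalog(
    "root",
    "Dashboard catalog",
    ["./indicator-simple/catalog.json", "./indicator-complex/catalog.json"],
)
```

This keeps the granularity uniform on every level (root → indicators → collections → items), but it does not answer the question of which level carries which metadata.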

Projects
Status: In Progress

2 participants