Deprecate `countries_regions.csv` by the new regions dataset #1081

Marigold · 2023-05-05T07:25:30Z

@pabloarosado did a great job creating a new regions dataset that is structured like any other dataset. It will soon be used by grapher, and we'll finally have a single source of truth for all regions... except for `countries_regions.csv`. That file still lives in ETL and backs numerous helper functions and datasets. It's starting to cause headaches because it's not 100% consistent with the regions dataset.

We should attempt to remove it from ETL if we don't encounter any major obstacles.

(I wasn't sure whether we already have an issue for this)

Potential issues

- We would need to define the `data://garden/regions/2023-01-01/regions` dependency for each step.
- Adding an alias to `regions.yml` would trigger an update of all datasets that depend on it. That's quite wasteful.

A solution to both would be to make the regions dataset an implicit dependency of all steps and ignore its checksum. Any update to `regions.yml` would then have to be followed by a manual trigger of ETL. Alternatively, we could give `regions.yml` an explicit version, e.g. 1.2.3, and increment it whenever we manually update the file. That version would then be part of the checksum, just like the pandas version is.
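To make the versioning idea concrete, here is a minimal sketch of how a step checksum could skip the content checksum of the regions dataset and hash a hand-maintained version string instead. The names `REGIONS_VERSION`, `REGIONS_URI` and `step_checksum` are hypothetical and only illustrate the proposal; this is not how ETL currently computes checksums.

```python
import hashlib

# Hypothetical: bumped by hand whenever regions.yml is edited.
REGIONS_VERSION = "1.2.3"

# URI of the implicit dependency, as written in this issue.
REGIONS_URI = "data://garden/regions/2023-01-01/regions"


def step_checksum(dependency_checksums: dict[str, str],
                  regions_version: str = REGIONS_VERSION) -> str:
    """Combine the checksums of a step's dependencies into one digest.

    The content checksum of the regions dataset is ignored; its manually
    maintained version string is hashed in its place.
    """
    h = hashlib.sha256()
    # Sort URIs so the digest does not depend on dictionary order.
    for uri in sorted(dependency_checksums):
        if uri == REGIONS_URI:
            continue  # skip the content checksum of the implicit dependency
        h.update(uri.encode())
        h.update(dependency_checksums[uri].encode())
    # The explicit version takes part in the digest instead.
    h.update(f"regions_version={regions_version}".encode())
    return h.hexdigest()
```

With this scheme, adding an alias to `regions.yml` leaves every step checksum unchanged until someone bumps the version, at which point all steps become dirty at once.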


github-actions bot added the needs triage label May 5, 2023

Marigold linked a pull request May 15, 2023 that will close this issue

🔨 Deprecate countries_regions.csv #1111

Merged

Marigold mentioned this issue May 17, 2023

🔨 Deprecate countries_regions.csv #1111

Merged

Marigold closed this as completed in #1111 May 22, 2023
