Add compressed UN JSON file #44

Draft: ndrezn wants to merge 1 commit into master

ndrezn commented Nov 28, 2024

This adds in the GeoJSON file from https://geoportal.un.org/arcgis/apps/sites/#/geohub/datasets/d7caaff3ef4b4f7c82689b7c4694ad92/about, provided by the United Nations.

To generate this file I have manually:

  • Downloaded the GeoJSON
  • Compressed to 20% using MapShaper

I think the right way to do this would be to add a new script to bin/ that downloads the data from the UN directly and calls the Mapshaper API to generate the built files.
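A minimal sketch of what the simplification half of such a script could look like, assuming the GeoJSON has already been downloaded to a local path and that the `mapshaper` CLI is installed. It is written in Python purely for illustration (a real script in bin/ would more likely be Node), the function names are hypothetical, and treating "compressed to 20%" as `-simplify 20%` is an assumption:

```python
import shutil
import subprocess


def build_mapshaper_command(src, dst, keep_percent=20):
    """Return the mapshaper CLI arguments that simplify `src` into `dst`."""
    if not 0 < keep_percent <= 100:
        raise ValueError("keep_percent must be in (0, 100]")
    return [
        "mapshaper", src,
        "-simplify", f"{keep_percent}%",  # assumed equivalent of "compressed to 20%"
        "-o", dst, "format=geojson",
    ]


def simplify(src, dst, keep_percent=20):
    """Run mapshaper if it is installed; raise a clear error otherwise."""
    if shutil.which("mapshaper") is None:
        raise RuntimeError("mapshaper not found on PATH")
    subprocess.run(build_mapshaper_command(src, dst, keep_percent), check=True)
```

Keeping the command construction separate from the subprocess call makes it possible to check the arguments without mapshaper installed.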

However, our real goal is just to test whether this file works with Plotly.js in the first place, so I'm making a placeholder PR with the right dataset and reference data.

Though, after digging into this project a bit, I'm worried that this is out of scope. All of the GeoJSON files in this repository come from Natural Earth, which we're trying to move away from in Plotly.js:

  • Would it make more sense to structure this project as a complete migration and regenerate all relevant files using the UN dataset?
    • Should we leave sane-topojson alone and make a new package..? Or even bundle the UN files directly into Plotly.js?
  • I also notice that world_110m and world_50m both contain borders for US states and Canadian provinces. We might need to add that data to the UN GeoJSON?
  • Do we need to also change the format/keys in the UN data to match what Plotly expects? Probably... I believe iso3cd in the UN data corresponds to id in the Natural Earth data. Most other metadata I think we can throw out.
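To make the last question concrete, here is a rough sketch of the key remapping, assuming `iso3cd` lives in each feature's `properties` and that the target is a top-level `id` on the feature. Both placements are assumptions that should be checked against what the existing sane-topojson outputs actually contain, and `to_plotly_feature` is a hypothetical name:

```python
def to_plotly_feature(feature):
    """Return a new GeoJSON feature whose `id` is the UN `iso3cd` code.

    All other UN metadata is dropped, on the assumption that it is unused.
    """
    props = feature.get("properties") or {}
    return {
        "type": "Feature",
        "id": props.get("iso3cd"),  # assumed to match Natural Earth's `id`
        "properties": {},
        "geometry": feature.get("geometry"),
    }


def to_plotly_collection(collection):
    """Apply the remapping to every feature of a FeatureCollection."""
    features = [to_plotly_feature(f) for f in collection.get("features", [])]
    return {"type": "FeatureCollection", "features": features}
```

Features without an `iso3cd` end up with `id` set to `None`, so they would need to be filtered out or handled separately.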

ndrezn marked this pull request as draft November 28, 2024 16:18

ndrezn (Author) commented Nov 28, 2024

I am going to treat this as a draft PR for now because there are a few open questions. But at least the dataset we want to start with is included here, so we can verify compatibility with Plotly.js.

etpinard (Owner) commented

Hi @ndrezn - thanks for taking this on!

  • Do we need to also change the format/keys in the UN data to match what Plotly expects? Probably... I believe iso3cd in the UN data corresponds to id in the Natural Earth data

The goal of sane-topojson was to generate the most minimal topojson output files for the plotly.js geo subplot type. So yes, every piece of metadata in the sane-topojson outputs should currently be used to some degree by the plotly.js geo subplot type.

  • I also notice that world_110m and world_50m both contain borders for US states and Canadian provinces. We might need to add that data to the UN GeoJSON?

There's a plotly.js geo subplot attribute that toggles subdivisions. Dropping the US state and Canadian province borders would lead to breaking changes in some plotly.js graphs.

  • Should we leave sane-topojson alone and make a new package..? Or even bundle the UN files directly into Plotly.js?

That will be your call 😄

maxmalynowsky commented Jan 17, 2025

Hey there, I work at the Centre for Humanitarian Data, and I can help provide some guidance on utilizing this boundary file. There are some caveats with the raw data you linked to above.

  1. This file contains many overlapping geometries that need to be removed, specifically the broader geographic regions and the Great Lakes. These can easily be identified as all rows with iso3cd == NULL.
  2. There is an overlapping geometry for Hawaii that needs to be removed, identified with globalid == {8B42E894-6AF5-4236-B04D-8F634A159724}
  3. There are many topological issues that need cleaning, which can help optimize the MapShaper simplification. They can be resolved with QGIS's v.clean tool, using a v.in.ogr snap tolerance of 0.00001. Note that this will generate some geometry collections containing both lines and polygons; the lines need to be removed.

The end result should give you this file here: UN_Geodata_simplified_cleaned.json
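The filtering parts of the steps above can be sketched roughly as follows. This is an illustration only, not the procedure used to produce the linked file: it assumes the property keys are spelled `iso3cd` and `globalid` exactly as written above, that NULL appears as JSON `null`, and it does not attempt the v.clean topology repair, which still has to happen in QGIS. The function names are hypothetical.

```python
HAWAII_OVERLAP_GLOBALID = "{8B42E894-6AF5-4236-B04D-8F634A159724}"
POLYGON_TYPES = {"Polygon", "MultiPolygon"}


def drop_lines(geometry):
    """Strip non-polygon members from a GeometryCollection.

    Other geometry types (and missing geometries) are returned unchanged.
    """
    if not geometry or geometry.get("type") != "GeometryCollection":
        return geometry
    polygons = [g for g in geometry.get("geometries", [])
                if g.get("type") in POLYGON_TYPES]
    if len(polygons) == 1:
        return polygons[0]  # unwrap a collection left with a single polygon
    return {"type": "GeometryCollection", "geometries": polygons}


def clean_features(collection):
    """Return a new FeatureCollection with the overlapping features removed."""
    kept = []
    for feature in collection.get("features", []):
        props = feature.get("properties") or {}
        if props.get("iso3cd") is None:
            continue  # step 1: broader regions and the Great Lakes
        if props.get("globalid") == HAWAII_OVERLAP_GLOBALID:
            continue  # step 2: overlapping Hawaii geometry
        # step 3 (partial): remove line members left behind by v.clean
        kept.append({**feature, "geometry": drop_lines(feature.get("geometry"))})
    return {**collection, "features": kept}
```

The input collection is not modified, so the raw download can be kept alongside the cleaned output for comparison.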
