Basemapper allow for in memory BytesIO GeoJSON for boundary param #204

Closed
spwoodcock opened this issue Oct 13, 2023 · 21 comments · Fixed by #261 · May be fixed by Munanom/osm-fieldwork#1 or Gunjan-Goyal/osm-fieldwork#1
Labels
enhancement (New feature or request), outreachy (Project tasks for Outreachy internship - May to August 2023), Priority: Nice to have

Comments

@spwoodcock
Member

spwoodcock commented Oct 13, 2023

Outreachy Task

MBTile Basemaps

  • MBTiles is a format for storing map tiles offline in a single file.
  • The basemapper.py module in this repo takes a project area and generates an MBTiles file from various imagery sources.
  • These sources can be OpenStreetMap, ESRI, Google, a custom Tile Map Server, etc.
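
Because an MBTiles file is a SQLite database (the MBTiles specification defines a `metadata` table and a `tiles` table), a generated basemap can be sanity-checked with Python's built-in sqlite3 module. The following is only a minimal sketch; the helper name `summarise_mbtiles` is hypothetical and not part of osm-fieldwork.

```python
import sqlite3


def summarise_mbtiles(conn: sqlite3.Connection) -> dict:
    """Return the metadata and the number of tiles per zoom level.

    Assumes the database follows the MBTiles layout:
    metadata(name, value) and tiles(zoom_level, tile_column, tile_row, tile_data).
    """
    # Each metadata row is a (name, value) pair, so dict() can consume it directly.
    metadata = dict(conn.execute("SELECT name, value FROM metadata"))
    tiles_per_zoom = dict(
        conn.execute(
            "SELECT zoom_level, COUNT(*) FROM tiles "
            "GROUP BY zoom_level ORDER BY zoom_level"
        )
    )
    return {"metadata": metadata, "tiles_per_zoom": tiles_per_zoom}
```

Usage would look something like `summarise_mbtiles(sqlite3.connect("outreachy.mbtiles"))`, where the file name is just an example.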

The Current Implementation

  • One of the parameters for basemapper.py is bbox.
  • It can be passed as a string, i.e. coordinates in the format x_min, y_min, x_max, y_max:
-4.730494,41.650541,-4.725634,41.652874
  • Or as a GeoJSON file on disk:
{
"type": "FeatureCollection",
"name": "New_Kru_Town",
"crs": { "type": "name", "properties": { "name": "urn:ogc:def:crs:OGC:1.3:CRS84" } },
"features": [{
  "type": "Feature",
  "properties": { "id": 0, "country": "Liberia", "area": 1.7e-05, "perimeter": 0.016581 },
  "geometry": { "type": "Polygon", "coordinates": [ [ [ -10.7938424154191, 6.36566446544188 ], [ -10.795144938864549, 6.369300019093447 ], [ -10.79320532649367, 6.370264783563053 ], [ -10.791896296184481, 6.370164711508116 ], [ -10.79146494532514, 6.369379587575136 ], [ -10.789732430818949, 6.366126169489173 ], [ -10.78956466274909, 6.365788466299136 ], [ -10.78991618820425, 6.365746651092064 ], [ -10.790463189266021, 6.365722411326678 ], [ -10.7938424154191, 6.36566446544188 ] ] ] } }
]}
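
To make the relationship between the two input formats concrete, here is a rough sketch of how either one can be reduced to the same (x_min, y_min, x_max, y_max) tuple. How basemapper.py actually does this internally is not shown in this issue, so both function names are hypothetical and the GeoJSON handling is deliberately limited to Polygon features.

```python
import json


def bbox_from_string(bbox: str) -> tuple:
    """Parse 'x_min,y_min,x_max,y_max' into a tuple of four floats."""
    parts = [float(value) for value in bbox.split(",")]
    if len(parts) != 4:
        raise ValueError(f"Expected 4 comma-separated values, got {len(parts)}")
    return tuple(parts)


def bbox_from_geojson(geojson_text: str) -> tuple:
    """Compute the extent of all Polygon features in a FeatureCollection."""
    collection = json.loads(geojson_text)
    xs, ys = [], []
    for feature in collection["features"]:
        geometry = feature["geometry"]
        if geometry["type"] != "Polygon":
            continue  # this sketch ignores other geometry types
        for ring in geometry["coordinates"]:
            for x, y, *_ in ring:  # tolerate an optional third (elevation) value
                xs.append(x)
                ys.append(y)
    if not xs:
        raise ValueError("No Polygon coordinates found")
    return (min(xs), min(ys), max(xs), max(ys))
```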

The Issue

  • The GeoJSON file must exist on your filesystem.
  • In some cases we wish to pass the GeoJSON file to basemapper.py in memory (i.e. in Python this is a bytes object).
  • It would be nice to add support for in memory GeoJSON passing (via bytes wrapped in a BytesIO wrapper).

The Solution

  • In memory GeoJSON should be the default format for bbox passing, with additional support for bbox string format.
  • The support for parsing GeoJSON files on disk should ideally be removed from the BaseMapper class and placed inside the main() function that is only used when running basemapper via the command line. The file should be read and converted to BytesIO object, before passing through to the create_basemap_file function.
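
One possible shape for the command-line side of this is sketched below: a small helper that main() could call to turn the -b arguments into either a bbox string or a BytesIO object. This is an assumption about how the arguments arrive (a list holding either one file path or four numbers), and `boundary_from_cli_args` is a made-up name, not an existing function in basemapper.py.

```python
from io import BytesIO
from pathlib import Path
from typing import List, Union


def boundary_from_cli_args(boundary_args: List[str]) -> Union[str, BytesIO]:
    """Convert the values given to -b into a bbox string or in-memory GeoJSON."""
    if len(boundary_args) == 1 and Path(boundary_args[0]).is_file():
        # Disk access happens here, in the CLI layer only:
        # read the file as bytes and wrap it in a BytesIO object.
        return BytesIO(Path(boundary_args[0]).read_bytes())
    if len(boundary_args) == 4:
        # Four numbers: rebuild the "x_min,y_min,x_max,y_max" string form.
        return ",".join(boundary_args)
    raise ValueError("Boundary must be a GeoJSON file path or 4 bbox values")
```

The result could then be passed straight through as the boundary argument of create_basemap_file, so that the BaseMapper class never needs to touch the filesystem for its boundary.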

Instructions

Set up your repo

  • Fork the osm-fieldwork repository: https://github.com/hotosm/osm-fieldwork/fork

  • Clone the forked repository to your filesystem and create a new branch:

    git clone https://github.com/yourusername/osm-fieldwork.git
    cd osm-fieldwork
    git checkout -b feat/basemapper-bytesio-geojson
  • Solve the problem by writing and committing new code.

  • Create a pull request from your new branch to the main branch within your forked repo (please do not create a pull request against the hotosm/osm-fieldwork repo).

Prerequisites

Install PDM:

pip install pdm

Install the dependencies for osm-fieldwork:

# from within your cloned osm-fieldwork repo
pdm install -G test

Testing your code

To save time and avoid overloading networks, I would recommend building a minimal basemap during testing by using the command specified below.

There is a command line interface to basemapper that works like:

pdm run python osm_fieldwork/basemapper.py -b -4.730494 41.650541 -4.725634 41.652874 -z 12-15 -s esri

# Or via geojson file
pdm run python osm_fieldwork/basemapper.py -b yourbbox.geojson -z 12-15 -s esri

As you can see the boundary is passed in using flag -b and can be a bounding box or GeoJSON file.

The goal of this task is to allow for passing the GeoJSON file in-memory when using basemapper via a script.
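
To check that a test boundary really is minimal, the number of tiles it covers can be estimated before downloading anything. The sketch below uses the standard OpenStreetMap "slippy map" tile formulas; whether basemapper.py computes its tile list in exactly this way is an assumption, and both function names are hypothetical.

```python
import math


def deg2tile(lon: float, lat: float, zoom: int) -> tuple:
    """Return the (x, y) index of the slippy-map tile containing a point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return (x, y)


def count_tiles(bbox: tuple, zooms) -> int:
    """Count the tiles covering bbox=(x_min, y_min, x_max, y_max) over zooms."""
    x_min, y_min, x_max, y_max = bbox
    total = 0
    for zoom in zooms:
        # Tile y grows southwards, so the minimum latitude gives the largest y.
        left, bottom = deg2tile(x_min, y_min, zoom)
        right, top = deg2tile(x_max, y_max, zoom)
        total += (right - left + 1) * (bottom - top + 1)
    return total
```

For the example bounding box and zooms 12-15, `count_tiles((-4.730494, 41.650541, -4.725634, 41.652874), range(12, 16))` gives a single tile per zoom level, which keeps test runs light on the network.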

Using via Python script

Create a file outreachy.py in the root of your code repository:

from osm_fieldwork.basemapper import create_basemap_file

create_basemap_file(
    verbose=True,
    boundary="-4.730494,41.650541,-4.725634,41.652874",
    outfile="outreachy.mbtiles",
    zooms="12-15",
    source="esri",
)

Run the file to generate a .mbtiles archive:

pdm run outreachy.py

Using via PyTest

This method is slightly more complex, requiring Docker to be installed.
You may skip this step if necessary.

Open the existing tests/test_basemap.py file to see the current test.

Run the test from the code repository root:

docker compose run --rm fieldwork pytest tests/test_basemap.py

A definite bonus would be to write tests for your new code as another function in this file.
The more comprehensive the tests, the better.
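
As a possible starting point for such a test, the sketch below builds the boundary entirely in memory and checks that an output file appears. It assumes the BytesIO support described in this issue has already been implemented, that create_basemap_file writes to outfile as in the script example above, and that the tile source is reachable over the network. The names `make_boundary` and `test_create_with_bytesio` are made up for illustration.

```python
import json
from io import BytesIO

# A small rectangle matching the example bbox used in this issue.
BOUNDARY_GEOJSON = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[
                    [-4.730494, 41.650541],
                    [-4.725634, 41.650541],
                    [-4.725634, 41.652874],
                    [-4.730494, 41.652874],
                    [-4.730494, 41.650541],
                ]],
            },
        }
    ],
}


def make_boundary() -> BytesIO:
    """Serialise the GeoJSON dict to bytes and wrap it in a BytesIO object."""
    return BytesIO(json.dumps(BOUNDARY_GEOJSON).encode("utf-8"))


def test_create_with_bytesio(tmp_path):
    """Hypothetical test: an in-memory boundary should produce an mbtiles file."""
    import pytest

    # Skip cleanly when osm-fieldwork is not installed in the environment.
    basemapper = pytest.importorskip("osm_fieldwork.basemapper")

    outfile = tmp_path / "outreachy.mbtiles"
    basemapper.create_basemap_file(
        verbose=True,
        boundary=make_boundary(),
        outfile=str(outfile),
        zooms="12-15",
        source="esri",
    )
    assert outfile.exists()
```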

Writing your solution

You will need to modify create_basemap_file in basemapper.py so that the boundary parameter accepts a BytesIO object, in addition to a bounding box string or GeoJSON file.

Hint: you can determine the type of input for a function using the built-in Python function isinstance(obj, type).
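
To illustrate the hint only (this is not the expected solution), a type check on the boundary argument could look roughly like the following. The function `load_boundary` and its return shape are invented for this sketch and do not exist in basemapper.py.

```python
import json
from io import BytesIO
from typing import Union


def load_boundary(boundary: Union[str, BytesIO]) -> dict:
    """Normalise a boundary argument based on its Python type."""
    if isinstance(boundary, BytesIO):
        # In-memory GeoJSON: rewind in case the buffer was already read.
        boundary.seek(0)
        return {"kind": "geojson", "data": json.load(boundary)}
    if isinstance(boundary, str):
        # Bounding box string in the form "x_min,y_min,x_max,y_max".
        values = [float(value) for value in boundary.split(",")]
        if len(values) != 4:
            raise ValueError("A bbox string needs exactly 4 values")
        return {"kind": "bbox", "bbox": values}
    raise TypeError(f"Unsupported boundary type: {type(boundary).__name__}")
```

Raising a TypeError for anything else keeps unexpected inputs from failing later with a less obvious error.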

In the end, we want to be able to use the function like this:

from io import BytesIO
from osm_fieldwork.basemapper import create_basemap_file

with open("/path/to/file.geojson", "rb") as geojson_file:
    boundary = geojson_file.read()  # read as a `bytes` object.
    boundary_bytesio = BytesIO(boundary)   # add to a BytesIO wrapper

create_basemap_file(
    verbose=True,
    boundary=boundary_bytesio,
    outfile="outreachy.mbtiles",
    zooms="12-15",
    source="esri",
)

Important notes on submission

Your code should not be submitted through email, Slack, a PR to this repository, or any other means.

  • Please create a pull request on your own forked repository.
    • The pull request should be from the created branch on your fork to the main branch on your fork.
  • Then send me a link to your complete pull request via private message on Slack.
  • Please keep any discussions or questions to the outreachy channel on Slack, so that everyone can benefit.
  • I will not respond to private messages, to keep things fair amongst applicants.
  • It would be nice to run the formatter and linter over your generated code to check for issues:
pip install pre-commit
pre-commit run --files osm_fieldwork/basemapper.py

Note: any usage of natural language models such as ChatGPT, GitHub Copilot, etc. will disqualify you from the internship, as there are potential copyright infringement issues.

FAQ

Issues installing psycopg2

psycopg2 requires libpq-dev to be installed on your system.
In short, solutions for each OS are listed below:

Ubuntu/Debian

sudo apt update
sudo apt install -y libpq-dev gcc
pdm install -G test

MacOS

brew install libpq gcc
pdm install -G test

Windows

(requires a workaround, try either)

pip install pipwin 
pipwin install psycopg2
pip uninstall pipwin
pdm install -G test

or

pdm add https://download.lfd.uci.edu/pythonlibs/archived/psycopg2-2.9.3-cp310-cp310-win_amd64.whl
pdm install -G test
@spwoodcock spwoodcock added the outreachy Project tasks for Outreachy internship - May to August 2023 label Feb 27, 2024
@ohthebrave

Hello, I am an Outreachy applicant. Could you please provide additional details on how I can begin working on this project, particularly setting it up locally on my machine? @spwoodcock

@spwoodcock spwoodcock changed the title from "Basemapper accept in memory GeoJSON for bbox" to "Basemapper allow for in memory BytesIO GeoJSON for bbox param" Mar 5, 2024
@spwoodcock spwoodcock changed the title from "Basemapper allow for in memory BytesIO GeoJSON for bbox param" to "Basemapper allow for in memory BytesIO GeoJSON for boundary param" Mar 5, 2024
@spwoodcock
Member Author

@ohthebrave I updated the description to include a lot more instructions, including how to set up the repo and run the code.

Hope this helps!

@robsavoye
Collaborator

Good idea to use BytesIO for this; I prefer to use memory as much as possible instead of disk files. Of course, we still need the disk file, but it can be produced from what's in memory. Before FMTM, basemapper was only used standalone in a terminal, so the file was fine. But using memory is better when it is part of a backend API.

@valentina-buoro

Thanks for the instructions on setting up the repo and running the code @spwoodcock

@donaldte

donaldte commented Mar 6, 2024

Thank you for providing guidance on configuring the repository and executing the code. @spwoodcock

@donaldte

donaldte commented Mar 7, 2024

Hello mentor, @spwoodcock

I trust this message finds you well. I'm currently contributing to the osm-fieldwork project. Unfortunately, I've encountered an issue with joining the Slack community associated with this project.

As mentioned in the guidelines, I have made a contribution to my fork repository, and you can find the link to my pull request here: donaldte#1

Thank you for your attention to this matter.

Best regards,
Donald Tedom

@amandaguan-ag

For someone who may encounter Install psycopg2 2.9.9 failed during pdm install -G test, this brew install postgresql command helped me out

@valentina-buoro

Hello @spwoodcock , I have sent the link to my PR for this task, looking forward to your feedback. Thank you

@spwoodcock
Member Author

For someone who may encounter Install psycopg2 2.9.9 failed during pdm install -G test, this brew install postgresql command helped me out

The underlying package this installs to fix the issue is libpq, available on MacOS via brew, and most Linux distributions as libpq-dev

@sahana-9314

Hello @spwoodcock,

I was getting confused with the output file while running the code.
I encountered this after I ran the code:

MainThread - main - INFO - Downloading 1 tiles in thread 15308 to C:\Users\Sahana K\Desktop\Outreachy\osm-fieldwork\esritiles
Getting file from: http://services.arcgisonline.com/arcgis/rest/services/World_Imagery/MapServer/tile/15/12206/15953.jpg
[*] 26 kB / 26 kB @ 0 bytes/s [##################] [100%, 0s left]
MainThread - main - INFO - No outfile specified, tile download finished: C:\Users\Sahana K\Desktop\Outreachy\osm-fieldwork\esritiles

The command that I used to run the basemapper.py was:
pdm run python osm_fieldwork/basemapper.py -b -4.730494 41.650541 -4.725634 41.652874 -z 12-15 -s esri

I would really appreciate if you would take some time and help me with this as I am new to open source.

Thank you so much for your time on this matter.

@spwoodcock
Member Author

If you can see the tile images in the esritiles directory, then the script was successful.

You can bundle the images to an mbtiles file, but it's not mandatory. Use the -o flag if you want to do that.

@siva224513

I would like to contribute to this project

@maha-sachin

@spwoodcock, I'm an Outreachy intern interested in contributing to this project. After reviewing the repository, I noticed there aren't any good first issues available. Could you please guide me on where to start my contribution?

@Remi-dee

@spwoodcock, I'm an Outreachy intern interested in contributing to this project. After reviewing the repository, I noticed there aren't any good first issues available. Could you please guide me on where to start my contribution?

Hello, if you're an outreachy applicant, you could work on the task above, as it is a general issue open to all applicants.
All applicants are advised to work on this task and all instructions regarding this task have been provided by the mentor, @spwoodcock in the issue's description;

#204

@medinasheriff

medinasheriff commented Mar 14, 2024

Hello @spwoodcock, I'm an Outreachy applicant interested in contributing to this project. I have done most of the required work, but haven't created a pull request yet. I've been trying to join the slack channel using my gmail, but I keep getting an error message that my email doesn't match the domain. How do I fix this? Or is it ok to submit the task without joining the slack channel?

***Update: I got a new link that worked. Thank you @petya-kangalova

@endurijahnavi

Hey @spwoodcock, I'm an outreachy applicant and I wanted to thank you for your detailed description of the task. It has been incredibly helpful in navigating the project.
On a side note, like a few others, I've also been encountering difficulties joining the Slack community using my Gmail account. I was wondering if you have any suggestions or if there's a workaround for this issue.
Thank you.
Jahnavi

@petya-kangalova
Contributor

@endurijahnavi thank you for the nice feedback. Apologies for the issues with joining Slack- I have now sent an invite request to your email. If any issues, please drop me an email to [email protected]

@TMicha

TMicha commented Mar 28, 2024

I will revert with a PR link shortly

@medinasheriff

@spwoodcock I sent my PR link to your Slack inbox as instructed almost 2 weeks ago, and I haven't received any feedback. Here it is again, just in case: medinasheriff#1

@sahana-9314

Please look into this @spwoodcock
https://github.com/sahana-9314/osm-fieldwork/pull/2

sahana-9314 pushed a commit to sahana-9314/osm-fieldwork that referenced this issue Apr 11, 2024
@sahana-9314

Sorry for that problem before. Can you please check the updated one.
sahana-9314#1
