-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GTC-3081: Add political/id-lookup endpoint #616
Changes from 1 commit
b94f3fa
505cc16
0514d42
9d2ae13
44a253f
9c2e0ca
899e772
d633596
7f5e9f3
2f2facd
15de81a
40f7772
2cee550
e07c4f4
bb69f18
1b95ca2
79ae7c3
09e628e
2934fd6
e09cf01
2d979a2
0c6d541
ed5f2cd
34c41f8
1952fc0
c6384fd
4aef63c
0f80b9e
b23bf7f
275ff6e
53ddba5
3bc92f5
fbacd70
9c5fc87
849d68a
4668790
68f6590
d35bb02
0bdb3e5
a9cf2df
83aa0a0
0effc5b
84bf869
af3c244
498ecc6
d9e0cda
55e1a2f
ffe9b6b
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,12 @@ | ||
import re | ||
from typing import Optional, Any, Dict, List | ||
|
||
from fastapi import APIRouter, HTTPException, Query | ||
|
||
from app.crud.versions import get_version, get_version_names | ||
from app.errors import RecordNotFoundError | ||
from app.models.pydantic.responses import Response | ||
from app.routes import VERSION_REGEX | ||
from app.routes.datasets.queries import _query_dataset_json | ||
|
||
|
||
|
@@ -20,7 +25,7 @@ async def geoencode( | |
description="The source of administrative boundaries to use." | ||
), | ||
admin_version: str = Query( | ||
None, | ||
..., | ||
description="Version of the administrative boundaries dataset to use.", | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Similar, is there a way to document the choices available? I guess this may get more confusing if ever have multiple providers There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Wild idea: should we consolidate admin version and admin dataset to one field, and have the options be like: "GADM 3.6", "GADM 4.1", "geoBoundaries 1.0", "middleEarth 3.2". Then it'll be clear what you're getting from a set of options. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Combining the provider and version has merit ( However, we'd be encoding a string with special information (i.e., provider That's my two cents. |
||
), | ||
country: str = Query( | ||
|
@@ -46,13 +51,15 @@ async def geoencode( | |
dataset = admin_source_to_dataset[admin_source.upper()] | ||
except KeyError: | ||
raise HTTPException( | ||
status_code=404, | ||
status_code=400, | ||
detail=f"Invalid admin boundary source. Valid sources: {admin_source_to_dataset.keys()}" | ||
) | ||
|
||
version_str = "v" + str(admin_version).lstrip("v") | ||
|
||
sql: str = await admin_boundary_lookup_sql( | ||
await version_is_valid(dataset, version_str) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Like mentioned about the documentation above, should it be well known in advance which versions we support for providers, rather than just trying and throwing back an error? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I fundamentally agree that your suggestion is a good one, but it turns out to be difficult to do in practice. Especially considering the limited use of this endpoint and the fact that we will be omitting it from the docs. |
||
|
||
sql: str = _admin_boundary_lookup_sql( | ||
admin_source, | ||
country, | ||
region, | ||
|
@@ -63,14 +70,16 @@ async def geoencode( | |
dataset, version_str, sql, None | ||
) | ||
|
||
return { | ||
"adminSource": admin_source, | ||
"adminVersion": admin_version, | ||
"matches": json_data | ||
} | ||
return Response( | ||
data={ | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This might be cleaner as a pydantic model instead of a dict There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Made a model, and a helper function to construct the response in hopefully a clear way. |
||
"adminSource": admin_source, | ||
"adminVersion": admin_version, | ||
"matches": json_data | ||
} | ||
) | ||
|
||
|
||
async def admin_boundary_lookup_sql( | ||
def _admin_boundary_lookup_sql( | ||
dataset: str, | ||
country_name: str, | ||
region_name: Optional[str], | ||
|
@@ -81,11 +90,40 @@ async def admin_boundary_lookup_sql( | |
""" | ||
sql = ( | ||
f"SELECT gid_0, gid_1, gid_2, country, name_1, name_2 FROM {dataset}" | ||
f" AND WHERE country='{country_name}'" | ||
f" WHERE country='{country_name}'" | ||
) | ||
if region_name is not None: | ||
sql += f" AND WHERE region='{region_name}'" | ||
if subregion_name is not None: | ||
sql += f" AND WHERE subregion='{subregion_name}'" | ||
|
||
return sql | ||
|
||
|
||
async def version_is_valid( | ||
dataset: str, | ||
version: str, | ||
) -> None: | ||
""" | ||
|
||
""" | ||
if re.match(VERSION_REGEX, version) is None: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. per above, maybe just have pre-validated versions rather than checking the version There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Now specified on a per-env basis! |
||
raise HTTPException( | ||
status_code=400, | ||
detail=( | ||
"Invalid version name. Version names begin with a 'v' and " | ||
"consist of one to three integers separated by periods. " | ||
"eg. 'v1', 'v7.1', 'v4.1.0', 'v20240801'" | ||
) | ||
) | ||
|
||
try: | ||
_ = await get_version(dataset, version) | ||
except RecordNotFoundError: | ||
raise HTTPException( | ||
status_code=400, | ||
detail=( | ||
"Version not found. Existing versions for this dataset " | ||
f"include {await get_version_names(dataset)}" | ||
) | ||
) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,28 +1,104 @@ | ||
# import pytest | ||
# from httpx import AsyncClient | ||
# | ||
# | ||
# @pytest.mark.asyncio | ||
# async def test_geoencoder_no_version(async_client: AsyncClient) -> None: | ||
# params = {"country": "Canada"} | ||
# | ||
# resp = await async_client.get("/thematic/geoencode", params=params) | ||
# | ||
# assert resp.status_code == 400 | ||
# | ||
# | ||
# @pytest.mark.asyncio | ||
# async def test_geoencoder_fake_country_no_matches(async_client: AsyncClient) -> None: | ||
# | ||
# params = {"admin_version": "4.1", "country": "Canadiastan"} | ||
# | ||
# resp = await async_client.get("/thematic/geoencode", params=params) | ||
# | ||
# assert resp.json() == { | ||
# "status": "success", | ||
# "data": { | ||
# "adminVersion": "4.1", | ||
# "matches": [] | ||
# } | ||
# } | ||
# assert resp.status_code == 200 | ||
from typing import Optional, Any, Dict, List | ||
|
||
import pytest | ||
from httpx import AsyncClient | ||
|
||
from app.models.pydantic.geostore import GeostoreCommon | ||
from app.routes.thematic import geoencoder | ||
from app.routes.thematic.geoencoder import _admin_boundary_lookup_sql | ||
|
||
|
||
@pytest.mark.asyncio | ||
async def test__admin_boundary_lookup_sql() -> None: | ||
sql = _admin_boundary_lookup_sql( | ||
"some_dataset", "some_country", "some_region", "some_subregion" | ||
) | ||
assert sql == ( | ||
"SELECT gid_0, gid_1, gid_2, country, name_1, name_2 FROM some_dataset " | ||
"WHERE country='some_country' " | ||
"AND WHERE region='some_region' " | ||
"AND WHERE subregion='some_subregion'" | ||
) | ||
|
||
|
||
@pytest.mark.asyncio | ||
async def test_geoencoder_no_admin_version(async_client: AsyncClient) -> None: | ||
params = {"country": "Canada"} | ||
|
||
resp = await async_client.get("/thematic/geoencode", params=params) | ||
|
||
assert resp.status_code == 422 | ||
|
||
|
||
@pytest.mark.asyncio | ||
async def test_geoencoder_invalid_version_pattern(async_client: AsyncClient) -> None: | ||
params = {"country": "Canada", "admin_version": "fails_regex"} | ||
|
||
resp = await async_client.get("/thematic/geoencode", params=params) | ||
|
||
assert resp.json().get("message", {}).startswith("Invalid version") | ||
assert resp.status_code == 400 | ||
|
||
|
||
@pytest.mark.asyncio | ||
async def test_geoencoder_nonexistant_version(async_client: AsyncClient) -> None: | ||
params = {"country": "Canada", "admin_version": "v4.0"} | ||
|
||
resp = await async_client.get("/thematic/geoencode", params=params) | ||
|
||
assert resp.json().get("message", {}).startswith("Version not found") | ||
assert resp.status_code == 400 | ||
|
||
|
||
@pytest.mark.asyncio | ||
async def test_geoencoder_bad_boundary_source(async_client: AsyncClient) -> None: | ||
params = { | ||
"admin_source": "bobs_boundaries", | ||
"admin_version": "4.1", | ||
"country": "Canadiastan" | ||
} | ||
|
||
resp = await async_client.get("/thematic/geoencode", params=params) | ||
|
||
assert resp.json().get("message", {}).startswith("Invalid admin boundary source") | ||
assert resp.status_code == 400 | ||
|
||
|
||
@pytest.mark.asyncio | ||
async def test_geoencoder_no_matches( | ||
async_client: AsyncClient, | ||
monkeypatch: pytest.MonkeyPatch | ||
) -> None: | ||
admin_source = "gadm" | ||
admin_version = "v4.1" | ||
|
||
params = { | ||
"admin_source": admin_source, | ||
"admin_version": admin_version, | ||
"country": "Canadiastan" | ||
} | ||
|
||
async def mock_version_is_valid(dataset: str, version: str): return None | ||
monkeypatch.setattr(geoencoder, "version_is_valid", mock_version_is_valid) | ||
monkeypatch.setattr(geoencoder, "_query_dataset_json", _query_dataset_json_mocked_results) | ||
|
||
resp = await async_client.get("/thematic/geoencode", params=params) | ||
|
||
assert resp.json() == { | ||
"status": "success", | ||
"data": { | ||
"adminSource": admin_source, | ||
"adminVersion": admin_version, | ||
"matches": [] | ||
} | ||
} | ||
assert resp.status_code == 200 | ||
|
||
|
||
async def _query_dataset_json_mocked_results( | ||
dataset: str, | ||
version: str, | ||
sql: str, | ||
geostore: Optional[GeostoreCommon], | ||
) -> List[Dict[str, Any]]: | ||
return [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dmannarino I know you said you didn't like thematic. I was thinking of other ideas, is this admin area specific endpoints, maybe we just put it under like something like "political"? E.g.
/political/geoencoder
. Then the future can include additional GADM endpoints, WDPA, concessions, etc. Not sure if we need to distinguish it from the rest of the API, or we can just have it all in under a header in the docs.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I do think
geoencoder
is a little vague if it only works for admin areas, geoencoding implies converting any text of the place into coordinates: https://en.wikipedia.org/wiki/Address_geocodingThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to /political/id-lookup