🥚 Allow using XpySTACAssetReader without xpystac when engine!=stac (#100)

* 🥚 Allow using XpySTACAssetReader without xpystac when engine!=stac

A little hidden feature to use `read_from_xpystac` with other engines (e.g. `netcdf4`, `h5netcdf`) without having to install `xpystac`! Added a unit test using `engine="rasterio"` which is technically deprecated, but works without having to install extra dependencies.

* ✏️ Remove mention of the STAC_URL environment variable

The STAC_URL environment variable never actually worked, see stac-utils/pystac-client#317. Also updated some links on the walkthrough to point to torchdata 0.6.1.
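
As a rough illustration of the first item above, the relaxed guard can be sketched without zen3geo or xpystac installed. The names `optional_import`, `AssetReader`, and the deliberately missing package name are hypothetical stand-ins, not zen3geo's API; only the `is None and engine == "stac"` condition mirrors the change in this commit.

```python
import importlib


def optional_import(name: str):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Hypothetical optional dependency that is assumed NOT to be installed,
# standing in for xpystac on a machine without it.
optional_backend = optional_import("hypothetical_missing_backend_pkg")


class AssetReader:
    """Hypothetical reader, loosely modelled on the guard in this commit."""

    def __init__(self, engine: str = "stac") -> None:
        # Only the "stac" engine needs the optional package, so any
        # other engine is allowed through even when it is missing.
        if optional_backend is None and engine == "stac":
            raise ModuleNotFoundError(
                "The optional backend is required when engine='stac'."
            )
        self.engine = engine
```

With the extra `engine == "stac"` condition, constructing the reader with another engine (for example `AssetReader(engine="rasterio")`) succeeds, while the default engine still fails early with a clear error.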
weiji14 authored May 17, 2023
1 parent aa71331 commit 71ab786
Showing 4 changed files with 42 additions and 8 deletions.
4 changes: 2 additions & 2 deletions docs/walkthrough.md
@@ -84,14 +84,14 @@ This is how the Sentinel-2 image looks like over Singapore on 15 Jan 2022.

![Sentinel-2 image over Singapore on 20220115](https://planetarycomputer.microsoft.com/api/data/v1/item/preview.png?collection=sentinel-2-l2a&item=S2A_MSIL2A_20220115T032101_R118_T48NUG_20220115T170435&assets=visual&asset_bidx=visual%7C1%2C2%2C3&nodata=0)

-## 1️⃣ Construct [DataPipe](https://github.com/pytorch/data/tree/v0.4.0#what-are-datapipes) 📡
+## 1️⃣ Construct [DataPipe](https://github.com/pytorch/data/tree/v0.6.1#what-are-datapipes) 📡

A torch `DataPipe` is a way of composing data (rather than inheriting data).
Yes, I don't know what it really means either, so here's some extra reading.

🔖 References:
- https://pytorch.org/blog/pytorch-1.11-released/#introducing-torchdata
-- https://github.com/pytorch/data/tree/v0.4.0#what-are-datapipes
+- https://github.com/pytorch/data/tree/v0.6.1#what-are-datapipes
- https://realpython.com/inheritance-composition-python

### Create an Iterable 📏
3 changes: 1 addition & 2 deletions zen3geo/datapipes/pystac_client.py
@@ -39,8 +39,7 @@ class PySTACAPISearcherIterDataPipe(IterDataPipe):
         provided Collections will be searched.
     catalog_url : str
-        The URL of a STAC Catalog. If not specified, this will use the
-        ``STAC_URL`` environment variable.
+        The URL of a STAC Catalog.
     kwargs : Optional
         Extra keyword arguments to pass to
2 changes: 1 addition & 1 deletion zen3geo/datapipes/xpystac.py
@@ -122,7 +122,7 @@ def __init__(
         engine: str = "stac",
         **kwargs: Optional[Dict[str, Any]]
     ) -> None:
-        if xpystac is None:
+        if xpystac is None and engine == "stac":
             raise ModuleNotFoundError(
                 "Package `xpystac` is required to be installed to use this datapipe. "
                 "Please use `pip install xpystac` "
41 changes: 38 additions & 3 deletions zen3geo/tests/test_datapipes_xpystac.py
@@ -6,16 +6,16 @@

 from zen3geo.datapipes import XpySTACAssetReader

-pystac = pytest.importorskip("pystac")
-xpystac = pytest.importorskip("xpystac")
-

 # %%
 def test_xpystac_asset_reader_cog():
     """
     Ensure that XpySTACAssetReader works to read in a pystac.Asset object
     stored as a Cloud-Optimized GeoTIFF and output to an xarray.Dataset object.
     """
+    pystac = pytest.importorskip("pystac")
+    xpystac = pytest.importorskip("xpystac")
+
     item_url: str = "https://github.com/stac-utils/pystac/raw/v1.7.1/tests/data-files/raster/raster-sentinel2-example.json"
     asset: pystac.Asset = pystac.Item.from_file(href=item_url).assets["overview"]
     assert asset.media_type == pystac.MediaType.COG
@@ -43,6 +43,9 @@ def test_xpystac_asset_reader_zarr():
     Ensure that XpySTACAssetReader works to read in a pystac.Asset object
     stored as a Zarr file and output to an xarray.Dataset object.
     """
+    pystac = pytest.importorskip("pystac")
+    xpystac = pytest.importorskip("xpystac")
+
     collection_url: str = "https://planetarycomputer.microsoft.com/api/stac/v1/collections/daymet-daily-hi"
     asset: pystac.Asset = pystac.Collection.from_file(href=collection_url).assets[
         "zarr-https"
@@ -65,3 +68,35 @@ def test_xpystac_asset_reader_zarr():
     assert dataset.rio.bounds() == (-5802750.0, -622500.0, -5518750.0, -38500.0)
     assert dataset.rio.resolution() == (1000.0, -1000.0)
     assert dataset.rio.grid_mapping == "lambert_conformal_conic"
+
+
+def test_xpystac_asset_reader_geotiff_without_xpystac():
+    """
+    Ensure that XpySTACAssetReader works to read in a GeoTIFF file and output
+    to an xarray.Dataset object, even when xpystac is not installed.
+    Note that `engine="rasterio"` has been removed in xarray v2023.04.0, see
+    https://github.com/pydata/xarray/pull/7671. So, this test will need to be
+    updated once we change to require an xarray version greater than 2023.04.0.
+    Only included this test to check an alternative to `engine="stac"` that
+    did not require installing extra required dependencies like `netcdf4` or
+    `h5netcdf`.
+    """
+    tif_url: str = "https://github.com/corteva/rioxarray/raw/0.14.1/test/test_data/input/cint16.tif"
+
+    dp = IterableWrapper(iterable=[tif_url])
+
+    # Using class constructors
+    dp_xpystac = XpySTACAssetReader(source_datapipe=dp, engine="rasterio")
+    # Using functional form (recommended)
+    dp_xpystac = dp.read_from_xpystac(engine="rasterio")
+
+    assert len(dp_xpystac) == 1
+    it = iter(dp_xpystac)
+    dataset = next(it)
+
+    assert dataset.sizes == {"band": 1, "x": 100, "y": 100}
+    assert dataset.band_data.dtype == "complex64"
+    assert dataset.rio.bounds() == (0.0, 100.0, 100.0, 0.0)
+    assert dataset.rio.resolution() == (1.0, 1.0)
+    assert dataset.rio.crs == "EPSG:4326"
