Merge branch 'scverse:main' into main
berombau authored Feb 28, 2024
2 parents eb3f35b + 6f71197 commit 4c150c5
Showing 29 changed files with 1,557 additions and 722 deletions.
4 changes: 2 additions & 2 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
@@ -9,7 +9,7 @@ ci:
skip: []
repos:
- repo: https://github.com/psf/black
rev: 23.12.1
rev: 24.2.0
hooks:
- id: black
- repo: https://github.com/pre-commit/mirrors-prettier
@@ -27,7 +27,7 @@ repos:
additional_dependencies: [numpy, types-requests]
exclude: tests/|docs/
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.1.13
rev: v0.2.2
hooks:
- id: ruff
args: [--fix, --exit-non-zero-on-fix]
12 changes: 8 additions & 4 deletions CHANGELOG.md
@@ -10,14 +10,18 @@ and this project adheres to [Semantic Versioning][].

## [0.0.x] - tbd

### Minor

- improved usability and robustness of sdata.write() when overwrite=True @aeisenbarth

### Added

- added SpatialData.subset() API
- added SpatialData.locate_element() API
- added transform_to_data_extent()
- added utils function: are_extents_equal()
- added utils function: postpone_transformation()
- added utils function: remove_transformations_to_coordinate_system()

### Minor

- improved usability and robustness of sdata.write() when overwrite=True @aeisenbarth

### Fixed

6 changes: 5 additions & 1 deletion README.md
@@ -31,7 +31,7 @@ The spatialdata project also received support by the Chan Zuckerberg Initiative.

![SpatialDataOverview](https://github.com/scverse/spatialdata/assets/1120672/cb91071f-12a7-4b8e-9430-2b3a0f65e52f)

- **The library is currently under review.** We expect there to be changes as the community provides feedback.
- **The library is currently under review.** We expect there to be changes as the community provides feedback. We have an announcement channel for communicating these changes; please see the contact section below.
- The SpatialData storage format is built on top of the [OME-NGFF](https://ngff.openmicroscopy.org/latest/) specification.

## Getting started
@@ -63,6 +63,10 @@ To get involved in the discussion, or if you need help to get started, you are w
- <ins>Bug report/feature request</ins> via the [GitHub issue tracker][issue-tracker].
- <ins>Zoom call</ins> as part of the SpatialData Community Meetings, held every 2 weeks on Thursday, [schedule here](https://hackmd.io/enWU826vRai-JYaL7TZaSw).

Finally, especially relevant for developers that are building a library upon `spatialdata`, please follow this channel for:

- <ins>Announcements</ins> on new features and important changes [Zulip](https://imagesc.zulipchat.com/#narrow/stream/329057-scverse/topic/spatialdata.20announcements).

## Citation

[L Marconato*, G Palla*, KA Yamauchi*, I Virshup*, E Heidari, T Treis, M Toth, R Shrestha, H Vöhringer, W Huber, M Gerstung, J Moore, FJ Theis, O Stegle, bioRxiv, 2023](https://www.biorxiv.org/content/10.1101/2023.05.05.539647v1). \* = equal contribution
2 changes: 2 additions & 0 deletions docs/api.md
@@ -41,6 +41,7 @@ Operations on `SpatialData` objects.
:toctree: generated
unpad_raster
are_extents_equal
```

## Models
@@ -111,6 +112,7 @@ The transformations that can be defined between elements and coordinate systems
get_transformation_between_coordinate_systems
get_transformation_between_landmarks
align_elements_using_landmarks
remove_transformations_to_coordinate_system
```

## DataLoader
1 change: 1 addition & 0 deletions docs/design_doc.md
@@ -564,6 +564,7 @@ with coordinate systems:
with axes: c, y, x
with elements: /images/point8, /labels/point8
"""

sdata0 = sdata.query.coordinate_system("point23", filter_rows=False)
sdata1 = sdata.query.bounding_box((0, 20, 0, 300))
sdata1 = sdata.query.polygon("/polygons/annotations")
15 changes: 10 additions & 5 deletions pyproject.toml
@@ -136,6 +136,10 @@ exclude = [
"dist",
"setup.py",
]
line-length = 120
target-version = "py39"

[tool.ruff.lint]
ignore = [
# Do not assign a lambda expression, use a def -> lambda expression assignments are convenient
"E731",
@@ -154,7 +158,6 @@ ignore = [
# Missing docstring in magic method
"D105",
]
line-length = 120
select = [
"D", # flake8-docstrings
"I", # isort
@@ -175,8 +178,11 @@ select = [
"PGH", # pygrep-hooks
]
unfixable = ["B", "C4", "UP", "BLE", "T20", "RET"]
target-version = "py39"
[tool.ruff.per-file-ignores]

[tool.ruff.lint.pydocstyle]
convention = "numpy"

[tool.ruff.lint.per-file-ignores]
"tests/*" = ["D", "PT", "B024"]
"*/__init__.py" = ["F401", "D104", "D107", "E402"]
"docs/*" = ["D","B","E","A"]
@@ -188,8 +194,7 @@ target-version = "py39"
"src/spatialdata/dataloader/datasets.py" = ["D101"]
"tests/test_models/test_models.py" = ["NPY002"]
"tests/conftest.py"= ["E402"]
[tool.ruff.pydocstyle]
convention = "numpy"


# pyupgrade typing rewrite TODO: remove at some point from per-file ignore
# "UP006", "UP007"
3 changes: 2 additions & 1 deletion src/spatialdata/__init__.py
@@ -29,11 +29,12 @@
"unpad_raster",
"save_transformations",
"get_dask_backing_files",
"are_extents_equal",
]

from spatialdata import dataloader, models, transformations
from spatialdata._core.concatenate import concatenate
from spatialdata._core.data_extent import get_extent
from spatialdata._core.data_extent import are_extents_equal, get_extent
from spatialdata._core.operations.aggregate import aggregate
from spatialdata._core.operations.rasterize import rasterize
from spatialdata._core.operations.transform import transform
1 change: 1 addition & 0 deletions src/spatialdata/__main__.py
@@ -5,6 +5,7 @@
various operations through a terminal. Currently, it implements the "peek" function, which allows users to inspect
the contents of a SpatialData .zarr dataset. Additional CLI functionalities will be implemented in the future.
"""

from typing import Literal

import click
1 change: 1 addition & 0 deletions src/spatialdata/_core/_elements.py
@@ -1,4 +1,5 @@
"""SpatialData elements."""

from __future__ import annotations

from collections import UserDict
46 changes: 32 additions & 14 deletions src/spatialdata/_core/data_extent.py
@@ -20,9 +20,6 @@
from spatialdata.models._utils import SpatialElement
from spatialdata.models.models import PointsModel
from spatialdata.transformations.operations import get_transformation
from spatialdata.transformations.transformations import (
BaseTransformation,
)

BoundingBoxDescription = dict[str, tuple[float, float]]

@@ -174,7 +171,7 @@ def get_extent(
The exact extent is the bounding box `[minx, miny, maxx, maxy] = [0, 0, 0, 1.414]`, while the approximate extent is
the box `[minx, miny, maxx, maxy] = [-0.707, 0, 0.707, 1.414]`.
"""
raise ValueError("The object type is not supported.")
raise ValueError(f"The object type {type(e)} is not supported.")


@get_extent.register
@@ -289,9 +286,7 @@ def _(e: GeoDataFrame, coordinate_system: str = "global", exact: bool = True) ->
coordinate_system=coordinate_system,
extent=extent,
)
t = get_transformation(e, to_coordinate_system=coordinate_system)
assert isinstance(t, BaseTransformation)
transformed = transform(e, t)
transformed = transform(e, to_coordinate_system=coordinate_system)
return _get_extent_of_shapes(transformed)


@@ -305,9 +300,7 @@ def _(e: DaskDataFrame, coordinate_system: str = "global", exact: bool = True) -
coordinate_system=coordinate_system,
extent=extent,
)
t = get_transformation(e, to_coordinate_system=coordinate_system)
assert isinstance(t, BaseTransformation)
transformed = transform(e, t)
transformed = transform(e, to_coordinate_system=coordinate_system)
return _get_extent_of_points(transformed)


@@ -353,8 +346,6 @@ def _compute_extent_in_coordinate_system(
-------
The bounding box description in the specified coordinate system.
"""
transformation = get_transformation(element, to_coordinate_system=coordinate_system)
assert isinstance(transformation, BaseTransformation)
from spatialdata._core.query._utils import get_bounding_box_corners

axes = get_axes_names(element)
@@ -368,10 +359,37 @@ def _compute_extent_in_coordinate_system(
max_coordinate=max_coordinates,
)
df = pd.DataFrame(corners.data, columns=corners.axis.data.tolist())
points = PointsModel.parse(df, coordinates={k: k for k in axes})
transformed_corners = pd.DataFrame(transform(points, transformation).compute())
d = get_transformation(element, get_all=True)
points = PointsModel.parse(df, coordinates={k: k for k in axes}, transformations=d)
transformed_corners = pd.DataFrame(transform(points, to_coordinate_system=coordinate_system).compute())
# Make sure min and max values are in the same order as axes
extent = {}
for ax in axes:
extent[ax] = (transformed_corners[ax].min(), transformed_corners[ax].max())
return extent
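
The corner-based extent computation in `_compute_extent_in_coordinate_system()` can be illustrated without `spatialdata`. The following is a minimal sketch, not the library's implementation: `extent_from_transformed_corners` is a hypothetical helper, the transformation is assumed to be a purely linear 2D matrix (no translation), and the unit square rotated by 45 degrees is an example chosen for illustration.

```python
import itertools

import numpy as np


def extent_from_transformed_corners(min_coordinate, max_coordinate, matrix, axes=("x", "y")):
    """Hypothetical helper: transform the bounding box corners and take per-axis min/max."""
    # Every combination of (min, max) per axis gives one corner of the box.
    corners = np.array(list(itertools.product(*zip(min_coordinate, max_coordinate))), dtype=float)
    # Apply the linear transformation to each corner (rows are points).
    transformed = corners @ matrix.T
    # Keep the min and max values in the same order as the axes.
    return {ax: (float(transformed[:, i].min()), float(transformed[:, i].max())) for i, ax in enumerate(axes)}


theta = np.pi / 4  # 45 degrees, counter-clockwise
rotation = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
extent = extent_from_transformed_corners([0, 0], [1, 1], rotation)
print(extent)
```

For this input the extent is approximately `x: (-0.707, 0.707)` and `y: (0, 1.414)`. Because only the corners of the original bounding box are transformed, the result is the extent of the transformed box, which can be larger than the exact extent of the transformed geometry.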


def are_extents_equal(extent0: BoundingBoxDescription, extent1: BoundingBoxDescription, atol: float = 0.1) -> bool:
    """
    Check if two data extents, as returned by `get_extent()`, are equal up to approximation errors.

    Parameters
    ----------
    extent0
        The first data extent.
    extent1
        The second data extent.
    atol
        The absolute tolerance to use when comparing the extents.

    Returns
    -------
    Whether the extents are equal or not.

    Notes
    -----
    The default value of `atol` is currently high because of a bug of `rasterize()` that makes the extent of the
    rasterized data slightly different from the extent of the original data. This bug is tracked in
    https://github.com/scverse/spatialdata/issues/165
    """
    return all(np.allclose(extent0[k], extent1[k], atol=atol) for k in set(extent0.keys()).union(extent1.keys()))
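
The tolerance-based comparison can be tried on plain dictionaries. This is a minimal sketch assuming only NumPy: `extents_close` is a hypothetical stand-in for `are_extents_equal()`, and the extent values are made up. The explicit key check is an addition of this sketch; the committed function indexes both dictionaries over the union of their keys, so it can raise `KeyError` when the two extents do not share the same axes.

```python
import numpy as np

# Same shape as BoundingBoxDescription in the diff: axis name -> (min, max).
BoundingBoxDescription = dict[str, tuple[float, float]]


def extents_close(extent0: BoundingBoxDescription, extent1: BoundingBoxDescription, atol: float = 0.1) -> bool:
    """Hypothetical stand-in: compare two extents axis by axis with an absolute tolerance."""
    # Assumption of this sketch: extents with different axes are simply not equal.
    if extent0.keys() != extent1.keys():
        return False
    return all(np.allclose(extent0[k], extent1[k], atol=atol) for k in extent0)


original = {"x": (0.0, 100.0), "y": (0.0, 50.0)}
rasterized = {"x": (0.0, 100.04), "y": (-0.03, 50.0)}  # small made-up deviations
shifted = {"x": (0.0, 101.0), "y": (0.0, 50.0)}  # deviation larger than atol

print(extents_close(original, rasterized))  # True
print(extents_close(original, shifted))  # False
print(extents_close(original, {"x": (0.0, 100.0)}))  # False, axes differ
```

Note that `np.allclose` also applies its default relative tolerance on top of `atol`, so the effective threshold grows slightly with the magnitude of the coordinates.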
136 changes: 136 additions & 0 deletions src/spatialdata/_core/operations/_utils.py
@@ -0,0 +1,136 @@
from __future__ import annotations

from typing import TYPE_CHECKING

from multiscale_spatial_image import MultiscaleSpatialImage
from spatial_image import SpatialImage

if TYPE_CHECKING:
    from spatialdata._core.spatialdata import SpatialData


def transform_to_data_extent(
    sdata: SpatialData,
    coordinate_system: str,
    maintain_positioning: bool = True,
    target_unit_to_pixels: float | None = None,
    target_width: float | None = None,
    target_height: float | None = None,
    target_depth: float | None = None,
) -> SpatialData:
    """
    Transform the spatial data to match the data extent, so that pixels and vector coordinates correspond.

    Given a selected coordinate system, this function will transform the spatial data in that coordinate system, and
    will resample images, so that the pixels and vector coordinates correspond.
    In other words, the vector coordinate (x, y) (or (x, y, z)) will correspond to the pixel (y, x) (or (z, y, x)).

    When `maintain_positioning` is `False`, each transformation will be set to Identity. When `maintain_positioning` is
    `True` (default value), each element of the data will also have a transformation that will maintain the positioning
    of the element, as it was before calling this function.
    Note that in this case the correspondence between pixels and vector coordinates is true in the intrinsic coordinate
    system, not in the target coordinate system.

    Parameters
    ----------
    sdata
        The spatial data to transform.
    coordinate_system
        The coordinate system to use to compute the extent and to transform the data to.
    maintain_positioning
        If `True`, the transformation will maintain the positioning of the elements, as it was before calling this
        function. If `False`, each transformation will be set to Identity.
    target_unit_to_pixels
        The required number of pixels per unit (units in the target coordinate system) of the data that will be
        produced.
    target_width
        The width of the data extent, in pixels, for the data that will be produced.
    target_height
        The height of the data extent, in pixels, for the data that will be produced.
    target_depth
        The depth of the data extent, in pixels, for the data that will be produced.

    Returns
    -------
    SpatialData
        The transformed spatial data with downscaled and padded images and adjusted vector coordinates; all the
        transformations will be set to Identity and the coordinates of the vector data will be aligned to the pixel
        coordinates.

    Notes
    -----
    - The data extent is the smallest rectangle that contains all the images and geometries.
    - MultiscaleSpatialImage objects will be converted to SpatialImage objects.
    - This helper function will be deprecated when https://github.com/scverse/spatialdata/issues/308 is closed,
      as this function will be easily recovered by `transform_to_coordinate_system()`
    """
    from spatialdata._core.data_extent import get_extent
    from spatialdata._core.operations.rasterize import _compute_target_dimensions, rasterize
    from spatialdata._core.spatialdata import SpatialData
    from spatialdata.transformations.operations import get_transformation, set_transformation
    from spatialdata.transformations.transformations import BaseTransformation, Identity, Scale, Sequence, Translation

    sdata = sdata.filter_by_coordinate_system(coordinate_system=coordinate_system)
    # calling transform_to_coordinate_system will likely decrease the resolution, let's use rasterize() instead
    sdata_vector = SpatialData(shapes=dict(sdata.shapes), points=dict(sdata.points))
    sdata_raster = SpatialData(images=dict(sdata.images), labels=dict(sdata.labels))
    sdata_vector_transformed = sdata_vector.transform_to_coordinate_system(coordinate_system)

    data_extent = get_extent(sdata, coordinate_system=coordinate_system)
    data_extent_axes = tuple(data_extent.keys())
    translation_to_origin = Translation([-data_extent[ax][0] for ax in data_extent_axes], axes=data_extent_axes)

    sizes = [data_extent[ax][1] - data_extent[ax][0] for ax in data_extent_axes]
    target_width, target_height, target_depth = _compute_target_dimensions(
        spatial_axes=data_extent_axes,
        min_coordinate=[0 for _ in data_extent_axes],
        max_coordinate=sizes,
        target_unit_to_pixels=target_unit_to_pixels,
        target_width=target_width,
        target_height=target_height,
        target_depth=target_depth,
    )
    scale_to_target_d = {
        "x": target_width / sizes[data_extent_axes.index("x")],
        "y": target_height / sizes[data_extent_axes.index("y")],
    }
    if target_depth is not None:
        scale_to_target_d["z"] = target_depth / sizes[data_extent_axes.index("z")]
    scale_to_target = Scale([scale_to_target_d[ax] for ax in data_extent_axes], axes=data_extent_axes)

    for el in sdata_vector_transformed._gen_elements_values():
        t = get_transformation(el, to_coordinate_system=coordinate_system)
        assert isinstance(t, BaseTransformation)
        sequence = Sequence([t, translation_to_origin, scale_to_target])
        set_transformation(el, transformation=sequence, to_coordinate_system=coordinate_system)
    sdata_vector_transformed_inplace = sdata_vector_transformed.transform_to_coordinate_system(
        coordinate_system, maintain_positioning=True
    )

    sdata_to_return_elements = {
        **sdata_vector_transformed_inplace.shapes,
        **sdata_vector_transformed_inplace.points,
    }

    for _, element_name, element in sdata_raster._gen_elements():
        if isinstance(element, (MultiscaleSpatialImage, SpatialImage)):
            rasterized = rasterize(
                element,
                axes=data_extent_axes,
                min_coordinate=[data_extent[ax][0] for ax in data_extent_axes],
                max_coordinate=[data_extent[ax][1] for ax in data_extent_axes],
                target_coordinate_system=coordinate_system,
                target_unit_to_pixels=None,
                target_width=target_width,
                target_height=None,
                target_depth=None,
            )
            sdata_to_return_elements[element_name] = rasterized
        else:
            sdata_to_return_elements[element_name] = element
    if sdata.table is not None:
        sdata_to_return_elements["table"] = sdata.table
    if not maintain_positioning:
        for el in sdata_to_return_elements.values():
            set_transformation(el, transformation={coordinate_system: Identity()}, set_all=True)
    return SpatialData.from_elements_dict(sdata_to_return_elements)
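
The translate-then-scale step that `transform_to_data_extent()` applies to vector elements can be followed with plain arithmetic. This is a minimal sketch under stated assumptions: the extent values and the `target_width` of 200 pixels are made up, `map_to_pixels` is a hypothetical helper, and the height is derived by preserving the aspect ratio, which is an assumption about `_compute_target_dimensions()` and is not shown in this diff.

```python
# Extent in the same format as BoundingBoxDescription: axis name -> (min, max).
data_extent = {"x": (10.0, 110.0), "y": (-20.0, 30.0)}
axes = tuple(data_extent.keys())

# Translation that moves the lower corner of the extent to the origin.
translation = {ax: -data_extent[ax][0] for ax in axes}
# Size of the extent along each axis, in units of the coordinate system.
sizes = {ax: data_extent[ax][1] - data_extent[ax][0] for ax in axes}

target_width = 200.0  # made-up target, in pixels
target_height = target_width * sizes["y"] / sizes["x"]  # assumption: aspect ratio is preserved

# Pixels per unit along each axis.
scale = {"x": target_width / sizes["x"], "y": target_height / sizes["y"]}


def map_to_pixels(point):
    """Hypothetical helper: translate to the origin first, then scale to pixel units."""
    return {ax: (point[ax] + translation[ax]) * scale[ax] for ax in axes}


print(map_to_pixels({"x": 10.0, "y": -20.0}))  # {'x': 0.0, 'y': 0.0}
print(map_to_pixels({"x": 110.0, "y": 30.0}))  # {'x': 200.0, 'y': 100.0}
```

The order matters: translating first and scaling second mirrors the `Sequence([t, translation_to_origin, scale_to_target])` in the function, so the lower corner of the extent lands on pixel (0, 0) regardless of the scale factor.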
4 changes: 2 additions & 2 deletions src/spatialdata/_core/operations/aggregate.py
@@ -170,8 +170,8 @@ def aggregate(
target_coordinate_system, # type: ignore[assignment]
)
if not (by_transform == values_transform and isinstance(values_transform, Identity)):
by_ = transform(by_, by_transform)
values_ = transform(values_, values_transform)
by_ = transform(by_, to_coordinate_system=target_coordinate_system)
values_ = transform(values_, to_coordinate_system=target_coordinate_system)

# dispatch
adata = None