feat: new, safe, documented PrecomputedInfoSpec #881

supersergiy · 2025-01-31T06:04:50Z

The previous version of precomputed-based layer creation relied on logic that would replicate the reference info file entirely, giving users various mechanisms to adjust it. This caused problems when the reference info file contained unexpected or new properties, such as new compression methods or sharding. Moreover, this method didn't provide a streamlined way to create an info file without a reference. The logic to apply fixes and modifications to the reference info file grew increasingly complex, hacky, and undocumented, making it difficult for non-experts to create robust specs.

We relied heavily on info inheritance logic because there are many cases where we want to inherit the reference info file almost entirely - specifically, for copy and interpolate operations. However, even in those cases, inheriting the sharding property caused issues.

This refactor aims to provide a way to specify info files that won't encounter problems due to unexpected info key/values while allowing a concise and robust method of performing full-inheritance operations such as copy. The set of info file controls exposed to the user has been reduced to:

        scales: Sequence[Sequence[float]],
        # Reference
        reference_path: str | None = None,
        inherit_all_params: bool = False,
        # Top-level info keys
        type: Literal["image", "segmentation"] | None = None,  
        data_type: PrecomputedVolumeDType | None = None,
        num_channels: int | None = None,
        # Per-scale info keys
        chunk_size: Sequence[int] | None = None,
        encoding: str | None = None,
        bbox: BBox3D | None = None,
        extra_scale_data: dict | None = None,

In the new system, only the scales parameter is required to construct a new info file. In zutils, we only modify the info file of layers being written to, and the scales parameter represents the resolutions that will be written to the layer.

There are two info inheritance modes, controlled by inherit_all_params. When False, reference_path is used only to inherit the dataset bounds. This is used for jobs such as network inference, where the output layer's info has no relation to the input layer's info. All parameters must be specified manually by the user.

When inherit_all_params is True, all info parameters will be inherited from the reference_path, and users may only override the bbox parameter. This is used when the output layer's info should be a copy of the input layer. Note that the sharding key is not inherited from the scales.

With this PR, all parameters have up-to-date documentation in precomputed.py and in build for cloudvolume and tensorstore layers. Clear exceptions are thrown when users supply invalid parameters. The old versions of build_layer functions remain available by specifying "@version": "0.3" in the spec.

Tested with Copy v2.0 and Interpolate v1.3 jobs in Portal.

codecov · 2025-02-02T03:21:43Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (303511d) to head (7f3b538).
Report is 5 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff            @@
##              main      #881   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files          143       142    -1     
  Lines         6130      6100   -30     
=========================================
- Hits          6130      6100   -30

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

trivoldus28 · 2025-02-04T23:16:01Z

The idea sounds good. But is it not possible to retain or query for existing scales? I wonder if it can be unsafe when you're overwriting info in-place. For example, upsampling [128, 128, 40] to [64, 64, 40]. In this case you can potentially lose existing scales (256, 512, ...)?

supersergiy · 2025-02-05T16:51:13Z

@trivoldus28 forgot to mention this -- there are two more parameters to build_*: [info_overwrite and info_keep_existing_scales]

Existing scales in the created info path will be kept by default, but not inherited from the reference.

trivoldus28 · 2025-02-05T16:59:07Z

Are the scales sorted when you add to existing scales?

supersergiy force-pushed the sergiy/precomputed_no_inherit branch 2 times, most recently from 29f6069 to b575873 Compare February 2, 2025 03:06

supersergiy force-pushed the sergiy/precomputed_no_inherit branch 6 times, most recently from 9d6f4af to f682b96 Compare February 3, 2025 04:18

supersergiy added 3 commits February 3, 2025 04:20

feat: new, safe, documented PrecomputedInfoSpec

547c5f8

feat: updatedate to new precomputed usage

21e8751

feat: simplify precomputed scale extension

6646edf

supersergiy force-pushed the sergiy/precomputed_no_inherit branch from f682b96 to 6646edf Compare February 3, 2025 04:20

supersergiy requested review from JoeStrout, dodamih, trivoldus28 and nkemnitz February 3, 2025 04:20

supersergiy marked this pull request as draft February 3, 2025 05:43

JoeStrout approved these changes Feb 3, 2025

View reviewed changes

feat: better scale inheritence

7f3b538

supersergiy force-pushed the sergiy/precomputed_no_inherit branch from dad9455 to 7f3b538 Compare February 4, 2025 16:05

supersergiy marked this pull request as ready for review February 4, 2025 16:28

supersergiy merged commit 4bdcb62 into main Feb 4, 2025
10 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: new, safe, documented PrecomputedInfoSpec #881

feat: new, safe, documented PrecomputedInfoSpec #881

supersergiy commented Jan 31, 2025 •

edited

Loading

codecov bot commented Feb 2, 2025 •

edited

Loading

trivoldus28 commented Feb 4, 2025

supersergiy commented Feb 5, 2025 •

edited

Loading

trivoldus28 commented Feb 5, 2025

feat: new, safe, documented PrecomputedInfoSpec #881

feat: new, safe, documented PrecomputedInfoSpec #881

Conversation

supersergiy commented Jan 31, 2025 • edited Loading

codecov bot commented Feb 2, 2025 • edited Loading

Codecov Report

trivoldus28 commented Feb 4, 2025

supersergiy commented Feb 5, 2025 • edited Loading

trivoldus28 commented Feb 5, 2025

supersergiy commented Jan 31, 2025 •

edited

Loading

codecov bot commented Feb 2, 2025 •

edited

Loading

supersergiy commented Feb 5, 2025 •

edited

Loading