feat: new, safe, documented PrecomputedInfoSpec #881
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #881   +/-   ##
=========================================
  Coverage   100.00%   100.00%
=========================================
  Files          143       142     -1
  Lines         6130      6100    -30
=========================================
- Hits          6130      6100    -30
The idea sounds good. But is it not possible to retain or query for existing scales? I wonder if it can be unsafe when you're overwriting info in-place. For example, upsampling [128, 128, 40] to [64, 64, 40]. In this case you can potentially lose existing scales (256, 512, ...)?
@trivoldus28 forgot to mention this -- there are two more parameters to Existing scales in the created info path will be kept by default, but not inherited from the reference.
Are the scales sorted when you add to existing scales?
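The behaviour described in this thread (existing scales in the target info are kept by default, and nothing is pulled in from the reference) can be pictured with a small sketch. This is illustrative only and is not code from the PR: the function `merge_scales` and its `keep_existing` flag are hypothetical names, and sorting the result by resolution is an assumption made here, since the thread does not say whether the real implementation sorts.

```python
from typing import Any, Dict, List

Scale = Dict[str, Any]


def merge_scales(
    existing: List[Scale], new: List[Scale], keep_existing: bool = True
) -> List[Scale]:
    """Combine scales already in the target info with newly requested ones.

    Hypothetical helper: scales are identified by their resolution, and a
    newly requested scale replaces an existing one with the same resolution.
    """
    merged: Dict[tuple, Scale] = {}
    if keep_existing:
        for scale in existing:
            merged[tuple(scale["resolution"])] = scale
    for scale in new:
        merged[tuple(scale["resolution"])] = scale
    # Assumption: order the result from finest to coarsest resolution.
    return sorted(merged.values(), key=lambda s: tuple(s["resolution"]))


# The case raised above: the layer already has 128, 256 and 512 nm scales,
# and a job writes a new 64 nm scale.
existing = [{"resolution": [r, r, 40]} for r in (128, 256, 512)]
new = [{"resolution": [64, 64, 40]}]

kept = merge_scales(existing, new)
dropped = merge_scales(existing, new, keep_existing=False)
```

With `keep_existing=True` the coarser scales survive alongside the new one; with `keep_existing=False` only the newly requested scale remains.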
The previous version of precomputed-based layer creation relied on logic that would replicate the reference info file entirely, giving users various mechanisms to adjust it. This caused problems when the reference info file contained unexpected or new properties, such as new compression methods or sharding. Moreover, this method didn't provide a streamlined way to create an info file without a reference. The logic to apply fixes and modifications to the reference info file grew increasingly complex, hacky, and undocumented, making it difficult for non-experts to create robust specs.
We relied heavily on info inheritance logic because there are many cases where we want to inherit the reference info file almost entirely - specifically, for `copy` and `interpolate` operations. However, even in those cases, inheriting the `sharding` property caused issues.

This refactor aims to provide a way to specify info files that won't encounter problems due to unexpected info key/values while allowing a concise and robust method of performing full-inheritance operations such as `copy`. The set of info file controls exposed to the user has been reduced to:

In the new system, only the `scales` parameter is required to construct a new info file. In `zutils`, we only modify the info file of layers being written to, and the `scales` parameter represents the resolutions that will be written to the layer.

There are two info inheritance modes, controlled by `inherit_all_params`. When `False`, `reference_path` is used only to inherit the dataset bounds. This is used for jobs such as network inference, where the output layer's info has no relation to the input layer's info. All parameters must be specified manually by the user.

When `inherit_all_params` is `True`, all info parameters will be inherited from the `reference_path`, and users may only override the `bbox` parameter. This is used when the output layer's info should be a copy of the input layer. Note that the `sharding` key is not inherited from the scales.

With this PR, all parameters have up-to-date documentation in `precomputed.py` and in `build` for cloudvolume and tensorstore layers. Clear exceptions are thrown when users supply invalid parameters. The old versions of `build_layer` functions remain available by specifying `"@version": "0.3"` in the spec.

Tested with `Copy v2.0` and `Interpolate v1.3` jobs in Portal.