Lazy resampling executes non-lazy transforms without executing pending transforms #7370

atbenmurray · 2023-04-28T18:42:29Z

atbenmurray
Apr 28, 2023
Collaborator

Describe the bug
Lazy resampling will execute the following pipeline incorrectly:

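import torch
import monai.transforms as mt

# a single-channel 40x40x40 volume
d = torch.zeros((1, 40, 40, 40))

xform = mt.Compose(
    [
        mt.Flip(spatial_axis=0),
        mt.Rotate90(k=1),
        mt.Rand2DElastic(3, (5, 10)),  # the non-lazy transform in this pipeline
        mt.Zoom(zoom=0.8),
        mt.Spacing(pixdim=1.2),
    ],
    lazy_evaluation=True,  # argument name on the dev branch at the time of writing
    verbose=True
)
r = xform(d)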

My understanding is that this was implemented deliberately to provide improved performance on some of the existing pipelines that are being benchmarked, but it will break a lot of our users' pipelines if the non-lazy transforms rely on the pending transforms having been run:
. Noise will undergo spatial distortion that was not intended
. Non-lazy spatial transforms will output the wrong result

While we want the best performance on our pipelines, we shouldn't expect our users to have to modify their pipelines to execute correctly.

The most obvious solution is a new flag for Compose. We've discussed it as a 1.3 feature but now we need it for this release.

class Compose(Randomizable, InvertibleTransform):
    def __init__(
        self,
        transforms: Sequence[Callable] | Callable | None = None,
        map_items: bool = True,
        unpack_items: bool = False,
        lazy: bool | None = False,
        options: ??? | None = None,
        overrides: dict | None = None,
        logger_name: str | None = None,

Options can be thought of like compile flags for a C++ compiler.
We should pick a way of specifying them that is extensible enough for our purposes and easy for the user:

"optimize: lazy_to_front, another_option: 3" # option string
["optimize: lazy_to_front", "another_option: 3"] # option string array
{"optimize": "lazy_to_front", "another_option": 3} # json-like
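To compare the three spellings side by side, here is a minimal sketch that normalises each of them to a plain dict. The helper name `normalize_options` and the parsing rules (comma-separated pairs, `key: value` split on the first colon) are assumptions made for illustration; nothing here is part of MONAI.

```python
from typing import Any, Mapping, Sequence, Union

OptionSpec = Union[str, Sequence[str], Mapping[str, Any], None]


def normalize_options(spec: OptionSpec) -> dict:
    """Convert any of the three candidate spellings into a plain dict."""
    if spec is None:
        return {}
    if isinstance(spec, Mapping):
        return dict(spec)  # json-like form: already key/value pairs
    # A single string holds comma-separated "key: value" pairs;
    # a sequence of strings holds one pair per element.
    items = spec.split(",") if isinstance(spec, str) else list(spec)
    result = {}
    for item in items:
        key, sep, value = item.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got {item!r}")
        result[key.strip()] = value.strip()
    return result
```

Note that the two string forms yield the value "3" as a string, while the json-like form keeps the integer 3, which is one argument for the dict spelling.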

*** Flags for 1.2 ***
We only need one or two flags for 1.2
. "lazy_to_front", "lazy_to_back"

These should physically reorder the transforms for execution. It can reshuffle the transforms in a way similar to RandomOrder.

@myron This should cover the benchmark performance that you need, yes?
@wyli This seems implementable in #6257 fairly simply. Given that RandomOrder works fine during inverse AFAIK, it means we also don't need to lose invert

wyli · 2023-04-28T19:26:01Z

wyli
Apr 28, 2023
Collaborator

Thanks and agree that adding an option will allow for different use cases, one of the use cases is to achieve fast but less precise results compared with the non-lazy...

{"optimize": "lazy_to_front", "another_option": 3} # json-like

Looks highly extensible

0 replies

myron · 2023-04-28T23:19:38Z

myron
Apr 28, 2023
Maintainer

hi @atbenmurray I don't understand the issue well

. Noise will undergo spatial distortion that was not intended
. Non-lazy spatial transforms will output the wrong result

which Noise? there is no intensity Noise transform here. or do you mean something else?

--
Also

 "lazy_to_front", "lazy_to_back"
These should physically reorder the transforms for execution.

Will it move all lazy transforms to the front? (Then it won't be the right sequence of transforms.) If I understand it correctly, and it moves ALL lazy transforms to the front/back, then it will be applicable only to some special cases (and not in general), since we can't simply re-arrange transforms.

A common transform list I use is the following (with Identities for the current "lazy_evaluation" version)

LoadImageD
CropForeground based on image
Identity(image, label)
SpacingD (resample)
NormalizeIntensityd #not sure if I need Identity before
SpatialPadd (image, label)
Identity(label)
RandCropByLabelClassesd (based on label classes)
RandomAffine
RandomFlip
Identity(image)
RandomSmooth
RandomIntensityNoise

I don't think we can simply re-arrange lazy (spatial) transforms to the front.

I personally think we should make lazy transforms explicit. Anyone who wants to use them must create a separate list of spatial transforms with Compose(spatial_transforms, lazy=True), and then use that new compose_transform as a step between the main transforms

0 replies

atbenmurray · 2023-04-29T09:09:25Z

atbenmurray
Apr 29, 2023
Collaborator Author

@myron thanks for the detailed response! In general, people shouldn't have to use this compose option. The primary purpose for it is to work around the situation we have now for the upcoming release. As I understand it, we have benchmarked pipelines that rely on the dev branch behaviour.

The dev branch produces incorrect output in the general case of having lazy and non lazy transforms mixed together, because, unless you use Identity/Identityd (becoming ApplyPending/ApplyPendingd) to enforce ordering, all non-lazy transforms will be executed first.

This decision was taken to improve performance on a benchmark that had a Lambda transform in it (as far as I am aware from the other conversations), but it will break our users' pipelines in the general case. move_lazy_first can't work for the above pipeline but move_lazy_last might.

I'm doing some experiments with your pipeline so that I fully understand the implications

@wyli I guess I really need to get a complete list of the pipelines this behaviour was done for so we can be sure that this fix would sort those pipelines for benchmarking while allowing default behaviour to not break our users' pipelines

0 replies

wyli · 2023-04-29T09:19:27Z

wyli
Apr 29, 2023
Collaborator

I think we just use the option as you suggested to allow for the flexibility in general, the user can choose what to use for their own pipelines. it'll be impossible to have an exhaustive list of use cases.

by the way I disagree with "The dev branch behaviour has a major issue in that unless you use Identity/Identityd (becoming ApplyPending/ApplyPendingd) to enforce ordering", could you rephrase to avoid misleading info?

0 replies

atbenmurray · 2023-04-29T09:20:12Z

atbenmurray
Apr 29, 2023
Collaborator Author

I personally think we should make lazy transforms explicit. Anyone who wants to use them must create a separate list of spatial transforms with Compose(spatial_transforms, lazy=True), and then use that new compose_transform as a step between the main transforms

The goal has always been to allow it to work without breaking anything in the general case. That way it is accessible just as a switch that can be flipped at the policy level rather than having to rewrite pipelines. The user is always then free to get very fancy with how they organise their transforms to get the extra benefit, but move_lazy_last, for example, would have that effect.

0 replies

atbenmurray · 2023-04-29T09:24:32Z

atbenmurray
Apr 29, 2023
Collaborator Author

by the way I disagree with "The dev branch behaviour has a major issue in that unless you use Identity/Identityd (becoming ApplyPending/ApplyPendingd) to enforce ordering", could you rephrase to avoid misleading info?

I've modified it to say 'will produce incorrect output'.

0 replies

atbenmurray · 2023-04-29T09:25:20Z

atbenmurray
Apr 29, 2023
Collaborator Author

@wyli I really need to see the pipelines that the dev branch behaviour was intended for. Surely you can point me at them. The flag isn't a fix unless we know it fixes those specific cases.

0 replies

wyli · 2023-04-29T09:46:19Z

wyli
Apr 29, 2023
Collaborator

sure, currently the main cases that I'm aware of are:

0 replies

atbenmurray · 2023-04-29T09:50:05Z

atbenmurray
Apr 29, 2023
Collaborator Author

Great! So if we eyeball these and verify that moving all the lazy transforms last is the equivalent of dev branch behaviour, then this is the right modification for the PR and we should be able to avoid needing to re-benchmark

0 replies

wyli · 2023-04-29T09:58:57Z

wyli
Apr 29, 2023
Collaborator

I've modified it to say 'will produce incorrect output'.

I think I understand your point, but my point is that, there are 200+ transforms in monai, even if it's in non-lazy mode, you can easily compose certain sequences of some of them that may not work as you think (especially if you don't read the documentation)... we can't say that the monai codebase is incorrect or has major issue, right?

0 replies

wyli · 2023-04-29T10:10:45Z

wyli
Apr 29, 2023
Collaborator

also, we'll never achieve exactly the same result by design, because of things like the padding modes, output data types (quantisations), and interpolation modes. In most use cases, there's a trade-off for speed/interpolation quality/memory footprint. In my opinion, we just need more documentation about the details and let the user decide what transform sequence to use.

0 replies

atbenmurray · 2023-04-29T10:20:45Z

atbenmurray
Apr 29, 2023
Collaborator Author

I've modified it to say 'will produce incorrect output'.

I think I understand your point, but my point is that, there are 200+ transforms in monai, even if it's in non-lazy mode, you can easily compose certain sequences of some of them that may not work as you think (especially if you don't read the documentation)... we can't say that the monai codebase is incorrect or has major issue, right?

That's why the default behaviour should be as has always been stated in the design and presentations and discussions, i.e. that the ordering of transform execution is unmodified, and that any changes to transform ordering should come from option flags.
That way, the majority of people can always just switch the lazy flag on and have the best chance of it working as expected.

Anyway, our plan is to add the option flag which should allow the benchmarked pipelines to run as discussed, and have the default behaviour be to not modify transform order. There won't be any need to document anything as an issue in docstrings or release notes, because it won't be an issue, just an option that allows you to move all the lazy transforms last if you want to do so.

draft of a docstring fragment for compose options:

option 'move_lazy_last': This option reorders the transforms in a given pipeline so that lazy transforms are executed after all non-lazy transforms. This can be helpful for performance and image quality, as it results in fewer resampling operations. Note that this option should not be used with pipelines that require ordering to be preserved, such as a mixed sequence of lazy and non-lazy spatial transforms that must be performed in the specified order. If your pipeline uses ApplyPending transforms, they effectively split the list into sub-lists for the purposes of reordering, and reordering only happens within each sub-list.

0 replies

atbenmurray · 2023-04-29T15:46:58Z

atbenmurray
Apr 29, 2023
Collaborator Author

Ok, so after trying a couple of different interpretations of the dev lazy mode, I know what move_lazy_last means:

So:

'LX' means lazy transform with id X (id doesn't mean anything here, just helps with clarity in the examples)
'NY' means non-lazy transform with id Y
'AZ' means ApplyPending/ApplyPendingD with id Z

Given a list of transforms:

['LA', 'NB', 'LC', 'AD', 'LE', 'NF']

The presence of an ApplyPending splits the transform list into multiple sublists

['LA', 'NB', 'LC'], ['AD'], ['LE', 'NF']

Each sublist is then sorted so non-lazy transforms go first, lazy transforms go next, and then the ApplyPending goes last:

['NB', 'LA', 'LC'], ['AD'], ['NF', 'LE']

Which then concatenates back to:

['NB', 'LA', 'LC', 'AD', 'NF', 'LE']

This precisely mimics the behaviour in dev and makes sense from a user perspective; it is easy to explain that an ApplyPending effectively splits the list of transforms into sublists and schedules each separately.
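The split-and-sort rule described above can be sketched in a few lines of plain Python. This is only an illustration operating on the 'LX' / 'NY' / 'AZ' labels from the example; the function names `kind` and `move_lazy_last` are assumptions and this is not the code in the PR.

```python
from itertools import groupby


def kind(name: str) -> str:
    """First letter encodes the kind: 'L' lazy, 'N' non-lazy, 'A' ApplyPending."""
    return name[0]


def move_lazy_last(transforms: list) -> list:
    """Reorder each ApplyPending-delimited sublist: non-lazy first, then lazy."""
    reordered = []
    # groupby yields alternating runs of ApplyPending and non-ApplyPending entries.
    for is_apply, run in groupby(transforms, key=lambda t: kind(t) == "A"):
        run = list(run)
        if is_apply:
            reordered.extend(run)  # ApplyPending entries stay where they are
        else:
            # sorted() is stable, so relative order within each kind is preserved.
            reordered.extend(sorted(run, key=lambda t: kind(t) == "L"))
    return reordered


print(move_lazy_last(["LA", "NB", "LC", "AD", "LE", "NF"]))
# prints ['NB', 'LA', 'LC', 'AD', 'NF', 'LE']
```

Because the sort is stable, two lazy transforms (or two non-lazy ones) in the same sublist never swap places with each other.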

I've implemented this and I am running a wide selection of tests on it.

0 replies

wyli · 2023-04-30T07:39:52Z

wyli
Apr 30, 2023
Collaborator

On the dev branch there is no reordering of the user-defined transforms with or without lazy resampling, you can turn on the verbose=True to confirm this.
So now I'm not sure why you need reordering to mimic the dev branch behaviour.

If you are talking about the 'applied_operations' #6371 #6439, these are about inverting a compose using compose.inverse(data). The inverse is not supported and a warning message will be raised on the dev branch. Please see the test case for this use case

MONAI/tests/test_integration_lazy_samples.py

Line 159 in d29914d

assert inverted is None, "invert LambdaD + lazy is not supported"

It has nothing to do with the transform call, that is compose.__call__

0 replies

atbenmurray · 2023-04-30T09:03:45Z

atbenmurray
Apr 30, 2023
Collaborator Author

There is effective reordering on dev. Think about the interaction between pending transforms and transforms executed immediately.

Given the following pipeline on dev:

[Flip, Spacing, Rand3DElastic, Rotate]

Flip is executed lazily

data.pending_operations = [Flip]
executed = []

Spacing is executed lazily

data.pending_operations = [Flip, Spacing]
executed = []

Rand3DElastic is executed non-lazily, without the pending transforms being executed first. This means that Rand3DElastic is the first to change the actual data (non-lazy execution), i.e. out of order

data.pending_operations = [Flip, Spacing]
executed = [Rand3DElastic]

Rotate is executed lazily

data.pending_operations = [Flip, Spacing, Rotate]
executed = [Rand3DElastic]

Finally, the pending transforms are executed

data.pending_operations = []
executed = [Rand3DElastic, Flip, Spacing, Rotate]

Rand3DElastic is the first transform to be executed on the data; it should have been the 3rd

It doesn't matter that the logging shows that Rand3DElastic was visited in order. It is the first transform that is executed on the data. That is what I mean by effective reordering.

This should not be the default behaviour, because we want users to switch on lazy without it breaking their pipelines.
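The walkthrough above can be reproduced with a toy model that only tracks names, not image data. It is a sketch under the assumption that a lazy transform merely appends itself to a pending list; `run_pipeline` and its `apply_pending_before_non_lazy` switch are hypothetical names used to contrast the two behaviours, not MONAI API.

```python
def run_pipeline(transforms, lazy_names, apply_pending_before_non_lazy):
    """Return the order in which transforms actually touch the data."""
    pending, executed = [], []
    for name in transforms:
        if name in lazy_names:
            pending.append(name)  # deferred: only recorded, data untouched
            continue
        if apply_pending_before_non_lazy:
            executed.extend(pending)  # flush deferred transforms first
            pending.clear()
        executed.append(name)  # non-lazy transform modifies the data now
    executed.extend(pending)  # end of pipeline: flush whatever is left
    return executed


pipeline = ["Flip", "Spacing", "Rand3DElastic", "Rotate"]
lazy = {"Flip", "Spacing", "Rotate"}

# Behaviour described for dev: pending transforms are not flushed first.
print(run_pipeline(pipeline, lazy, apply_pending_before_non_lazy=False))
# prints ['Rand3DElastic', 'Flip', 'Spacing', 'Rotate']

# In-order behaviour: pending transforms are applied before any non-lazy one.
print(run_pipeline(pipeline, lazy, apply_pending_before_non_lazy=True))
# prints ['Flip', 'Spacing', 'Rand3DElastic', 'Rotate']
```

In both cases the transforms are visited in the user's order, which is why verbose logging looks the same; only the order in which the data is modified differs.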

0 replies

wyli · 2023-04-30T10:25:44Z

wyli
Apr 30, 2023
Collaborator

It isn't ambiguous for cache vs non-cache. There are clear rules.

Agreed, that's what we have on the Dev branch, that's what we have for lazy resampling on the Dev branch.

0 replies

atbenmurray · 2023-04-30T10:30:01Z

atbenmurray
Apr 30, 2023
Collaborator Author

It is a bug that

[Flip, Spacing, Rand3DElastic, Rotate]

gets effectively executed as

[Rand3DElastic, Flip, Spacing, Rotate]

The default behaviour should be that no reordering occurs, because it breaks what the user has defined for their pipeline. Reordering must be opt in.

The PR will allow the original ordering, reordering, and inversion of reordering, for both cache and non-cache.

0 replies

wyli · 2023-04-30T10:30:35Z

wyli
Apr 30, 2023
Collaborator

there are plenty of users who won't be using the caching / persistent mechanisms

I don't think that's a valid/good assumption for monai now and future

0 replies

atbenmurray · 2023-04-30T10:31:41Z

atbenmurray
Apr 30, 2023
Collaborator Author

there are plenty of users who won't be using the caching / persistent mechanisms

I don't think that's a valid/good assumption for monai now and future

I only point that out because you mentioned that rather than addressing the core point; in the previous paragraph I explained that the behaviour is clearly defined for caching / non-caching

I've already shown in previous discussions, presentations and so forth that we can integrate caching into the same mechanism that does things like specifying out of order performance optimisations.

0 replies

wyli · 2023-04-30T11:35:08Z

wyli
Apr 30, 2023
Collaborator

ok, in the interests of time for releasing v1.2, just to reiterate my understanding:

1. to me the changes that you propose are not fundamentally different from the current dev branch, however, they should be well-documented with docstrings and tutorials, in order to be released in v1.2. (The core reviewers decide whether the proposed changes are 'well-documented' or not)
2. the current dev branch's lazy resampling has been successfully evaluated with the major use cases that we have, and we observed good speedup with equivalent model training quality. The new change should be able to preserve the same performance. (the core reviewers decide whether the proposed changes are 'preserving the performance')
3. because of points 1 and 2, we don't describe the current dev branch's API as a major issue or bug in the discussions/documentation. The proposed changes are usability enhancements for the current dev branch.

If you don't agree with any of the above, let's have an offline discussion with the wider team...

0 replies

atbenmurray · 2023-04-30T11:53:46Z

atbenmurray
Apr 30, 2023
Collaborator Author

I'd phrase as the following:

1. The PR default behaviour is no reordering, as this allows users' pipelines to work as intended in the general case
2. There is an option flag that moves lazy transforms after non-lazy ones, constrained by the next apply pending (this should duplicate dev behaviour so we can avoid having to re-benchmark everything)
3. The release notes / documentation describe the effect of applying the lazy_to_last flag and the circumstances under which it should be used

But yes, in essence, it should be fine

0 replies

wyli · 2023-05-01T06:10:42Z

wyli
May 1, 2023
Collaborator

Sure, that's good. Implementation-wise, do you plan to include all the code changes in PR #6257 and what's the timeline?

If needed I can help run some benchmarks and integration tests with all the testing environments we currently have.

0 replies

atbenmurray · 2023-05-01T06:38:59Z

atbenmurray
May 1, 2023
Collaborator Author

Yes, PR #6257 should come across in its entirety. It has the default lazy mode, the lazy_to_last flag mode, and the restored ability to run invert. I hope to get the commit done by afternoon / late afternoon today.

0 replies

atbenmurray · 2023-05-01T06:40:12Z

atbenmurray
May 1, 2023
Collaborator Author

I note that in the pipeline

[Flip, Spacing, Rand3DElastic, Rotate]

Rand3DElastic doesn't appear in the applied_operations list. Is this intentional?

If not, I can resolve that while I am at it.

0 replies

wyli · 2023-05-01T07:53:10Z

wyli
May 1, 2023
Collaborator

There's work in progress in #1793 for that, but it needs a major rework to use the latest API

0 replies

atbenmurray · 2023-05-01T08:08:14Z

atbenmurray
May 1, 2023
Collaborator Author

Ok, so it looks like at least some of the pipelines result in moving execution order around on a per key basis. As a result, it won't be possible to have inversion work for that mode without largely rewriting inverse.

Example:

# NX = Non-lazy transform X
# LY = Lazy transform Y
# AZ = ApplyPending Z
# **i = Transform only operates on key i
[NA, LB, LCx, ADx, NE, LF]

# NA
pending_x: [], applied_x: [NA]
pending_y: [], applied_y: [NA]

# LB
pending_x: [LB], applied_x: [NA]
pending_y: [LB], applied_y: [NA]

# LCx
pending_x: [LB, LCx], applied_x: [NA]
pending_y: [LB], applied_y: [NA]

# ADx
pending_x: [], applied_x: [NA, LB, LCx]
pending_y: [LB], applied_y: [NA]

# NE
pending_x: [], applied_x: [NA, LB, LCx, NE]
pending_y: [LB], applied_y: [NA, NE]

# LF
pending_x: [LF], applied_x: [NA, LB, LCx, NE]
pending_y: [LB, LF], applied_y: [NA, NE]

# final
applied_x: [NA, LB, LCx, NE, LF]  # executes LB before NE
applied_y: [NA, NE, LB, LF]  # executes NE before LB
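The per-key trace above can be checked mechanically with a small simulation. This sketch models only the bookkeeping (names in pending and applied lists), under the assumption that non-lazy transforms run immediately without flushing pending ones; `trace` and the tuple encoding of each step are illustrative, not MONAI code.

```python
def trace(pipeline, keys):
    """Track pending and applied transforms per key.

    Each step is (kind, name, step_keys): kind is 'N' (non-lazy), 'L' (lazy)
    or 'A' (ApplyPending); step_keys=None means the step touches every key.
    """
    pending = {k: [] for k in keys}
    applied = {k: [] for k in keys}
    for kind, name, step_keys in pipeline:
        for k in (step_keys or keys):
            if kind == "L":
                pending[k].append(name)  # deferred for this key
            elif kind == "A":
                applied[k].extend(pending[k])  # flush this key only
                pending[k].clear()
            else:  # non-lazy: runs immediately, pending list is left alone
                applied[k].append(name)
    for k in keys:  # end of pipeline: flush remaining pending transforms
        applied[k].extend(pending[k])
    return applied


pipeline = [
    ("N", "NA", None),
    ("L", "LB", None),
    ("L", "LCx", ["x"]),
    ("A", "ADx", ["x"]),
    ("N", "NE", None),
    ("L", "LF", None),
]
applied = trace(pipeline, ["x", "y"])
print(applied["x"])  # prints ['NA', 'LB', 'LCx', 'NE', 'LF']
print(applied["y"])  # prints ['NA', 'NE', 'LB', 'LF']
```

The two keys disagree on the relative order of LB and NE, which is the inconsistency that makes a single shared inverse sequence impossible.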

So I intend to do the following:

Have options=None preserve all lazy ordering
Have options={'reorder': 'lazy_to_end'} that allows for reordering, but done in a way that dictionary pipelines keep a consistent order for all keys. This can be inverted.
Have options={'reorder': 'lazy_to_end_no_invert'} that provides the dev version that breaks the assumption that keys stay in a consistent order. We'll raise a ValueError if you attempt to run inverse on a compose instance with this flag set.

Then for 1.3, I'll have to look at reimplementing the invert mechanism to work with inconsistent ordering on different dictionary keys. It might be that the full lazy implementation already handles this but I'll have to check.
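As a rough picture of how the three settings could be told apart, here is a hypothetical stand-in class. It is not MONAI's Compose: the class name `ReorderingPipeline`, the validation logic, and the placeholder `inverse` are all assumptions; only the option values and the ValueError on inverse come from the plan above.

```python
class ReorderingPipeline:
    """Hypothetical stand-in for a Compose-like object; not MONAI's class."""

    VALID_REORDER = (None, "lazy_to_end", "lazy_to_end_no_invert")

    def __init__(self, transforms, options=None):
        self.transforms = list(transforms)
        # options=None means: preserve the user's ordering.
        self.reorder = (options or {}).get("reorder")
        if self.reorder not in self.VALID_REORDER:
            raise ValueError(f"unknown reorder option: {self.reorder!r}")

    def inverse(self, data):
        if self.reorder == "lazy_to_end_no_invert":
            # Per-key ordering may be inconsistent, so inversion is refused.
            raise ValueError("inverse is not supported with 'lazy_to_end_no_invert'")
        # Placeholder: a real implementation would undo each transform in reverse.
        return data
```

Rejecting unknown option values at construction time is a design choice of this sketch, so that a typo in the flag fails early rather than silently falling back to the default ordering.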

0 replies

atbenmurray · 2023-05-02T10:03:13Z

atbenmurray
May 2, 2023
Collaborator Author

The commits for this are now on #6257

0 replies

atbenmurray · 2023-05-03T08:38:44Z

atbenmurray
May 3, 2023
Collaborator Author

@wyli Ok so there is still an issue.
At the moment, on dev, RandCropByPosNegLabel{d} requires that you use ApplyPending{d} on the labels before you call it, but then calls lazily, so it is added onto pending transforms.

Vanilla lazy mode requires that the user does not have to modify their pipeline in order to run lazy=True, so RandCropByPosNegLabel{d} and similar transforms don't execute lazily, forcing all pending transforms to be applied before it executes.

The only way that I can think to handle this is to have transforms such as RandCropByPosNegLabel{d} perform apply_pending on the label_key themselves before executing in lazy fashion.

0 replies

wyli · 2023-05-03T08:57:33Z

wyli
May 3, 2023
Collaborator

@wyli Ok so there is still an issue. At the moment, on dev, RandCropByPosNegLabel{d} requires that you use ApplyPending{d} on the labels before you call it, but then calls lazily, so it is added onto pending transforms.

Vanilla lazy mode requires that the user does not have to modify their pipeline in order to run lazy=True, so RandCropByPosNegLabel{d} and similar transforms don't execute lazily, forcing all pending transforms to be applied before it executes.

The only way that I can think to handle this is to have transforms such as RandCropByPosNegLabel{d} perform apply_pending on the label_key themselves before executing in lazy fashion.

all those cases are warned via this util:

MONAI/monai/transforms/utils.py

Lines 303 to 307 in 4bfc8e9

def check_non_lazy_pending_ops(
    input_array: NdarrayOrTensor, name: None | str = None, raise_error: bool = False
) -> None:
    """
    Check whether the input array has pending operations, raise an error or warn when it has.

although it's easy to change the warning into an automatic apply_pending, I don't think the cropping transforms should have the logic of calling apply_pending internally.

For me, warning people isn't enough. lazy with options=None should never break a pipeline.

I'm also not keen on the idea of cropping transforms calling apply_pending internally. It can't happen anyway because the existence of overrides makes that impossible.

Transforms that require it should be able to indicate that they do through the LazyTrait interface. I'm testing something out that solves the issue in the in_order pipeline
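One way to picture a transform declaring this need is a flag that the executor consults, so the transform itself never calls apply_pending. The sketch below is hypothetical: the classes `LazyStep` and `LabelCrop`, the flag name `requires_current_data`, and the `schedule` helper are assumptions for illustration and should not be read as the LazyTrait interface.

```python
class LazyStep:
    """Hypothetical lazy-capable transform; not MONAI's LazyTrait."""

    requires_current_data = False  # assumed flag name

    def __init__(self, name):
        self.name = name


class LabelCrop(LazyStep):
    """Needs up-to-date label data to pick crop centres."""

    requires_current_data = True


def schedule(steps):
    """Group step names into batches; each batch is one resampling pass."""
    batches, pending = [], []
    for step in steps:
        if step.requires_current_data and pending:
            batches.append(pending)  # executor flushes on the transform's behalf
            pending = []
        pending.append(step.name)  # the step itself is still queued lazily
    if pending:
        batches.append(pending)
    return batches


steps = [LazyStep("Spacing"), LazyStep("Flip"), LabelCrop("RandCrop"), LazyStep("Rotate")]
print(schedule(steps))
# prints [['Spacing', 'Flip'], ['RandCrop', 'Rotate']]
```

Here the crop still runs lazily together with Rotate, but only after Spacing and Flip have been applied, and the user's pipeline needs no explicit ApplyPending step.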

0 replies

wyli · 2023-05-03T09:27:33Z

wyli
May 3, 2023
Collaborator

the root cause is that the crop sampling locations are defined in the image space instead of the physical space. in the end the coordinates should be defined in the original physical space and they should carry the geometric information during the preprocessing then there's no ambiguity.

0 replies
