Extend/correct behaviour of RandAffined to allow for different input shapes #4491

Closed
josegcpa opened this issue Jun 13, 2022 · 3 comments

Comments

@josegcpa

Hello!

Is your feature request related to a problem? Please describe.
When working with inputs of different shapes (e.g. predicting a downsampled segmentation map or an upsampled, high-resolution image), random affine transforms can be quite handy. However, the current implementation of MONAI's RandAffined does not let users apply these transforms to differently sized inputs, which often leads to shape mismatches, because MONAI assumes that the shape of every input corresponds to the shape of the first key's input.

Describe the solution you'd like
Ideally, I would like a random affine augmentation that can handle such cases, i.e. differently shaped inputs are rotated, scaled, translated and sheared in the same fashion, without altering the output shape of any of the input tensors.

Describe alternatives you've considered
The easiest alternative would be to reimplement this class, which is what I am presently doing. The main issue is the translation, which requires some additional thought: one cannot apply the same 10-voxel translation to an image and to its downsampled version (a 10-voxel shift along a 128-voxel axis corresponds to a 5-voxel shift along the 64-voxel axis of a 2x-downsampled version). Other than that, and assuming that rotations use the center of the image as the center of rotation, I believe this works relatively well (I may be wrong). The best part is that it can easily be built on top of monai.transforms.Affine, and the necessary spatial sizes are simply provided in the same order as the keys. The example below does this.

from typing import List, Tuple, Union

import numpy as np
import monai

class RandomAffined(monai.transforms.RandomizableTransform):
    def __init__(
        self,
        keys: List[str],
        spatial_sizes: List[Union[Tuple[int, int, int], Tuple[int, int]]],
        mode: List[str],
        prob: float = 0.1,
        rotate_range: Union[Tuple[int, int, int], Tuple[int, int]] = (0, 0, 0),
        shear_range: Union[Tuple[int, int, int], Tuple[int, int]] = (0, 0, 0),
        translate_range: Union[Tuple[int, int, int], Tuple[int, int]] = (0, 0, 0),
        scale_range: Union[Tuple[int, int, int], Tuple[int, int]] = (0, 0, 0),
        device: str = "cpu"):

        self.keys = keys
        self.spatial_sizes = [np.array(s, dtype=np.int32) for s in spatial_sizes]
        self.mode = mode
        self.prob = prob
        self.rotate_range = np.array(rotate_range)
        self.shear_range = np.array(shear_range)
        self.translate_range = np.array(translate_range)
        self.scale_range = np.array(scale_range)
        self.device = device

        # one Affine transform per key, each with its own output spatial size
        self.affine_trans = {
            k: monai.transforms.Affine(
                spatial_size=s,
                mode=m,
                device=self.device)
            for k, s, m in zip(self.keys, self.spatial_sizes, self.mode)}

        self.get_translation_adjustment()

    def get_random_parameters(self):
        # a single set of parameters is drawn and shared by all keys
        angle = self.R.uniform(
            -self.rotate_range, self.rotate_range)
        shear = self.R.uniform(
            -self.shear_range, self.shear_range)
        trans = self.R.uniform(
            -self.translate_range, self.translate_range)
        scale = self.R.uniform(
            1 - self.scale_range, 1 + self.scale_range)

        return angle, shear, trans, scale

    def get_translation_adjustment(self):
        # the translation has to be rescaled so that all inputs stay aligned;
        # the first key is taken as the reference
        ref_size = self.spatial_sizes[0]
        self.trans_adj = {
            k: s / ref_size
            for k, s in zip(self.keys, self.spatial_sizes)}

    def randomize(self):
        angle, shear, trans, scale = self.get_random_parameters()
        for k in self.affine_trans:
            # we only need to update the affine grid; the translation is
            # rescaled per key, all other parameters are shared as-is
            self.affine_trans[k].affine_grid = monai.transforms.AffineGrid(
                rotate_params=list(angle),
                shear_params=list(shear),
                translate_params=list(np.float32(trans * self.trans_adj[k])),
                scale_params=list(scale),
                device=self.device)

    def __call__(self, data):
        self.randomize()
        # the probability is drawn once for all keys so that they stay aligned
        if self.R.uniform() < self.prob:
            for k in self.keys:
                transform = self.affine_trans[k]
                data[k], data[k + "_affine"] = transform(data[k])
        return data

Additional context
This is mostly it: a random affine augmentation that can handle inputs of different sizes. Below is a reproducible example using the class above, together with the (in my opinion unexpected) output of MONAI's RandAffined and the output of the class above.


import time

import numpy as np
import torch
import monai

# RandomAffined is the class defined above

input_shape = [1,128,128,16]
input_shape_ds = [1,64,64,8]

data = {
    "a":torch.rand(input_shape),
    "b":torch.rand(input_shape_ds)}

t = monai.transforms.RandAffined(
    keys=["a","b"],
    mode=["bilinear","nearest"],
    prob=1.0,
    rotate_range=[np.pi/6,np.pi/6,np.pi/6],
    shear_range=[0,0,0],
    translate_range=[10,10,3],
    scale_range=[0.1,0.1,0.1])

t_1 = time.time()
o = t(data)
t_2 = time.time()

print("Result from MONAI:",o['a'].shape,o['b'].shape)
print("\tTime elapsed:",t_2-t_1)

t = RandomAffined(
    keys=["a","b"],
    spatial_sizes=[input_shape[1:],input_shape_ds[1:]],
    mode=["bilinear","nearest"],
    prob=1.0,
    rotate_range=[np.pi/6,np.pi/6,np.pi/6],
    shear_range=[0,0,0],
    translate_range=[10,10,3],
    scale_range=[0.1,0.1,0.1])

t_1 = time.time()
o = t(data)
t_2 = time.time()

print("Result from own implementaion",o['a'].shape,o['b'].shape)
print("\tTime elapsed:",t_2-t_1)

Output:

Result from MONAI: torch.Size([1, 128, 128, 16]) torch.Size([1, 128, 128, 16])
        Time elapsed: 0.0467381477355957
Result from own implementation: torch.Size([1, 128, 128, 16]) torch.Size([1, 64, 64, 8])
        Time elapsed: 0.02890753746032715

I understand that MONAI has specific coding conventions that I did not follow here (which is why I did not submit a pull request), but I hope this is clear enough.

@wyli
Contributor

wyli commented Jun 15, 2022

Thanks for the feature request. The root cause is that the current rotate_range and translate_range are defined in terms of image coordinates, whereas in this use case the assumption is that images at different scales share the same world coordinate system. RandAffine should have an option to interpret rotate_range and translate_range with respect to the world coordinates, so that, using the voxel-to-world transform (provided as image meta info), we can transform the voxels in a consistent manner. We are in the process of releasing the MetaTensor API, which tracks the voxel-to-world transform and will include this feature.
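
As an illustration of that idea (this is not the actual MetaTensor API; the function name and the affines below are illustrative), a translation specified in world units such as millimetres can be converted into per-image voxel units by dividing by each image's voxel spacing, so every key receives the same physical shift:

import numpy as np

def world_to_voxel_translation(translation_mm, affine):
    # voxel spacing along each axis = column norms of the 3x3 block of the
    # voxel-to-world affine; dividing a physical shift by the spacing gives
    # the equivalent shift in voxels for that particular image
    spacing = np.sqrt((np.asarray(affine)[:3, :3] ** 2).sum(axis=0))
    return np.asarray(translation_mm) / spacing

# a (10, 10, 4) mm shift applied to a full-resolution image (1x1x2 mm spacing)
# and to its 2x-downsampled counterpart (2x2x4 mm spacing)
affine_full = np.diag([1.0, 1.0, 2.0, 1.0])
affine_down = np.diag([2.0, 2.0, 4.0, 1.0])
print(world_to_voxel_translation([10, 10, 4], affine_full))  # [10. 10.  2.]
print(world_to_voxel_translation([10, 10, 4], affine_down))  # [5. 5. 1.]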

@josegcpa
Author

@wyli yes, that is correct, there is a strong underlying assumption behind this application: that the images share identical world coordinates (I must stress that this is also assumed when the images are of equal size). This can be useful when the user is absolutely certain that both images are co-registered or, as in my use case, when the smaller image simply corresponds to a downsampled segmentation map of the input. It can also be useful when an algorithm is being trained for super-resolution applications.

I agree that the more generic use case would involve using metadata to interpret world coordinates and ensure that transforms are consistent between images, but I would not consider this a necessity: in some cases there is prior knowledge that the world coordinates are identical (even if the images are under/upsampled). I am looking forward to the MetaTensor API; it does seem to greatly simplify how transforms currently handle this.
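
A quick sanity check of that prior knowledge, sketched here with illustrative spacings and a hypothetical helper, is to compare the physical field of view of the two arrays (voxel count times voxel size along each axis):

import numpy as np

def same_field_of_view(shape_a, spacing_a, shape_b, spacing_b, tol=1e-3):
    # co-registered images that only differ by down/upsampling cover the
    # same physical extent: voxel count * voxel size matches per axis
    extent_a = np.asarray(shape_a) * np.asarray(spacing_a)
    extent_b = np.asarray(shape_b) * np.asarray(spacing_b)
    return bool(np.allclose(extent_a, extent_b, atol=tol))

# 128x128x16 at 1x1x2 mm vs. its downsampled 64x64x8 map at 2x2x4 mm
print(same_field_of_view((128, 128, 16), (1, 1, 2), (64, 64, 8), (2, 2, 4)))  # True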

@vikashg

vikashg commented Jan 5, 2024

Closing because of inactivity.

vikashg closed this as completed Jan 5, 2024