
Spacingd inverse transformation fills Grad-CAM tensor with all zero #7392

Closed

dongdongtong opened this issue Jan 16, 2024 · 1 comment

dongdongtong commented Jan 16, 2024

Describe the bug
I want to overlay a Grad-CAM tensor (obtained from MONAI's GradCAM++ class) on the original NIfTI image. To do so, I take the applied_operations recorded on the image by the preprocessing transforms and invert the Grad-CAM tensor through the same chain. Everything works except Spacingd.inverse, which fills the tensor with zeros.

To Reproduce
Steps to reproduce the behavior:

  1. Construct a set of MONAI transforms as follows:
import torch
import torch.nn.functional as F

import numpy as np

from monai import data, transforms
from monai.transforms.inverse import InvertibleTransform


class ZScoreNormd(transforms.transform.MapTransform, InvertibleTransform):
    def __init__(self, keys, allow_missing_keys: bool = False):
        super().__init__(keys, allow_missing_keys)
    
    def z_score_norm(self, array):
        # print("ZScoreNormd:", array.shape, array.min(), array.max())
        back_arr = array.clone()
        arr_low, arr_high = np.percentile(array, [0.5, 99.5])
        array = torch.clamp(array, arr_low, arr_high)
        arr_mean = np.mean(array[back_arr != 0])
        arr_std = np.std(array[back_arr != 0])
        
        array = (array - arr_mean) / arr_std
        array[back_arr == 0] = 0

        return array
    
    def __call__(self, data):
        d = dict(data)
        d[self.keys[0]] = self.z_score_norm(d[self.keys[0]])
        return d

    def inverse(self, data):
        d = dict(data)
        # print("ZScoreNormd inverse: ", d['image'].min(), d['image'].max())
        return d


class CropNonZerod(transforms.transform.MapTransform, InvertibleTransform):
    """This transform requires the data is of channel-first shape (channels, H, W, D)"""
    def __init__(self, keys, reader_type="nibabel", allow_missing_keys: bool = False):
        super().__init__(keys, allow_missing_keys)
        self.reader_type = reader_type
    
    def get_image_slicer_to_crop(self, nonzero_mask):
        outside_value = 0
        w, h, d = nonzero_mask.shape
        mask_voxel_coords = np.where(nonzero_mask != outside_value)
        minzidx = int(np.min(mask_voxel_coords[0]))
        maxzidx = int(np.max(mask_voxel_coords[0])) + 1
        minxidx = int(np.min(mask_voxel_coords[1]))
        maxxidx = int(np.max(mask_voxel_coords[1])) + 1
        minyidx = int(np.min(mask_voxel_coords[2]))
        maxyidx = int(np.max(mask_voxel_coords[2])) + 1
        bbox = [[minzidx, maxzidx], [minxidx, maxxidx], [minyidx, maxyidx]]
        self.inverse_pad_sequence = (bbox[0][0], w - bbox[0][1], bbox[1][0], h - bbox[1][1], bbox[2][0], d - bbox[2][1])
        # print('CropNonZerod get_image_slicer_to_crop bbox', bbox)
        # print('CropNonZerod get_image_slicer_to_crop self.inverse_pad_sequence', self.inverse_pad_sequence)

        resizer = (slice(0, 1), slice(bbox[0][0], bbox[0][1]), slice(bbox[1][0], bbox[1][1]), slice(bbox[2][0], bbox[2][1]))
        return resizer
    
    def crop_z_nibabel(self, img:data.MetaTensor, brainseg:data.MetaTensor = None):
        resizer = self.get_image_slicer_to_crop(img.numpy()[0])
        # print(z_end - z_start)

        if brainseg is not None:
            return img[resizer], brainseg[resizer]
        else:
            return img[resizer]
    
    def __call__(self, data):
        # assert len(self.keys) == 2, "keys must containing both img and brainseg"
        d = dict(data)
        if len(self.keys) == 2:
            d[self.keys[0]], d[self.keys[1]] = self.crop_z_nibabel(d[self.keys[0]], d[self.keys[1]])
        else:
            d[self.keys[0]] = self.crop_z_nibabel(d[self.keys[0]])
        return d
    
    def inverse(self, data):
        d = dict(data)
        for key in self.key_iterator(d):
            d[key] = F.pad(d[key], self.inverse_pad_sequence[::-1], mode="constant")
        return d

class AdjustHWRatioAndResized(transforms.transform.MapTransform, InvertibleTransform):
    """This transform requires the data is of channel-first shape (channels, H, W, D)"""
    def __init__(self, keys, img_size, allow_missing_keys: bool = False,):
        super().__init__(keys, allow_missing_keys)
        self.img_size = img_size

    def adjust_ratio_with_pad(self, array, brainseg=None):
        c, w, h, d = array.shape
        target_ratio = self.img_size[1] / self.img_size[0]
        # 1.173909472
        cur_ratio = h / w
        if cur_ratio > target_ratio:
            need_w = h / target_ratio
            need_h = h
        elif cur_ratio < target_ratio:
            need_h = w * target_ratio
            need_w = w
        else:
            need_h = h
            need_w = w
        
        self.padder = transforms.SpatialPad(spatial_size=(need_w, need_h, -1))

        padded_array = self.padder(array)

        if brainseg is not None:
            padded_brainseg = self.padder(brainseg)

            return padded_array, padded_brainseg
        else:
            return padded_array
    
    def __call__(self, data):
        d = dict(data)
        if len(self.keys) == 2:
            d[self.keys[0]], d[self.keys[1]] = self.adjust_ratio_with_pad(d[self.keys[0]], d[self.keys[1]])
        else:
            d[self.keys[0]] = self.adjust_ratio_with_pad(d[self.keys[0]])
                
        return d
    
    def inverse(self, data):
        d = dict(data)
        for key in self.key_iterator(d):
            # print("Before AdjustHWRatioAndResized inverse: ", d[key].min(), d[key].max())
            d[key] = self.padder.inverse(d[key])
            # print("After AdjustHWRatioAndResized inverse: ", d[key].min(), d[key].max())
        return d


img_size = (320, 352, 96)
need_keys = ['image']
valid_transforms = transforms.Compose([
        transforms.LoadImaged(keys=need_keys),
        transforms.EnsureChannelFirstd(keys=need_keys),
        transforms.Orientationd(keys=need_keys, axcodes="RAS"),
        CropNonZerod(keys=need_keys),
        transforms.Spacingd(keys=need_keys, pixdim=(0.532407402992249, 0.532407402992249, 1.25), mode=['bilinear']),
        ZScoreNormd(keys=need_keys),
        AdjustHWRatioAndResized(keys=need_keys, img_size=img_size),
        transforms.Resized(keys=need_keys, spatial_size=img_size, mode=['bilinear']), 
    ])


import os
import copy
import numpy as np
from PIL import Image
import matplotlib.cm as mpl_color_map


def plot_img_gradcam_overlay(img: np.ndarray, gradcam_img: np.ndarray, save_path: str, alpha=0.4, margin=1, nrow=8):
    # img shape: DHW, value range [0, 1]; gradcam_img shape: DHW, value range [0, 1)
    img = (img - img.min()) / (img.max() - img.min())
    gradcam_img = (gradcam_img - gradcam_img.min()) / (gradcam_img.max() - gradcam_img.min())
    assert img.min() >= 0 and img.max() <= 1, "value range of img must in [0, 1]"
    assert gradcam_img.min() >= 0 and gradcam_img.max() <= 1, "value range of gradcam img must in [0, 1]"
    d, h, w = img.shape
    # colorize the gradcam with a colormap; output is DHWC, C=3
    contour_map = mpl_color_map.get_cmap("jet")
    heatmap_rgb_arr = contour_map(gradcam_img)[..., :3] * 255
    
    # img with rgb channel
    img_rgb_arr = np.tile(img, (3, 1, 1, 1)).transpose((1, 2, 3, 0)) * 255
    
    # blend img and gradcam
    blended_img_arr = img_rgb_arr * (1 - alpha) + heatmap_rgb_arr * alpha
    # print(heatmap_rgb_arr.shape, img_rgb_arr.shape, blended_img_arr.shape)
    # os.makedirs(os.path.dirname(save_path), exist_ok=True)
    # Image.fromarray(np.concatenate(blended_img_arr, axis=1).astype(np.uint8)).save(save_path)
    # make the final matrix for visualization
    final_img_h = (nrow - 1) * margin + h * nrow
    ncol = np.ceil(d / nrow)
    final_img_w = (ncol - 1) * margin + w * ncol
    final_img_arr = np.zeros((int(final_img_h), int(final_img_w), 3))
    for idx, blend_slice in enumerate(blended_img_arr):
        # print(idx)
        this_row = idx // ncol
        this_col = idx % ncol
        
        real_coord_row = int(this_row * (h + margin))
        real_coord_col = int(this_col * (w + margin))
        
        final_img_arr[real_coord_row: real_coord_row + h, real_coord_col: real_coord_col + w, :] = \
            blend_slice
    
    os.makedirs(os.path.dirname(save_path), exist_ok=True)
    Image.fromarray(final_img_arr.astype(np.uint8)).save(save_path)


data_json = {"image": "./img.nii.gz"}
grad_cam = np.load("./grad_cam.npy")

tfed_dict = valid_transforms(data_json)

# plot the Grad-CAM overlay; permute the loaded image to (D, H, W)
plot_img_gradcam_overlay(
    tfed_dict['image'][0].permute(2, 1, 0).numpy(), 
    grad_cam.transpose(2, 1, 0), 
    save_path="./grad_cam.png"
)


# inverse transform: map grad_cam back to the original NIfTI space
from monai.data import MetaTensor
from monai.transforms.utils import allow_missing_keys_mode
# with torch.no_grad():
result_cam_metatensor = MetaTensor(grad_cam[None, ...], applied_operations=tfed_dict["image"].applied_operations)
# result_cam.applied_operations = batch["image"].applied_operations
cam_dict = {"image": result_cam_metatensor}
with allow_missing_keys_mode(valid_transforms):
    inverted_cam_dict = valid_transforms.inverse(cam_dict)

  2. Download files.zip and place the files at the paths used above.
  3. Run the code above (a sketch for inspecting the inverted result follows below).
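
To inspect the inverted result, one can dump the map back to NIfTI and overlay it on img.nii.gz in a viewer; a minimal sketch (not part of the original script), assuming inverted_cam_dict from the snippet above and numpy already imported as np:

import nibabel as nib

# The inverted MetaTensor is channel-first: drop the channel and save it with
# its affine so the map lines up with the original image in a NIfTI viewer.
cam = inverted_cam_dict["image"]
nib.save(
    nib.Nifti1Image(cam[0].numpy().astype(np.float32), cam.affine.numpy()),
    "./grad_cam_inverted.nii.gz",
)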

Expected behavior
The Grad-CAM tensor should be mapped back to the original NIfTI space at the correct location.

Screenshots
[two screenshots attached]

MONAI debug output

================================
Printing MONAI config...
================================
MONAI version: 1.3.0
Numpy version: 1.22.3
Pytorch version: 1.11.0
MONAI flags: HAS_EXT = False, USE_COMPILED = False, USE_META_DICT = False
MONAI rev id: 865972f7a791bf7b42efbcd87c8402bd865b329e
MONAI __file__: /home/<username>/anaconda3/envs/pt1.11/lib/python3.10/site-packages/monai/__init__.py

Optional dependencies:
Pytorch Ignite version: NOT INSTALLED or UNKNOWN VERSION.
ITK version: NOT INSTALLED or UNKNOWN VERSION.
Nibabel version: 3.2.2
scikit-image version: 0.19.2
scipy version: 1.7.3
Pillow version: 9.0.1
Tensorboard version: 2.9.0
gdown version: NOT INSTALLED or UNKNOWN VERSION.
TorchVision version: 0.12.0
tqdm version: 4.64.0
lmdb version: NOT INSTALLED or UNKNOWN VERSION.
psutil version: 5.9.1
pandas version: 1.4.2
einops version: 0.7.0
transformers version: NOT INSTALLED or UNKNOWN VERSION.
mlflow version: NOT INSTALLED or UNKNOWN VERSION.
pynrrd version: NOT INSTALLED or UNKNOWN VERSION.
clearml version: NOT INSTALLED or UNKNOWN VERSION.

For details about installing the optional dependencies, please visit:
    https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies


================================
Printing system config...
================================
System: Linux
Linux version: Ubuntu 20.04.3 LTS
Platform: Linux-5.8.0-53-generic-x86_64-with-glibc2.31
Processor: x86_64
Machine: x86_64
Python version: 3.10.4
Process name: python
Command: ['python', '-c', 'import monai; monai.config.print_debug_info()']
Open files: [popenfile(path='/home/dingsd/.vscode-server/data/logs/20240116T091054/remoteagent.log', fd=19, position=27874, mode='a', flags=33793), popenfile(path='/home/dingsd/.vscode-server/data/logs/20240116T091054/ptyhost.log', fd=20, position=8040, mode='a', flags=33793), popenfile(path='/home/dingsd/.vscode-server/data/logs/20240116T091054/network.log', fd=27, position=0, mode='a', flags=33793), popenfile(path='/home/dingsd/.vscode-server/bin/0ee08df0cf4527e40edc9aa28f4b5bd38bbff2b2/vscode-remote-lock.dingsd.0ee08df0cf4527e40edc9aa28f4b5bd38bbff2b2', fd=99, position=0, mode='w', flags=32769)]
Num physical CPUs: 12
Num logical CPUs: 24
Num usable CPUs: 24
CPU usage (%): [1.2, 0.0, 0.0, 2.4, 0.6, 0.0, 1.2, 1.2, 3.6, 0.0, 0.0, 0.0, 0.0, 0.0, 0.6, 0.0, 1.2, 1.8, 1.2, 0.0, 1.8, 0.6, 0.0, 100.0]
CPU freq. (MHz): 1108
Load avg. in last 1, 5, 15 mins (%): [1.0, 0.7, 0.5]
Disk usage (%): 32.5
Avg. sensor temp. (Celsius): UNKNOWN for given OS
Total physical memory (GB): 125.5
Available memory (GB): 119.5
Used memory (GB): 4.7

================================
Printing GPU config...
================================
Num GPUs: 1
Has CUDA: True
CUDA version: 11.3
cuDNN enabled: True
NVIDIA_TF32_OVERRIDE: None
TORCH_ALLOW_TF32_CUBLAS_OVERRIDE: None
cuDNN version: 8200
Current device: 0
Library compiled for CUDA architectures: ['sm_37', 'sm_50', 'sm_60', 'sm_61', 'sm_70', 'sm_75', 'sm_80', 'sm_86', 'compute_37']
GPU 0 Name: NVIDIA GeForce RTX 3090
GPU 0 Is integrated: False
GPU 0 Is multi GPU board: False
GPU 0 Multi processor count: 82
GPU 0 Total memory (GB): 23.7
GPU 0 CUDA capability (maj.min): 8.6

Environment

PyTorch: 1.11.0
Ubuntu: 20.04
MONAI: 1.3.0

Files can be found here: files.zip

@dongdongtong closed this as not planned on Jan 16, 2024
@dongdongtong reopened this on Jan 16, 2024
@dongdongtong changed the title from "Spacingd fills Grad-CAM tensor with all zero" to "Spacingd inverse transformation fills Grad-CAM tensor with all zero" on Jan 16, 2024
KumoLiu (Contributor) commented Jan 16, 2024

Hi @dongdongtong, thanks for your interest.
You should also set the affine on the Grad-CAM map, something like:

result_cam_metatensor = MetaTensor(grad_cam[None, ...], applied_operations=tfed_dict["image"].applied_operations, affine=tfed_dict['image'].affine)
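
Expanded into the full inversion from the reproduction above (a sketch; only the affine argument is new relative to the original snippet):

from monai.data import MetaTensor
from monai.transforms.utils import allow_missing_keys_mode

# Pass the transformed image's affine along with its applied_operations so
# Spacingd.inverse can compute the resampling back to the original spacing;
# without it the MetaTensor falls back to a default identity affine and the
# inverted output comes out all zeros, as reported above.
result_cam_metatensor = MetaTensor(
    grad_cam[None, ...],
    applied_operations=tfed_dict["image"].applied_operations,
    affine=tfed_dict["image"].affine,
)
cam_dict = {"image": result_cam_metatensor}
with allow_missing_keys_mode(valid_transforms):
    inverted_cam_dict = valid_transforms.inverse(cam_dict)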

Thanks.

Converting this to a discussion; feel free to open another issue if you run into any problems.

@Project-MONAI locked and limited conversation to collaborators on Jan 16, 2024
@KumoLiu converted this issue into discussion #7393 on Jan 16, 2024
