Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong default value for conv1_t_stride in ResNet __init__ #7572

Open
k-sukharev opened this issue Mar 24, 2024 · 3 comments
Open

Wrong default value for conv1_t_stride in ResNet __init__ #7572

k-sukharev opened this issue Mar 24, 2024 · 3 comments
Labels
community community issue

Comments

@k-sukharev
Copy link
Contributor

k-sukharev commented Mar 24, 2024

During the implementation for ResNetEncoder for FlexibleUNet, I encountered a bug related to the default value for conv1_t_stride in ResNet.__init__. Upon investigation, it became evident that the default value should be 2 instead of 1 .

Also, stride for the first convolution is 2 in the MedicalNet repository. So I suppose all 3D pretrained ResNet models for classification at this moment work not as intended.

@k-sukharev
Copy link
Contributor Author

Is changing the default value for conv1_t_stride to 2 a breaking change? If not, I can make a pull request.

@KumoLiu
Copy link
Contributor

KumoLiu commented Apr 17, 2024

Hi @k-sukharev, thanks for reporting this one.
I believe we refer to the implementation here: https://github.com/kenshohara/3D-ResNets-PyTorch/blob/master/models/resnet.py#L110.

In the original ResNet paper, "Deep Residual Learning for Image Recognition", authors propose different model designs for images of different sizes. For smaller input images, like the 32x32 pixels images in the CIFAR10 dataset, the paper suggests setting the stride of the first convolutional layer (conv1) to 1. However, for larger input images, like the 224x224 pixels images in the ImageNet dataset, the stride of conv1 is set to 2 in the original design. This helps to reduce computational consumption, and ensures sufficient receptive field for larger images.
For smaller inputs, we might want to set the stride to 1 to preserve more spatial information, while for larger inputs, a stride of 2 reduces computational consumption and increases efficiency.
In addition, changing the default stride can indeed have an impact.

cc @Douwe-Spaanderman, as he is the original contributor to this one.
cc @ericspod @atbenmurray @Nic-Ma for vis.

@KumoLiu KumoLiu added the community community issue label Apr 17, 2024
@k-sukharev
Copy link
Contributor Author

@KumoLiu, thanks for the detailed answer.

I agree that conv1_t_stride can be 1 and that's ok.

But this does not change the fact that MedicalNet was trained with conv1_t_stride equal to 2, and monai uses these pre-trained weights for models with conv1_t_stride equal to 1 by default.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community community issue
Projects
None yet
Development

No branches or pull requests

2 participants