While implementing ResNetEncoder for FlexibleUNet, I ran into a bug related to the default value of `conv1_t_stride` in `ResNet.__init__`. On investigation, it became clear that the default value should be 2 instead of 1.
Also, the stride of the first convolution is 2 in the MedicalNet repository. So I suspect that all 3D pretrained ResNet models for classification currently do not work as intended.
In the original ResNet paper, "Deep Residual Learning for Image Recognition", the authors propose different model designs for different input sizes. For small inputs, such as the 32x32-pixel images in the CIFAR10 dataset, the first convolutional layer (conv1) uses a stride of 1. For larger inputs, such as the 224x224-pixel images in the ImageNet dataset, conv1 uses a stride of 2 in the original design. This reduces computational cost and ensures a sufficient receptive field for larger images.
So for smaller inputs we might want a stride of 1 to preserve more spatial information, while for larger inputs a stride of 2 is the more efficient choice.
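To make the trade-off concrete, here is a rough sketch of the standard convolution output-size formula applied to both strides. The kernel size of 7 and padding of 3 are assumptions chosen for illustration (a typical ImageNet-style first layer), not values taken from this thread, and `conv_out_size` is a hypothetical helper.

```python
def conv_out_size(n, kernel=7, stride=1, padding=3):
    """Output length along one axis for a standard convolution (no dilation).

    kernel=7 and padding=3 are illustrative assumptions, not confirmed defaults.
    """
    return (n + 2 * padding - kernel) // stride + 1


for n in (32, 224):
    for s in (1, 2):
        print(f"input={n:3d} stride={s} -> output={conv_out_size(n, stride=s)}")
```

With stride 2 each spatial axis is roughly halved, so in 3D the first feature map has about 8 times fewer voxels than with stride 1.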
In addition, changing the default stride would affect existing code that relies on the current default.
I agree that `conv1_t_stride` can be 1, and that's fine.
But this does not change the fact that MedicalNet was trained with `conv1_t_stride` equal to 2, while MONAI by default loads these pretrained weights into models with `conv1_t_stride` equal to 1.
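One reason this kind of mismatch can go unnoticed is that stride is a hyperparameter and not part of the weight tensor, so the same kernel fits a layer with either stride without a shape error. The toy 1D example below is only a sketch of that idea in plain Python; `conv1d`, the kernel, and the signal are all hypothetical and unrelated to the actual MedicalNet weights.

```python
def conv1d(signal, kernel, stride):
    """Valid (unpadded) 1D cross-correlation with the given stride."""
    k = len(kernel)
    return [
        sum(w * x for w, x in zip(kernel, signal[i:i + k]))
        for i in range(0, len(signal) - k + 1, stride)
    ]


kernel = [1, 0, -1]                      # stand-in for "pretrained" weights
signal = [0, 1, 4, 9, 16, 25, 36, 49]    # stand-in for one input axis

out_s1 = conv1d(signal, kernel, stride=1)  # stride the model is built with
out_s2 = conv1d(signal, kernel, stride=2)  # stride the weights were trained with

print(len(out_s1), out_s1)
print(len(out_s2), out_s2)
```

The same kernel runs under both strides, but the feature maps differ in resolution, so every later layer sees inputs at a different scale than during training.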