The access code for the Baidu links is swin.

ImageNet-1K and ImageNet-22K Pretrained Swin-V1 Models

| name | pretrain | resolution | acc@1 | acc@5 | #params | FLOPs | FPS | 22K model | 1K model |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Swin-T | ImageNet-1K | 224x224 | 81.2 | 95.5 | 28M | 4.5G | 755 | - | github/baidu/config/log |
| Swin-S | ImageNet-1K | 224x224 | 83.2 | 96.2 | 50M | 8.7G | 437 | - | github/baidu/config/log |
| Swin-B | ImageNet-1K | 224x224 | 83.5 | 96.5 | 88M | 15.4G | 278 | - | github/baidu/config/log |
| Swin-B | ImageNet-1K | 384x384 | 84.5 | 97.0 | 88M | 47.1G | 85 | - | github/baidu/config |
| Swin-T | ImageNet-22K | 224x224 | 80.9 | 96.0 | 28M | 4.5G | 755 | github/baidu/config | github/baidu/config |
| Swin-S | ImageNet-22K | 224x224 | 83.2 | 97.0 | 50M | 8.7G | 437 | github/baidu/config | github/baidu/config |
| Swin-B | ImageNet-22K | 224x224 | 85.2 | 97.5 | 88M | 15.4G | 278 | github/baidu/config | github/baidu/config |
| Swin-B | ImageNet-22K | 384x384 | 86.4 | 98.0 | 88M | 47.1G | 85 | github/baidu | github/baidu/config |
| Swin-L | ImageNet-22K | 224x224 | 86.3 | 97.9 | 197M | 34.5G | 141 | github/baidu/config | github/baidu/config |
| Swin-L | ImageNet-22K | 384x384 | 87.3 | 98.2 | 197M | 103.9G | 42 | github/baidu | github/baidu/config |

ImageNet-1K and ImageNet-22K Pretrained Swin-V2 Models

| name | pretrain | resolution | window | acc@1 | acc@5 | #params | FLOPs | FPS | 22K model | 1K model |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SwinV2-T | ImageNet-1K | 256x256 | 8x8 | 81.8 | 95.9 | 28M | 5.9G | 572 | - | github/baidu/config |
| SwinV2-S | ImageNet-1K | 256x256 | 8x8 | 83.7 | 96.6 | 50M | 11.5G | 327 | - | github/baidu/config |
| SwinV2-B | ImageNet-1K | 256x256 | 8x8 | 84.2 | 96.9 | 88M | 20.3G | 217 | - | github/baidu/config |
| SwinV2-T | ImageNet-1K | 256x256 | 16x16 | 82.8 | 96.2 | 28M | 6.6G | 437 | - | github/baidu/config |
| SwinV2-S | ImageNet-1K | 256x256 | 16x16 | 84.1 | 96.8 | 50M | 12.6G | 257 | - | github/baidu/config |
| SwinV2-B | ImageNet-1K | 256x256 | 16x16 | 84.6 | 97.0 | 88M | 21.8G | 174 | - | github/baidu/config |
| SwinV2-B* | ImageNet-22K | 256x256 | 16x16 | 86.2 | 97.9 | 88M | 21.8G | 174 | github/baidu/config | github/baidu/config |
| SwinV2-B* | ImageNet-22K | 384x384 | 24x24 | 87.1 | 98.2 | 88M | 54.7G | 57 | github/baidu/config | github/baidu/config |
| SwinV2-L* | ImageNet-22K | 256x256 | 16x16 | 86.9 | 98.0 | 197M | 47.5G | 95 | github/baidu/config | github/baidu/config |
| SwinV2-L* | ImageNet-22K | 384x384 | 24x24 | 87.6 | 98.3 | 197M | 115.4G | 33 | github/baidu/config | github/baidu/config |

Note:

  • The SwinV2-B* (SwinV2-L*) models with input resolutions of 256x256 and 384x384 are both fine-tuned from the same pre-trained model, which uses a smaller input resolution of 192x192.
  • SwinV2-B* (384x384) achieves 78.08 acc@1 on ImageNet-1K-V2 while SwinV2-L* (384x384) achieves 78.31.

ImageNet-1K Pretrained Swin MLP Models

| name | pretrain | resolution | acc@1 | acc@5 | #params | FLOPs | FPS | 1K model |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mixer-B/16 | ImageNet-1K | 224x224 | 76.4 | - | 59M | 12.7G | - | official repo |
| ResMLP-S24 | ImageNet-1K | 224x224 | 79.4 | - | 30M | 6.0G | 715 | timm |
| ResMLP-B24 | ImageNet-1K | 224x224 | 81.0 | - | 116M | 23.0G | 231 | timm |
| Swin-T/C24 | ImageNet-1K | 256x256 | 81.6 | 95.7 | 28M | 5.9G | 563 | github/baidu/config |
| SwinMLP-T/C24 | ImageNet-1K | 256x256 | 79.4 | 94.6 | 20M | 4.0G | 807 | github/baidu/config |
| SwinMLP-T/C12 | ImageNet-1K | 256x256 | 79.6 | 94.7 | 21M | 4.0G | 792 | github/baidu/config |
| SwinMLP-T/C6 | ImageNet-1K | 256x256 | 79.7 | 94.9 | 23M | 4.0G | 766 | github/baidu/config |
| SwinMLP-B | ImageNet-1K | 224x224 | 81.3 | 95.3 | 61M | 10.4G | 409 | github/baidu/config |

Note: C24 means each head has 24 channels.
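
As a rough illustration of the C24 / C12 / C6 naming, the sketch below derives the number of heads per stage from the channels-per-head value. It assumes a tiny-style layout with a base embedding dimension of 96 that doubles at each of 4 stages; those values and the helper name `heads_per_stage` are assumptions for illustration, so check the linked config files for the actual settings.

```python
def heads_per_stage(embed_dim, channels_per_head, num_stages=4):
    """Return the head count for each stage, given a fixed number of channels per head.

    Assumes (hypothetically) that the channel dimension doubles at every stage.
    """
    heads = []
    for stage in range(num_stages):
        dim = embed_dim * 2 ** stage  # assumed channel width of this stage
        if dim % channels_per_head != 0:
            raise ValueError(
                f"stage {stage}: dim {dim} is not divisible by {channels_per_head}"
            )
        heads.append(dim // channels_per_head)
    return heads


# Fewer channels per head means more heads at the same width.
for c in (24, 12, 6):
    print(f"C{c}:", heads_per_stage(96, c))
```

Under these assumptions, moving from C24 to C6 quadruples the number of heads in every stage while the stage widths stay the same.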

ImageNet-22K Pretrained Swin-MoE Models

| name | #experts | k | router | resolution | window | IN-22K acc@1 | IN-1K/ft acc@1 | IN-1K/5-shot acc@1 | 22K model |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Swin-MoE-S | 1 (dense) | - | - | 192x192 | 8x8 | 35.5 | 83.5 | 70.3 | github/baidu/config |
| Swin-MoE-S | 8 | 1 | Linear | 192x192 | 8x8 | 36.8 | 84.5 | 75.2 | github/baidu/config |
| Swin-MoE-S | 16 | 1 | Linear | 192x192 | 8x8 | 37.6 | 84.9 | 76.5 | github/baidu/config |
| Swin-MoE-S | 32 | 1 | Linear | 192x192 | 8x8 | 37.4 | 84.7 | 75.9 | github/baidu/config |
| Swin-MoE-S | 32 | 1 | Cosine | 192x192 | 8x8 | 37.2 | 84.3 | 75.2 | github/baidu/config |
| Swin-MoE-S | 64 | 1 | Linear | 192x192 | 8x8 | 37.8 | 84.7 | 75.7 | - |
| Swin-MoE-S | 128 | 1 | Linear | 192x192 | 8x8 | 37.4 | 84.5 | 75.4 | - |
| Swin-MoE-B | 1 (dense) | - | - | 192x192 | 8x8 | 37.3 | 85.1 | 75.9 | config |
| Swin-MoE-B | 8 | 1 | Linear | 192x192 | 8x8 | 38.1 | 85.3 | 77.2 | config |
| Swin-MoE-B | 16 | 1 | Linear | 192x192 | 8x8 | 38.7 | 85.5 | 78.2 | config |
| Swin-MoE-B | 32 | 1 | Linear | 192x192 | 8x8 | 38.6 | 85.5 | 77.9 | config |
| Swin-MoE-B | 32 | 1 | Cosine | 192x192 | 8x8 | 38.5 | 85.3 | 77.3 | config |
| Swin-MoE-B | 32 | 2 | Linear | 192x192 | 8x8 | 38.6 | 85.5 | 78.7 | - |