I'm trying to use a backbone that is not from the YOLO family, and I'm finding it difficult to modify the config, specifically the embedding channels and head numbers of the neck and head, e.g. `embed_channels`, `in_channels`, and `out_channels` in the neck. Is there a rationale for how these parameters are determined?
For example, I am using the config file `yolo_world_v2_x_vlpan_bn_2e-3_100e_4x8gpus_obj365v1_goldg_train_lvis_minival.py`.
The feature maps from the other backbone have sizes `[torch.Size([16, 96, 160, 160]), torch.Size([16, 192, 80, 80]), torch.Size([16, 384, 40, 40]), torch.Size([16, 768, 20, 20])]`, while the features from yolov8 (the original code) have sizes `[torch.Size([16, 320, 80, 80]), torch.Size([16, 640, 40, 40]), torch.Size([16, 640, 20, 20])]`, where 16 is the batch size in both cases and the image scale is [680,680]. The current settings in the config are `embed_channels=[128, 256, 256]`, `in_channels=[256, 512, 512]`, `out_channels=[256, 512, 512]`, and `num_heads=[4, 8, 8]`.
How should I modify these parameters, and on what basis are they determined?
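To make the numbers above easier to compare, here is a minimal sketch of the arithmetic that appears to link the config values to the observed yolov8 shapes. It rests on several assumptions that the config file does not confirm on its own: that the x-size model uses a width multiplier of 1.25, that channel lists are scaled with an MMYOLO-style `make_divisible` rule, and that `embed_channels` and `num_heads` are scaled with a `make_round` rule. Both helpers are re-implemented locally rather than imported. `suggest_neck_settings` is a hypothetical helper, not part of YOLO-World; it simply copies the ratios seen in the original config (embedding width equal to half of `in_channels`, 32 channels per head).

```python
import math


def make_divisible(channels, widen_factor=1.0, divisor=8):
    """Scale a channel count, then round up to a multiple of `divisor`."""
    return math.ceil(channels * widen_factor / divisor) * divisor


def make_round(value, factor=1.0):
    """Scale an integer setting (e.g. a head count) and keep it at least 1."""
    return max(round(value * factor), 1) if value > 1 else value


# Assumption: width multiplier of the x-size model (not stated in the question).
WIDEN_FACTOR = 1.25

# Values copied from the question.
config = dict(
    in_channels=[256, 512, 512],
    out_channels=[256, 512, 512],
    embed_channels=[128, 256, 256],
    num_heads=[4, 8, 8],
)

# Effective sizes after scaling; the in_channels result should match the
# channel dimension of the yolov8 feature maps listed in the question.
effective_in = [make_divisible(c, WIDEN_FACTOR) for c in config["in_channels"]]
effective_embed = [make_round(c, WIDEN_FACTOR) for c in config["embed_channels"]]
effective_heads = [make_round(h, WIDEN_FACTOR) for h in config["num_heads"]]


def suggest_neck_settings(stage_channels, num_levels=3, channels_per_head=32):
    """Hypothetical helper: derive neck settings from backbone stage channels.

    Keeps the last `num_levels` stages and reuses the ratios of the original
    config. Assumes the config is then used with widen_factor=1.0.
    """
    in_channels = list(stage_channels[-num_levels:])
    embed_channels = [c // 2 for c in in_channels]
    num_heads = [max(e // channels_per_head, 1) for e in embed_channels]
    return dict(
        in_channels=in_channels,
        out_channels=list(in_channels),
        embed_channels=embed_channels,
        num_heads=num_heads,
    )


# Channel dimensions of the four feature maps from the other backbone.
other_backbone_channels = [96, 192, 384, 768]
suggested = suggest_neck_settings(other_backbone_channels)

print(effective_in)
print(suggested)
```

Under these assumptions, the three deepest stages of the other backbone (80x80, 40x40, 20x20) line up spatially with the yolov8 outputs, so only the channel lists would need to change. Whether a width multiplier of 1.0 and the 32-channels-per-head ratio are the right choices for a non-YOLO backbone is a guess that should be checked against the neck implementation.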