Question about ablation studies in paper #3
Hi, thank you for your interest in our work.
Oh, that means we don't need to use … Thanks for your fast reply.
My pleasure.
Hi, thanks for your good work!
Hi, the implementation is tricky, and making it public would need a lot of work. I can tell you roughly how I achieved it and share some pieces of the code. The main idea is to use the training phase to get the ground-truth values.
Hope this helps, and good luck.
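[Editor's note: to make that idea concrete, here is a minimal sketch of one way such an upper-bound experiment can work. This is a reconstruction under assumptions, not the author's actual code, and all names are made up. It assumes a head whose raw outputs are passed through a sigmoid at test time, so a ground-truth score target t must be written back as the inverse sigmoid log(t / (1 - t)).]

import torch

def gt_targets_to_logits(gt_scores, eps=1e-6):
    # gt_scores: per-anchor target scores in [0, 1], e.g. the IoU between
    # the predicted box and the assigned ground-truth box for positives,
    # and 0 for negatives; these are available in the loss/training path.
    # Clamp away from 0 and 1 so the inverse sigmoid stays finite.
    t = gt_scores.clamp(eps, 1 - eps)
    # After a later sigmoid, these logits decode exactly back to t.
    return torch.log(t / (1 - t))

[Running the test set through the loss path, where ground truth is available, overwriting the predictions this way, and then decoding boxes as usual yields the upper-bound detections. The log here also explains the question about iou_logits and centerness further down.]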
Hi @hyz-xmaster, for the comparisons in Table 5, does VFL mean the loss in Equation 2, or the loss in Equation 2 + Star DConv + BBox refinement?
Hi @feiyuhuahuo, in Table 5, VFL means only the loss. In the last row, VFNet + VFL = VFL loss + Star DConv + BBox refinement, and similarly VFNet + FL = FL loss + Star DConv + BBox refinement.
@hyz-xmaster Hi, I'm confused about the comparison in Table 1 and Table 3. Is the first row …
Hi @youngwanLEE, raw VFL (39.0 AP) in Table 3 has the same structure as FCOS+ATSS without the centerness branch, and it is retrained. FCOS + ATSS w/o ctr (38.5 AP) in Table 1 means FCOS + ATSS is trained with the centerness branch, but the centerness scores are not used in inference. So the difference is whether the centerness branch is used in training.
@hyz-xmaster Hi, I wonder why you use the log function for iou_logits and centerness when replacing the values with the ground-truth targets in the example code.
Hi @yingyu13, …
Thanks! I got it.
How do I merge those per-image results and convert them into the format that the evaluation code requires? How do I align each image's results with the corresponding gt bboxes? I have successfully reimplemented stages 1-4, but I'm stuck at stage 5. Could you provide the corresponding code for stage 5?
Hi, this is the code for my experiment, which was done with mmdetection v1.1. You may write your own based on it. Hope this helps.
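[Editor's note: the attached code isn't reproduced in this transcript. As a rough sketch under assumptions, stage 5 (merging per-image results for evaluation) could look like the following; the paths are hypothetical, and the per-image '.pt' pickle layout follows the save_bboxes_batch snippet further down this thread.]

import os
import pickle
from pycocotools.coco import COCO

ann_file = 'data/coco/annotations/instances_val2017.json'  # hypothetical path
result_dir = 'retinanet_analysis'                          # hypothetical path

coco = COCO(ann_file)
img_ids = sorted(coco.getImgIds())  # keep the same order the evaluation code uses

results = []
for img_id in img_ids:
    # each image was saved as '<image_name>.pt', so recover the name from the annotations
    stem = os.path.splitext(coco.loadImgs(img_id)[0]['file_name'])[0]
    with open(os.path.join(result_dir, stem + '.pt'), 'rb') as f:
        results.append(pickle.load(f))  # one list of per-class arrays per image

# 'results' is now a list (over images) of lists (over classes) of Nx5 arrays
# [x1, y1, x2, y2, score], i.e. the nested-list format that mmdetection's
# COCO evaluation (e.g. results2json / coco_eval in v1.x) expects.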
Thanks!
Why is the result arranged by img_id (I mean img_ids = sorted(coco.getImgIds()))? Is it because the annotation info from dataset = build_dataset(cfg.data.test) is arranged by img_id?
To make the order of the results consistent with the order of the ground truth used by the evaluation code, I think you also need to add one similar line:
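[Editor's note: the exact line didn't survive in this transcript; as a hypothetical illustration, it would be something like the line from the question above, applied wherever the results are collected:]

img_ids = sorted(coco.getImgIds())  # hypothetical: collect results in this same order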
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
There is no MultiScaleFlipAug in train_pipeline. Did you change these configurations? These are my results:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.141
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.242
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.137
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.088
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.138
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.200
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.188
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.298
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.317
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.185
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.306
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.451
Raw AP(0.5:0.95) = 0.359
No, I didn't change these configurations. Given your results, I guess there might be something wrong in some step of your experiment.
Maybe. But I only made some small changes, like below, in anchor_head.py:
    def loss(self,
             cls_scores,
             bbox_preds,
             gt_bboxes,
             gt_labels,
             img_metas,
             gt_bboxes_ignore=None,
             **kwargs):
        self.save_bboxes_batch(cls_scores, bbox_preds, img_metas)
        ...

    # RetinaNet upper-bound experiment
    def save_bboxes_batch(self, cls_scores, bbox_preds, img_metas):
        import os
        import pickle
        from mmdet.core import bbox2result

        results_list = self.get_bboxes(cls_scores, bbox_preds, img_metas, rescale=True)
        # results for each class
        bbox_results = [
            bbox2result(det_bboxes, det_labels, self.num_classes)
            for det_bboxes, det_labels in results_list
        ]
        # save one pickle per image, named after the image file
        img_name = img_metas[0]['filename'].split('/')[-1][:-4]
        save_name = '/gpfs/home/sist/tqzouustc/code/mmdetection/retinanet_analysis_wo_change_noflip/' + img_name + '.pt'
        if not os.path.exists(os.path.dirname(save_name)):
            os.mkdir(os.path.dirname(save_name))
        with open(save_name, 'wb') as f:
            pickle.dump(bbox_results, f)
I don't change cls_scores and bbox_preds, and self.get_bboxes is the original function in anchor_head.py; I don't change anything in it. I also don't change the other parts of the loss() function. So what is wrong in my pipeline? I've been stuck here for several days, and it's really frustrating...
I can't figure out exactly what the wrong step is in your code, but I have a wild guess that you should not use the …
@Icecream-blue-sky, sorry, it suddenly came to my mind that you need to change this line: …
Thanks!
You are right! It seems the bug has been fixed! Thanks for your kind help!
Great to hear that. My pleasure.
Hi, thanks for your work and repo. I'm very interested in VFL, which combines classification scores and location scores in the targets, and I have some questions about it.
1. In Table 3 of the paper, the first row represents the results of the raw VFNet trained with the focal loss. What is raw VFNet? Is it FCOS+ATSS with the centerness branch removed?
2. Is the comparison made between applying VFL to FCOS+ATSS with the centerness branch removed and applying FL to FCOS+ATSS (with the centerness branch)?
Thank you very much!