diff --git a/projects/README.md b/projects/README.md
index 1f54114b..6415ebec 100644
--- a/projects/README.md
+++ b/projects/README.md
@@ -18,4 +18,4 @@ Here are projects that are built on detrex which show you use detrex as a librar
 - [Align-DETR: Improving DETR with Simple IoU-aware BCE loss (ArXiv'2023)](./align_detr/)
 - [EVA-01: Exploring the Limits of Masked Visual Representation Learning at Scale (CVPR'2023 Highlight)](./dino_eva/)
 - [EVA-02: A Visual Representation for Neon Genesis (ArXiv'2023)](./dino_eva/)
-- [Less is More: Focus Attention for Efficient DETR](./focus_detr/)
+- [Less is More: Focus Attention for Efficient DETR (ICCV'2023)](./focus_detr/)
diff --git a/projects/focus_detr/README.md b/projects/focus_detr/README.md
index 0435e7de..0cf8fcde 100644
--- a/projects/focus_detr/README.md
+++ b/projects/focus_detr/README.md
@@ -3,14 +3,25 @@ This is the official implementation of the ICCV 2023 paper "Less is More: Focus
 
 Authors: Dehua Zheng, Wenhui Dong, Hailin Hu, Xinghao Chen, Yunhe Wang.
 
-[[`arXiv (coming soon)`]()] [[`Official Implementation`](https://github.com/huawei-noah/noah-research/tree/master/Focus-DETR)]
+[[`arXiv`](https://arxiv.org/abs/2307.12612)] [[`Official Implementation`](https://github.com/linxid/Focus-DETR)] [[`BibTeX`](#citing-focus-detr)]
+
+
+Focus-DETR is a model that focuses attention on more informative tokens for a better trade-off between computational efficiency and model accuracy. Compared with the state-of-the-art sparse transformer-based detector under the same setting, our Focus-DETR has comparable complexity while achieving 50.4 AP (+2.2) on COCO.
+
+
+## Model Architecture
+
+Our Focus-DETR comprises a backbone network, a Transformer encoder, and a Transformer decoder. We design a foreground token selector (FTS) based on top-down score modulations across multi-scale features. The tokens selected by a multi-category score predictor, together with the foreground tokens, then go through the Pyramid Encoder to remedy the limitation of deformable attention in distant information mixing.
+
+
 ## Table of Contents
 
 - [Focus-DETR](#focus-detr)
+- [Model Architecture](#model-architecture)
 - [Table of Contents](#table-of-contents)
 - [Main Results with Pretrained Models](#main-results-with-pretrained-models)
 - [Pretrained focus\_detr with ResNet Backbone](#pretrained-focus_detr-with-resnet-backbone)
@@ -18,6 +29,7 @@ Authors: Dehua Zheng, Wenhui Dong, Hailin Hu, Xinghao Chen, Yunhe Wang.
 - [Installation](#installation)
 - [Training](#training)
 - [Evaluation](#evaluation)
+- [Citation](#citing-focus-detr)
 
 ## Main Results with Pretrained Models
@@ -180,3 +192,37 @@ cd detrex
 python tools/train_net.py --config-file projects/focus_detr/configs/path/to/config.py --eval-only train.init_checkpoint=/path/to/model_checkpoint
 ```
 - Note that you should download the pretrained model from [Pretrained Weights](#main-results-with-pretrained-models) and `unzip` it to the specific folder then update the `train.init_checkpoint` to the path of pretrained weights.
+
+
+### Result
+
+```bash
+Results of Focus-DETR with ResNet-50 backbone:
+IoU metric: bbox
+ Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.479
+ Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.659
+ Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.521
+ Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.323
+ Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.505
+ Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.619
+ Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.372
+ Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.640
+ Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.720
+ Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.568
+ Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.757
+ Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.878
+```
+
+## Citing Focus-DETR
+If you find our work helpful for your research, please consider citing the following BibTeX entry.
+
+```bibtex
+@misc{zheng2023more,
+      title={Less is More: Focus Attention for Efficient DETR},
+      author={Dehua Zheng and Wenhui Dong and Hailin Hu and Xinghao Chen and Yunhe Wang},
+      year={2023},
+      eprint={2307.12612},
+      archivePrefix={arXiv},
+      primaryClass={cs.CV}
+}
+```
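
The Model Architecture section added by this patch describes selecting a subset of foreground tokens before the encoder. As a minimal sketch of that general idea only (not the official Focus-DETR code, which lives in the linked repository): score flattened multi-scale tokens with a small predictor and keep the top-scoring fraction. All names here (`score_tokens`, `select_foreground`, `keep_ratio`) are hypothetical.

```python
# Illustrative sketch of top-k foreground token selection, NOT the official
# Focus-DETR implementation. A linear scorer stands in for the paper's
# foreground token selector; only the kept tokens would reach the encoder.
import numpy as np

def score_tokens(tokens: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Toy foreground scorer: linear head followed by a sigmoid."""
    logits = tokens @ w                      # (num_tokens,)
    return 1.0 / (1.0 + np.exp(-logits))

def select_foreground(tokens: np.ndarray, scores: np.ndarray, keep_ratio: float):
    """Keep the top `keep_ratio` fraction of tokens by score."""
    k = max(1, int(len(tokens) * keep_ratio))
    idx = np.argsort(scores)[::-1][:k]       # indices of the k highest scores
    return tokens[idx], idx

rng = np.random.default_rng(0)
tokens = rng.normal(size=(100, 16))          # 100 flattened tokens, dim 16
w = rng.normal(size=16)
scores = score_tokens(tokens, w)
kept, idx = select_foreground(tokens, scores, keep_ratio=0.3)
print(kept.shape)                            # only a fraction of tokens survive
```

The point of the sketch is the computational trade-off the patch text describes: downstream attention cost scales with the number of kept tokens, so pruning low-score tokens reduces encoder complexity while the scorer tries to preserve the informative ones.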