Skip to content

Fast Cross-Domain Unsupervised Object detection through Online Style Transfer

License

Notifications You must be signed in to change notification settings

MazinOnsa/Cross-Domain-Unsupervised-Object-Detection

Repository files navigation

Unsupervised Domain Adaptation (UDA) for object detection applications

Pascal VOC to Clipart1k using SSD one shot detector

We present a framework for real-time Unsupervised Domain Adaptation (UDA) for object detection. We start from a fully supervised SSD (Single Shot MultiBox Detector) trained on a source domain (e.g., natural image) composed of instance-level annotated images and progressively adapt the detector using unsupervised images from a target domain (e.g., artwork). Our framework performs fine-tuning without previously translated samples, achieving a fast and versatile domain adaptation. We also improve the mean average precision (mAP) compared to other domain translation methods.

framework

Implementation

Task Choise Implementation
OD SSD lufficc
Style transfer AdaIN irasin
Source Domain natural PASCAL VOC
Target Domain artistic Clipart1k
.

Example SSD output (vgg_ssd300_voc0712).

Example AdaIN online translation (gg_ssd300_voc0712_variation).

SSD Installation

Requirements

  1. Python3
  2. PyTorch 1.0 or higher
  3. yacs
  4. Vizer
  5. GCC >= 4.9
  6. OpenCV

Step-by-step installation

git clone https://github.com/lufficc/SSD.git
cd SSD
pip install -r requirements.txt

Train

Setting Up Datasets

For Pascal VOC source dataset and Clipart1k target dataset, make the folder structure like this:

datasets
|__ VOC2007
    |_ JPEGImages
    |_ Annotations
    |_ ImageSets
    |_ SegmentationClass
|__ VOC2012
    |_ JPEGImages
    |_ Annotations
    |_ ImageSets
    |_ SegmentationClass
|__ ...
|
|__ clipart
    |_ JPEGImages
    |_ Annotations
    |_ ImageSets

Training with AdaIN online translation

See vgg_ssd300_voc0712_variationVx examples

You can find an example code on this Colab project

Single GPU training

# for example, train SSD300:
python train.py --config-file configs/vgg_ssd300_voc0712.yaml

Multi-GPU training

# for example, train SSD300 with 4 GPUs:
export NGPUS=4
python -m torch.distributed.launch --nproc_per_node=$NGPUS train.py --config-file configs/vgg_ssd300_voc0712.yaml SOLVER.WARMUP_FACTOR 0.03333 SOLVER.WARMUP_ITERS 1000

The configuration files that I provide assume that we are running on single GPU. When changing number of GPUs, hyper-parameter (lr, max_iter, ...) will also changed according to this paper: Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour.

Evaluate

Single GPU evaluating

# for example, evaluate SSD300:
python test.py --config-file configs/vgg_ssd300_voc0712.yaml

Multi-GPU evaluating

# for example, evaluate SSD300 with 4 GPUs:
export NGPUS=4
python -m torch.distributed.launch --nproc_per_node=$NGPUS test.py --config-file configs/vgg_ssd300_voc0712.yaml

Results summary

  • the proposed framework leverages the accuracy of the baseline FSD by approximately 10 to 12 percentage points in terms of mAP
  • In comparison against the best performing unsupervised domain mapping algorithms in the cross domain adaptive detection, our framework outperforms these algorithms by 4 to 5 percentage points
  • In addition to the best performances related to mAP, we stated a relevant reduction in terms of DT time.

The code is available at