Differencing based Self-supervised pretraining for scene change detection (DSP)

This is the official code for the CoLLAs 2022 paper "Differencing based Self-supervised pretraining for scene change detection" by Vijaya Raghavan Thiruvengadathan Ramkumar, Elahe Arani, and Bahram Zonooz. In it, we propose DSP, a novel self-supervised pretraining architecture based on differencing, for scene change detection.

Abstract

Scene change detection (SCD), a crucial perception task, identifies changes by comparing scenes captured at different times. SCD is challenging due to noisy changes in illumination, seasonal variations, and perspective differences across a pair of views. Deep neural network based solutions require a large quantity of annotated data, which is tedious and expensive to obtain. On the other hand, transfer learning from large datasets induces domain shift. To address these challenges, we propose a novel Differencing self-supervised pretraining (DSP) method that uses feature differencing to learn discriminatory representations corresponding to the changed regions while simultaneously tackling the noisy changes by enforcing temporal invariance across views. Our experimental results on SCD datasets demonstrate the effectiveness of our method, particularly under differences in camera viewpoints and lighting conditions. DSP surpasses both the self-supervised Barlow Twins and the standard ImageNet pretraining, which uses more than a million additional labeled images, without using any additional data. Our results also demonstrate the robustness of DSP to natural corruptions, distribution shift, and learning under limited labeled data.

For more details, please see the Paper and Presentation.

Requirements

  • python 3.6+
  • opencv 3.4.2+
  • pytorch 1.6.0
  • torchvision 0.4.0+
  • tqdm 4.51.0
  • tensorboardX 2.1

Datasets

Our network is tested on two street-view scene change detection datasets: VL-CMU-CD and PCD.

Dataset Preprocessing

Dataset folder structure for VL-CMU-CD:

├── VL-CMU-CD
│   ├── Image_T0
│   ├── Image_T1
│   ├── Ground Truth
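
A quick check of this layout before training can catch a misplaced folder early. The following is only a minimal sketch: the sub-folder names come from the tree above, while the helper name `missing_subdirs` is hypothetical and not part of this repository.

```python
from pathlib import Path

# Sub-folder names copied from the tree above; the helper itself is hypothetical.
EXPECTED_SUBDIRS = ("Image_T0", "Image_T1", "Ground Truth")


def missing_subdirs(root):
    """Return the expected sub-folders that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]
```

Calling `missing_subdirs("/path/to/VL-CMU-CD")` returns an empty list when all three folders are present, and otherwise lists the ones that still need to be created or moved.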

SSL Training

  • To train DSP on the VL-CMU-CD dataset:
python3 DSP/train.py --ssl_batchsize 16 --ssl_epochs 500 --save_dir /outputs --data_dir /path/to/VL-CMU-CD --img_size 256 --n_proj 256 --hidden_layer 512 --output_stride 8 --pre_train False --m_backbone False --barlow_twins True --dense_cl False --kd_loss True --kd_loss_2 sp --inter_kl False --alpha_inter_kd 0 --alpha_sp 3000 --alpha_kl 100
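
The command above passes its boolean options as the literal strings `True` and `False`. How `DSP/train.py` parses them is not shown in this README. The sketch below illustrates one common way to handle such flags with `argparse`, assuming a hypothetical `str2bool` converter. The flag names are taken from the command, and the defaults are placeholders. A converter of this kind matters because `bool("False")` evaluates to `True` in Python, so `type=bool` would silently enable every flag.

```python
import argparse


def str2bool(value):
    """Convert 'True'/'False'-style strings into real booleans."""
    text = str(value).strip().lower()
    if text in ("true", "1", "yes"):
        return True
    if text in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    # Only a few flags from the command above; defaults are placeholders.
    parser = argparse.ArgumentParser(description="hypothetical subset of DSP flags")
    parser.add_argument("--ssl_batchsize", type=int, default=16)
    parser.add_argument("--barlow_twins", type=str2bool, default=False)
    parser.add_argument("--pre_train", type=str2bool, default=False)
    parser.add_argument("--kd_loss_2", type=str, default="sp")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["--barlow_twins", "True", "--pre_train", "False"])
```

With this converter, `args.barlow_twins` is `True` and `args.pre_train` is `False`, matching what the command line intends.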

Fine Tuning

We evaluate random initialization (Rand), ImageNet supervised, Barlow Twins, and DSP pretraining on DR-TANet.

Start training with DR-TANet on the 'VL-CMU-CD' dataset:

python3 train.py --dataset vl_cmu_cd --datadir /path_to_dataset --checkpointdir /path_to_check_point_directory --max-epochs 150 --batch-size 16 --encoder-arch resnet50 --epoch-save 25 --drtam --refinement

Start evaluating with DR-TANet on the 'PCD' dataset:

python3 eval.py --dataset pcd --datadir /path_to_dataset --checkpointdir /path_to_check_point_directory --resultdir /path_to_save_eval_result --encoder-arch resnet50 --drtam --refinement --store-imgs

Evaluating the finetuned model

Start evaluating with DR-TANet on the 'PCD' dataset:

python3 eval.py --dataset pcd --datadir /path_to_dataset --checkpointdir /path_to_check_point_directory --resultdir /path_to_save_eval_result --encoder-arch resnet18 --drtam --refinement --store-imgs

Analysis

We analyse our DSP model under three scenarios: (1) robustness to natural corruptions, (2) out-of-distribution data, and (3) limited labeled data. For more details, please see the Paper. For the natural corruptions evaluation, please refer to the paper "Benchmarking Neural Network Robustness to Common Corruptions and Surface Variations".

Finally, for ease of comparison, we provide the model checkpoints for DSP pretraining here: Google Drive

Cite our work

If you find the code useful in your research, please consider citing our paper:

@inproceedings{ramkumar2022differencing,
  title={Differencing based Self-supervised pretraining for Scene Change Detection},
  author={Ramkumar, Vijaya Raghavan T and Arani, Elahe and Zonooz, Bahram},
  booktitle={Conference on Lifelong Learning Agents},
  pages={952--965},
  year={2022},
  organization={PMLR}
}