This repository contains a real-time 3D depth estimation pipeline using a stereo camera on the KITTI benchmark.
- Open your Anaconda terminal
- Create a new conda environment with Python 3.8.5 by running this command
conda create --name obj_det python=3.8.5
- Activate your environment
conda activate obj_det
- Install the dependencies with the following commands
conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch -c conda-forge
pip install PyQt5 vtk tqdm matplotlib==3.3.3 easydict==1.9 tensorboard
pip install mayavi
conda install scikit-image shapely
conda install -c conda-forge opencv
- Then navigate to
Models/AnyNet/models/spn_t1
to activate the SPN layer.
- For Windows, open a Git Bash terminal, activate the environment, then run
sh make.sh
- For Linux, open a terminal, activate the environment, then run
./make.sh
You need to create the data directory first and structure the dataset folder as follows:
Stereo-3D-Detection
├── checkpoints
├── data
│ ├── <dataset folder>
│ │ │── training
│ │ │ ├──calib & velodyne & label_2 & image_2 & image_3
│ │ │── testing
├── Models
├── utils_classes
├── .
├── .
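Before running anything, it can help to confirm that the folder layout matches the tree above. The following is a minimal sketch, not part of this repository: the function name `missing_training_dirs` is hypothetical, and the list of required subfolders is copied from the tree.

```python
from pathlib import Path

# Subfolders listed in the tree above for the training split.
REQUIRED_TRAINING_DIRS = ("calib", "velodyne", "label_2", "image_2", "image_3")


def missing_training_dirs(dataset_root):
    """Return the required training subfolders that are absent under dataset_root."""
    training = Path(dataset_root) / "training"
    return [name for name in REQUIRED_TRAINING_DIRS if not (training / name).is_dir()]


if __name__ == "__main__":
    # "data/kitti" is used here only as an example dataset folder.
    missing = missing_training_dirs("data/kitti")
    print("Missing folders:", ", ".join(missing) if missing else "none")
```

Run it from the Stereo-3D-Detection directory so that the relative path resolves correctly.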
You can download all checkpoints from this Drive and place them as follows:
Stereo-3D-Detection
├── checkpoints
│ ├── anynet.tar
│ ├── sfa.pth
├── data
├── .
- To go from stereo images to 3D object detection, run
python demo.py
- Choose a mode:
- Regular mode: add no option. You can navigate between images by pressing any key; press ESC to exit
- To evaluate the pipeline, add
--evaluate
- To generate a video, make sure you have adjusted the path in demo.py, then add
--generate_video
- To generate the video with a bird's-eye view (BEV), add
--with_bev
- To specify how often to print time durations when generating a video, add
--print_freq <no>
- The data path is set to data/kitti by default. To change it, add
--data_path <datapath>
- The AnyNet checkpoint path is set to checkpoints/anynet.tar by default. To change it, add
--pretrained_anynet <checkpoint path>
- The SFA checkpoint path is set to checkpoints/sfa.pth by default. To change it, add
--pretrained_sfa <checkpoint path>
You have to organize your own dataset in the following format:
Stereo-3D-Detection
├── checkpoints
├── data
│ ├── <dataset>
│ │ │── training
│ │ │ ├──disp_occ_0 & image_2 & image_3
├── .
├── .
In case of .npy disparities:
Stereo-3D-Detection
├── checkpoints
├── data
│ ├── <dataset>
│ │ │── training
│ │ │ ├──disp_occ_0_npy & image_2 & image_3
├── .
├── .
Command:
python train_anynet.py --maxdisp <default: 192> \
--datatype <2012/2015/other> \
--data_path <datapath> \
--save_path <default: 'results/train_anynet'> \
--pretrained_path <pretrained checkpoint path> \
--train_file <train file path if exist> \
--validation_file <validation file path> \
--with_spn <Activates Anynet last layer [RECOMMENDED]>
- If the disparity files are in .npy format, add
--load_npy
- If you want to evaluate your pretrained checkpoint without training, add
--evaluate
- If the datatype is 2012 or 2015, add
--split_file
- If the datatype is other and you want to train on specific file names, add
--train_file
- If the datatype is other and you want to validate/test on specific file names, add
--validation_file
- If you want to start from a specific index, use this flag
--index <no>
python train_anynet.py --maxdisp 192 \
--datatype other \
--data_path data/kitti/ \
--pretrained_path checkpoints/anynet.tar \
--train_file data/kitti/imagesets/train.txt \
--validation_file data/kitti/imagesets/val.txt \
--with_spn --load_npy
python train_anynet.py --maxdisp 192 \
--datatype 2015 \
--save_path results/kitti2015 \
--data_path data/path-to-kitti2015/training/ \
--pretrained_path checkpoints/anynet.tar \
--split_file data/path-to-kitti2015/split.txt \
--with_spn
python train_anynet.py --maxdisp 192 \
--datatype 2012 \
--save_path results/kitti2012 \
--data_path data/path-to-kitti2012/training/ \
--pretrained_path checkpoints/anynet.tar \
--split_file data/path-to-kitti2012/split.txt \
--with_spn
You have to organize your own dataset in the following format:
Stereo-3D-Detection
├── checkpoints
├── data
│ ├── <dataset>
│ │ │── ImageSets
│ │ │ ├── train.txt & test.txt & val.txt
│ │ │── training
│ │ │ ├── velodyne & calib & label_2
├── .
├── .
Command:
python train_sfa.py
- By default, the data path is set to 'data/kitti'. To change it, use
--data_path <datapath>
- By default, the pretrained path is set to 'checkpoints/sfa.pth'. To change it, use
--pretrained_path <pretrained checkpoint path>
- By default, the name used for saved files is set to 'fpn_resnet_18'. To change it, use
--saved_fn <name>
- By default, the batch size is set to 2. To change it, use
--batch_size <no>
- You can adjust how often to print, save checkpoints, and write to TensorBoard through these flags
--print_freq <no>
--checkpoint_freq <no>
--tensorboard_freq <no>
- If you want to evaluate your pretrained checkpoint without training, add
--evaluate
NOTE: The text files in ImageSets are split files. You can find the split files of the KITTI object dataset here
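As a rough illustration of how a split file can be consumed, the sketch below assumes each line holds one sample ID (for example 000003) and that point clouds are named `<id>.bin` inside training/velodyne. Both helper names are hypothetical and are not taken from this repository's code.

```python
from pathlib import Path


def read_split_ids(split_file):
    """Read one sample ID per line, skipping blank lines (assumed file format)."""
    lines = Path(split_file).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


def velodyne_path(dataset_root, sample_id):
    """Build the point cloud path for a sample ID (assumed naming: <id>.bin)."""
    return Path(dataset_root) / "training" / "velodyne" / f"{sample_id}.bin"
```

For example, `read_split_ids("data/kitti/ImageSets/train.txt")` would return the list of IDs to train on, and `velodyne_path("data/kitti", sample_id)` would locate the matching point cloud for each one.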
To evaluate the model on the testing data for a KITTI submission:
python demo.py --testing --save_objects objects.pkl
python submit_to_kitti.py
Then compress the label_2 folder in the testing directory and submit it to KITTI.
To generate disparity from a point cloud, make sure your folder structure looks like this:
├── checkpoints
├── data
│ ├── <dataset>
│ │ │── training
│ │ │ ├── image_2 & velodyne & calib
├── .
├── .
Then run this command:
python ./tools/generate_disp.py --datapath <datapath>
- Use the --limit <no> flag if you want to limit how much of the dataset is converted
- The data path is set to data/kitti/training by default. To change it, add
--data_path <datapath>
NOTE: When specifying your data path, make it relative to the Stereo-3D-Detection directory.
This will generate 2 disparity folders at the data path location, generated_disp/disp_occ_0 and generated_disp/disp_occ_0_npy. You can use either, but we recommend the .npy files.
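For readers unfamiliar with the conversion, depth and disparity in a rectified stereo pair are related by disparity = focal_length * baseline / depth. The sketch below illustrates only that relation; it is not taken from tools/generate_disp.py, and the function name and the example values are assumptions chosen for illustration.

```python
def depth_to_disparity(depth_m, focal_px, baseline_m):
    """Convert a depth in meters to a disparity in pixels for a rectified stereo pair."""
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * baseline_m / depth_m


if __name__ == "__main__":
    # Illustrative values only; real values come from the files in the calib folder.
    print(depth_to_disparity(depth_m=10.0, focal_px=700.0, baseline_m=0.5))
```

Nearby points produce large disparities and distant points produce small ones, which is why disparity maps lose precision at long range.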
To generate a point cloud from disparity/depth, make sure your folder structure looks like this:
├── checkpoints
├── data
│ ├── <dataset>
│ │ │── training
│ │ │ ├── disp_occ_0 & calib
├── .
├── .
Then run this command:
python ./tools/generate_lidar.py --datapath <datapath>
- If you are converting depth images, use this flag
--is_depth
- The data path is set to data/kitti/training by default. To change it, add
--data_path <datapath>
- Use the --limit <no> flag if you want to limit how much of the dataset is converted
NOTE: When specifying your data path, make it relative to the Stereo-3D-Detection directory.
This will generate a velodyne folder at the data path location, generated_lidar/velodyne.
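The geometry behind this step can be sketched with the pinhole camera model: a pixel (u, v) with disparity d maps to depth z = focal_length * baseline / d, and then to x and y through the principal point. This is a minimal sketch under those assumptions, not the code in tools/generate_lidar.py; the function name and parameters are hypothetical, and the result is in the camera frame, so a further calibration transform would be needed to reach LiDAR coordinates.

```python
def disparity_to_point(u, v, disparity, focal_px, cx, cy, baseline_m):
    """Back-project one pixel with a known disparity to (x, y, z) in the camera frame.

    Returns None for non-positive disparities, which carry no depth information.
    """
    if disparity <= 0:
        return None
    z = focal_px * baseline_m / disparity  # depth along the optical axis
    x = (u - cx) * z / focal_px            # offset to the right of the principal point
    y = (v - cy) * z / focal_px            # offset below the principal point
    return (x, y, z)
```

Applying this to every valid pixel of a disparity map yields the pseudo-LiDAR point cloud that a detector can consume in place of real LiDAR data.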
To view a point cloud .bin file, you can use the View_bin.py file in the tools folder. Copy it into the point cloud folder, then run:
python view_bin.py
- By default, it shows 000000.bin, but you can choose another file with the --image <image no> flag
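If you only need the raw points rather than a viewer, a KITTI-style .bin file can be read directly. The sketch below assumes the usual KITTI Velodyne layout of consecutive little-endian float32 values in groups of four (x, y, z, reflectance); the function name is hypothetical and the code is independent of the viewer script.

```python
import struct
from pathlib import Path

POINT_FORMAT = "<4f"  # x, y, z, reflectance as little-endian float32 (assumed layout)
POINT_SIZE = struct.calcsize(POINT_FORMAT)  # 16 bytes per point


def read_velodyne_bin(path):
    """Return the points in a .bin file as a list of (x, y, z, reflectance) tuples."""
    data = Path(path).read_bytes()
    # Ignore trailing bytes that do not form a complete point.
    usable = len(data) - len(data) % POINT_SIZE
    return list(struct.iter_unpack(POINT_FORMAT, data[:usable]))
```

For large files, loading the same data with NumPy is considerably faster; the pure-Python version is shown only to keep the example dependency-free.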
We added another way to track how long each function takes and how frequently it is called. You can see this by running:
sh profiling.sh
Then you will find the results in a newly generated file, profiling.txt.
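To collect similar per-function timings and call counts for a single function from Python, the standard library's cProfile module can be used. This is a generic sketch and may differ from what profiling.sh actually does; `profile_call` and `busy_sum` are hypothetical names used only for illustration.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and keep the 20 most expensive entries.
    stats.sort_stats("cumulative").print_stats(20)
    return result, buffer.getvalue()


def busy_sum(n):
    """Stand-in workload: sum of squares below n."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    _, report = profile_call(busy_sum, 100000)
    print(report)
```

The report lists the number of calls (ncalls) and the total and cumulative time for each function.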
@article{wang2018anytime,
title={Anytime Stereo Image Depth Estimation on Mobile Devices},
author={Wang, Yan and Lai, Zihang and Huang, Gao and Wang, Brian H. and Van Der Maaten, Laurens and Campbell, Mark and Weinberger, Kilian Q},
journal={arXiv preprint arXiv:1810.11408},
year={2018}
}
@misc{Super-Fast-Accurate-3D-Object-Detection-PyTorch,
author = {Nguyen Mau Dung},
title = {{Super-Fast-Accurate-3D-Object-Detection-PyTorch}},
howpublished = {\url{https://github.com/maudzung/Super-Fast-Accurate-3D-Object-Detection}},
year = {2020}
}