The code is built on top of 3D-RetinaNet for ROAD.
The first task requires developing models for scenarios where only a small amount of annotated data is available at training time. More precisely, only 3 out of the 15 videos (from the training partition train_1 of the ROAD-R dataset) are used for training the models in this task.
The videos' ids are: 2014-07-14-14-49-50_stereo_centre_01, 2015-02-03-19-43-11_stereo_centre_04, and 2015-02-24-12-32-19_stereo_centre_04.
Using only these three videos, without any data augmentation, complex post-processing, or test-time augmentation (TTA), our TBSD model achieves a frame mAP@50 of 0.262 on task 1. When ensembled with the only_dinov2 branch, the final frame mAP@50 reaches around 0.27 on task 1.
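The actual ensemble procedure is described in the `ensemble` folder; as a rough illustration only (the function below is hypothetical, not this repository's API), combining two branches' per-frame class scores by weighted averaging can be sketched as:

```python
import numpy as np

def ensemble_scores(scores_a: np.ndarray, scores_b: np.ndarray,
                    weight: float = 0.5) -> np.ndarray:
    """Weighted average of two branches' per-frame class scores.

    scores_a, scores_b: arrays of shape (num_frames, num_classes)
    with values in [0, 1]; `weight` is the first branch's contribution.
    """
    assert scores_a.shape == scores_b.shape
    return weight * scores_a + (1.0 - weight) * scores_b

# Hypothetical example: two branches scoring 2 frames over 3 classes.
a = np.array([[0.9, 0.1, 0.4], [0.2, 0.8, 0.5]])
b = np.array([[0.7, 0.3, 0.6], [0.4, 0.6, 0.5]])
print(ensemble_scores(a, b))  # element-wise mean of the two score maps
```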
Please refer to the `environment` folder in the repository, where you can choose the `.yml` file for building the environment:

```
conda env create -f environment.yml
conda activate base
```
or:

```
pip install -r requirements.txt
```
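After activating the environment, a quick sanity check can confirm that the key dependencies import (a sketch; extend the package list to match the actual `requirements.txt`):

```python
import importlib.util

def check_packages(names):
    """Return the subset of packages that fail to import."""
    missing = []
    for name in names:
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing

if __name__ == "__main__":
    # Add "torch", "cv2", ... as needed for this repository.
    missing = check_packages(["os", "json"])
    print("missing:", missing or "none")
```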
The `road` directory should look like this:

```
road/
- road_trainval_v1.0.json
- videos/
    - 2014-06-25-16-45-34_stereo_centre_02
    - 2014-06-26-09-53-12_stereo_centre_02
    - ........
- rgb-images/
    - 2014-06-25-16-45-34_stereo_centre_02/
        - 00001.jpg
        - 00002.jpg
        - .........*.jpg
    - 2014-06-26-09-53-12_stereo_centre_02/
        - 00001.jpg
        - 00002.jpg
        - .........*.jpg
    - ......../
        - ........*.jpg
```
Place the dataset directory at the parent level of this repository, i.e., the parent of the directory where this README.md is located. Please refer to road-dataset for the exact format.
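Before training, it may help to verify the layout matches the tree above. A minimal check (the helper below is an illustration based on this README, not part of the repository):

```python
import os

def check_road_layout(data_root: str) -> list:
    """Return the expected entries missing under data_root/road."""
    road = os.path.join(data_root, "road")
    expected = [
        os.path.join(road, "road_trainval_v1.0.json"),
        os.path.join(road, "videos"),
        os.path.join(road, "rgb-images"),
    ]
    return [p for p in expected if not os.path.exists(p)]

if __name__ == "__main__":
    # e.g. "yourpath/road-dataset-master/"
    print(check_road_layout("."))  # empty list means the layout looks right
```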
Please place the pre-trained models in the `pretrainmodel` folder. You can obtain them from the links below.
| Model | Link |
|---|---|
| swin_base_patch244_window1677_sthv2.pth (optional) | swin-base-ssv2 |
| swin-large-p244-w877_in22k-pre_16xb8-amp-32x2x1-30e_kinetics700-rgb_20220930-f8d74db7.pth | swin-large-k700 |
| yolox_l.pth | yolox-l |
| vit-giant-p14_dinov2-pre_3rdparty_20230426-2934a630.pth | dinov2-giant |
| vit-large-p14_dinov2-pre_3rdparty_20230426-f3302d9e.pth | dinov2-large |
| pretrained weight for head | pretrained weight for head |
Note: you may need to run `get_kinetics_weights.sh` (included in the ROAD-R Challenge repository) to obtain the file `resnet50RCGRU.pth`; otherwise you may encounter an error.
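To catch a missing checkpoint before a long training run, the downloaded files can be checked against the table above (a sketch; the helper and the required-file list are taken from this README, not from the repository's code):

```python
import os

# Checkpoints listed in the table above (the swin-base weight is optional).
REQUIRED = [
    "swin-large-p244-w877_in22k-pre_16xb8-amp-32x2x1-30e_kinetics700-rgb_20220930-f8d74db7.pth",
    "yolox_l.pth",
    "vit-giant-p14_dinov2-pre_3rdparty_20230426-2934a630.pth",
    "vit-large-p14_dinov2-pre_3rdparty_20230426-f3302d9e.pth",
]

def missing_checkpoints(model_dir: str, required=REQUIRED) -> list:
    """Return the required checkpoint filenames not found in model_dir."""
    return [f for f in required
            if not os.path.isfile(os.path.join(model_dir, f))]

if __name__ == "__main__":
    print(missing_checkpoints("pretrainmodel"))
```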
To train the model, provide the following positional arguments:

- `DATA_ROOT`: path to a directory in which `road` can be found, containing `road_test_v1.0.json`, `road_trainval_v1.0.json`, and the directories `rgb-images` and `videos`.
- `SAVE_ROOT`: path to a directory in which the experiments (e.g. checkpoints, training logs) will be saved.
- `MODEL_PATH`: path to the directory containing the weights for the chosen backbone (e.g. `resnet50RCGRU.pth`).
The remaining experimental details and logs can be found in `actual_task1_logs_TBSD` and `actual_task1_logs_only_dinov2`. The folder `all_history_logs` in the main directory contains all the experimental information for tasks one and two.
Example train command (to be run from the root of this repository):

```
python main.py --TASK=1 --DATA_ROOT="yourpath/road-dataset-master/" --pretrained_model_path="yourpath/road-dataset-master/ROAD-R-2023-Challenge-main_me/pretrainmodel/swin-large-p244-w877_in22k-pre_16xb8-amp-32x2x1-30e_kinetics700-rgb_20220930-f8d74db7.pth" --pretrained_model_path2="yourpath/road-dataset-master/ROAD-R-2023-Challenge-main_me/pretrainmodel/pretrained_weights_task1.pth" --MODEL_PATH="yourpath/road-dataset-master/ROAD-R-2023-Challenge-main_me/kinetics-pt/" --SAVE_ROOT="yourpath/road-dataset-master/SAVE/" --MODE="train" --LOGIC="Lukasiewicz" --VAL_STEP=1 --LR=6e-5 --MAX_EPOCHS=25
```
Below is an example command to test a model:

```
CUDA_VISIBLE_DEVICES=1 python main.py --RESUME=20 --TASK=1 --LOGIC="Lukasiewicz" --EXPDIR="yourpath/road-dataset-master/experiments/" --DATA_ROOT="yourpath/road-dataset-master/" --pretrained_model_path="yourpath/road-dataset-master/ROAD-R-2023-Challenge-main_me/pretrainmodel/swin-large-p244-w877_in22k-pre_16xb8-amp-32x2x1-30e_kinetics700-rgb_20220930-f8d74db7.pth" --pretrained_model_path2="yourpath/road-dataset-master/ROAD-R-2023-Challenge-main_me/pretrainmodel/pretrained_weights_task1.pth" --MODEL_PATH="yourpath/road-dataset-master/ROAD-R-2023-Challenge-main_me/kinetics-pt/" --SAVE_ROOT="yourpath/road-dataset-master/SAVE/" --MODE="gen_dets" --TEST_SUBSETS=test --EVAL_EPOCHS=20 --EXP_NAME="yourpath/road-dataset-master/SAVE/road/logic-ssl_cache_Lukasiewicz_8.0/resnet50RCGRU512-Pkinetics-b8s12x1x1-roadt1-h3x3x3-10-23-09-28-54x/"
```
The `environment` and `ensemble` folders contain their own README instructions, which you may need to read to run the project effectively.
[1] road-dataset
[2] dinov2
[3] YOLOX
[4] mmpretrain
[5] mmaction2