-
Notifications
You must be signed in to change notification settings - Fork 45
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
1 parent
77feabb
commit 71bcc43
Showing
65 changed files
with
9,057 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,154 @@ | ||
# Byte-compiled / optimized / DLL files | ||
__pycache__/ | ||
*.py[cod] | ||
*$py.class | ||
|
||
# C extensions | ||
*.so | ||
|
||
# Distribution / packaging | ||
.Python | ||
env/ | ||
build/ | ||
develop-eggs/ | ||
dist/ | ||
downloads/ | ||
eggs/ | ||
.eggs/ | ||
lib/ | ||
lib64/ | ||
parts/ | ||
sdist/ | ||
var/ | ||
*.egg-info/ | ||
.installed.cfg | ||
*.egg | ||
|
||
# PyInstaller | ||
# Usually these files are written by a python script from a template | ||
# before PyInstaller builds the exe, so as to inject date/other infos into it. | ||
*.manifest | ||
*.spec | ||
|
||
# Installer logs | ||
pip-log.txt | ||
pip-delete-this-directory.txt | ||
|
||
# Unit test / coverage reports | ||
htmlcov/ | ||
.tox/ | ||
.coverage | ||
.coverage.* | ||
.cache | ||
nosetests.xml | ||
coverage.xml | ||
*,cover | ||
.hypothesis/ | ||
|
||
# Translations | ||
*.mo | ||
*.pot | ||
|
||
# Django stuff: | ||
*.log | ||
local_settings.py | ||
|
||
# Flask stuff: | ||
instance/ | ||
.webassets-cache | ||
|
||
# Scrapy stuff: | ||
.scrapy | ||
|
||
# Sphinx documentation | ||
docs/_build/ | ||
|
||
# PyBuilder | ||
target/ | ||
|
||
# IPython Notebook | ||
.ipynb_checkpoints | ||
|
||
# pyenv | ||
.python-version | ||
|
||
# celery beat schedule file | ||
celerybeat-schedule | ||
|
||
# dotenv | ||
.env | ||
|
||
# virtualenv | ||
venv/ | ||
ENV/ | ||
|
||
# Spyder project settings | ||
.spyderproject | ||
|
||
# Rope project settings | ||
.ropeproject | ||
|
||
# atom remote-sync package | ||
.remote-sync.json | ||
|
||
# weights | ||
weights/ | ||
|
||
#DS_Store | ||
.DS_Store | ||
|
||
# dev stuff | ||
eval/ | ||
eval.ipynb | ||
dev.ipynb | ||
.vscode/ | ||
|
||
# not ready | ||
videos/ | ||
templates/ | ||
data/ssd_dataloader.py | ||
data/datasets/ | ||
doc/visualize.py | ||
read_results.py | ||
ssd300_120000/ | ||
demos/live | ||
webdemo.py | ||
test_data_aug.py | ||
|
||
# attributes | ||
|
||
# pycharm | ||
.idea/ | ||
|
||
# temp checkout soln | ||
data/datasets/ | ||
data/ssd_dataloader.py | ||
|
||
# pylint | ||
.pylintrc | ||
|
||
# ssd.pytorch master branch (for merging) | ||
ssd.pytorch/ | ||
|
||
# some datasets | ||
data/VOCdevkit/ | ||
data/coco/images/ | ||
data/coco/annotations/ | ||
ap_data.pkl | ||
results/ | ||
logs/ | ||
scripts/aws/ | ||
scripts/gt.npy | ||
scripts/proto.npy | ||
scripts/info.txt | ||
test.pkl | ||
testeval.py | ||
scripts/aws2/ | ||
status.sh | ||
train.sh | ||
img/ | ||
scripts/aws-ohio/ | ||
scripts/aws3/ | ||
data/config_dev.py | ||
data/coco/ | ||
data/sbd/ |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
# YOLACT Change Log | ||
|
||
This document will detail all changes I make. | ||
I don't know how I'm going to be versioning things yet, so you get dates for now. | ||
|
||
``` | ||
2019.06.27 | ||
- Sped up save video by ~8 ms per frame. | ||
``` |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
MIT License | ||
|
||
Copyright (c) 2019 Daniel Bolya | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included in all | ||
copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | ||
SOFTWARE. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,180 @@ | ||
# **Y**ou **O**nly **L**ook **A**t **C**oefficien**T**s | ||
``` | ||
██╗ ██╗ ██████╗ ██╗ █████╗ ██████╗████████╗ | ||
╚██╗ ██╔╝██╔═══██╗██║ ██╔══██╗██╔════╝╚══██╔══╝ | ||
╚████╔╝ ██║ ██║██║ ███████║██║ ██║ | ||
╚██╔╝ ██║ ██║██║ ██╔══██║██║ ██║ | ||
██║ ╚██████╔╝███████╗██║ ██║╚██████╗ ██║ | ||
╚═╝ ╚═════╝ ╚══════╝╚═╝ ╚═╝ ╚═════╝ ╚═╝ | ||
``` | ||
|
||
A simple, fully convolutional model for real-time instance segmentation. This is the code for [our paper](https://arxiv.org/abs/1904.02689), and for the forseeable future is still in development. | ||
|
||
Here's a look at our current results for our base model (33 fps on a Titan Xp and 29.8 mAP on COCO's `test-dev`): | ||
|
||
 | ||
|
||
 | ||
|
||
 | ||
|
||
# Installation | ||
- Set up a Python3 environment. | ||
- Install [Pytorch](http://pytorch.org/) 1.0.1 (or higher) and TorchVision. | ||
- Install some other packages: | ||
```Shell | ||
# Cython needs to be installed before pycocotools | ||
pip install cython | ||
pip install opencv-python pillow pycocotools matplotlib | ||
``` | ||
- Clone this repository and enter it: | ||
```Shell | ||
git clone https://github.com/dbolya/yolact.git | ||
cd yolact | ||
``` | ||
- If you'd like to train YOLACT, download the COCO dataset and the 2014/2017 annotations. Note that this script will take a while and dump 21gb of files into `./data/coco`. | ||
```Shell | ||
sh data/scripts/COCO.sh | ||
``` | ||
- If you'd like to evaluate YOLACT on `test-dev`, download `test-dev` with this script. | ||
```Shell | ||
sh data/scripts/COCO_test.sh | ||
``` | ||
|
||
|
||
# Evaluation | ||
As of April 5th, 2019 here are our latest models along with their FPS on a Titan Xp and mAP on `test-dev`: | ||
|
||
| Image Size | Backbone | FPS | mAP | Weights | | | ||
|:----------:|:-------------:|:----:|:----:|----------------------------------------------------------------------------------------------------------------------|--------| | ||
| 550 | Resnet50-FPN | 42.5 | 28.2 | [yolact_resnet50_54_800000.pth](https://drive.google.com/file/d/1yp7ZbbDwvMiFJEq4ptVKTYTI2VeRDXl0/view?usp=sharing) | [Mirror](https://ucdavis365-my.sharepoint.com/:u:/g/personal/yongjaelee_ucdavis_edu/EUVpxoSXaqNIlssoLKOEoCcB1m0RpzGq_Khp5n1VX3zcUw) | | ||
| 550 | Darknet53-FPN | 40.0 | 28.7 | [yolact_darknet53_54_800000.pth](https://drive.google.com/file/d/1dukLrTzZQEuhzitGkHaGjphlmRJOjVnP/view?usp=sharing) | [Mirror](https://ucdavis365-my.sharepoint.com/:u:/g/personal/yongjaelee_ucdavis_edu/ERrao26c8llJn25dIyZPhwMBxUp2GdZTKIMUQA3t0djHLw) | ||
| 550 | Resnet101-FPN | 33.0 | 29.8 | [yolact_base_54_800000.pth](https://drive.google.com/file/d/1UYy3dMapbH1BnmtZU4WH1zbYgOzzHHf_/view?usp=sharing) | [Mirror](https://ucdavis365-my.sharepoint.com/:u:/g/personal/yongjaelee_ucdavis_edu/EYRWxBEoKU9DiblrWx2M89MBGFkVVB_drlRd_v5sdT3Hgg) | ||
| 700 | Resnet101-FPN | 23.6 | 31.2 | [yolact_im700_54_800000.pth](https://drive.google.com/file/d/1lE4Lz5p25teiXV-6HdTiOJSnS7u7GBzg/view?usp=sharing) | [Mirror](https://ucdavis365-my.sharepoint.com/:u:/g/personal/yongjaelee_ucdavis_edu/Eagg5RSc5hFEhp7sPtvLNyoBjhlf2feog7t8OQzHKKphjw) | ||
|
||
To evalute the model, put the corresponding weights file in the `./weights` directory and run one of the following commands. | ||
## Quantitative Results on COCO | ||
```Shell | ||
# Quantitatively evaluate a trained model on the entire validation set. Make sure you have COCO downloaded as above. | ||
# This should get 29.92 validation mask mAP last time I checked. | ||
python eval.py --trained_model=weights/yolact_base_54_800000.pth | ||
|
||
# Output a COCOEval json to submit to the website or to use the run_coco_eval.py script. | ||
# This command will create './results/bbox_detections.json' and './results/mask_detections.json' for detection and instance segmentation respectively. | ||
python eval.py --trained_model=weights/yolact_base_54_800000.pth --output_coco_json | ||
|
||
# You can run COCOEval on the files created in the previous command. The performance should match my implementation in eval.py. | ||
python run_coco_eval.py | ||
|
||
# To output a coco json file for test-dev, make sure you have test-dev downloaded from above and go | ||
python eval.py --trained_model=weights/yolact_base_54_800000.pth --output_coco_json --dataset=coco2017_testdev_dataset | ||
``` | ||
## Qualitative Results on COCO | ||
```Shell | ||
# Display qualitative results on COCO. From here on I'll use a confidence threshold of 0.3. | ||
python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --top_k=100 --display | ||
``` | ||
## Benchmarking on COCO | ||
```Shell | ||
# Run just the raw model on the first 1k images of the validation set | ||
python eval.py --trained_model=weights/yolact_base_54_800000.pth --benchmark --max_images=1000 | ||
``` | ||
## Images | ||
```Shell | ||
# Display qualitative results on the specified image. | ||
python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --top_k=100 --image=my_image.png | ||
|
||
# Process an image and save it to another file. | ||
python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --top_k=100 --image=input_image.png:output_image.png | ||
|
||
# Process a whole folder of images. | ||
python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --top_k=100 --images=path/to/input/folder:path/to/output/folder | ||
``` | ||
## Video | ||
```Shell | ||
# Display a video in real-time. "--video_multiframe" will process that many frames at once for improved performance. | ||
python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --top_k=100 --video_multiframe=2 --video=my_video.mp4 | ||
|
||
# Display a webcam feed in real-time. If you have multiple webcams pass the index of the webcam you want instead of 0. | ||
python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --top_k=100 --video_multiframe=2 --video=0 | ||
|
||
# Process a video and save it to another file. This is unoptimized. | ||
python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --top_k=100 --video=input_video.mp4:output_video.mp4 | ||
``` | ||
As you can tell, `eval.py` can do a ton of stuff. Run the `--help` command to see everything it can do. | ||
```Shell | ||
python eval.py --help | ||
``` | ||
|
||
|
||
# Training | ||
By default, we Train on COCO. Make sure to download the entire dataset using the commands above. | ||
- To train, grab an imagenet-pretrained model and put it in `./weights`. | ||
- For Resnet101, download `resnet101_reducedfc.pth` from [here](https://drive.google.com/file/d/1tvqFPd4bJtakOlmn-uIA492g2qurRChj/view?usp=sharing). | ||
- For Resnet50, download `resnet50-19c8e357.pth` from [here](https://drive.google.com/file/d/1Jy3yCdbatgXa5YYIdTCRrSV0S9V5g1rn/view?usp=sharing). | ||
- For Darknet53, download `darknet53.pth` from [here](https://drive.google.com/file/d/17Y431j4sagFpSReuPNoFcj9h7azDTZFf/view?usp=sharing). | ||
- Run one of the training commands below. | ||
- Note that you can press ctrl+c while training and it will save an `*_interrupt.pth` file at the current iteration. | ||
- All weights are saved in the `./weights` directory by default with the file name `<config>_<epoch>_<iter>.pth`. | ||
```Shell | ||
# Trains using the base config with a batch size of 8 (the default). | ||
python train.py --config=yolact_base_config | ||
|
||
# Trains yolact_base_config with a batch_size of 5. For the 550px models, 1 batch takes up around 1.5 gigs of VRAM, so specify accordingly. | ||
python train.py --config=yolact_base_config --batch_size=5 | ||
|
||
# Resume training yolact_base with a specific weight file and start from the iteration specified in the weight file's name. | ||
python train.py --config=yolact_base_config --resume=weights/yolact_base_10_32100.pth --start_iter=-1 | ||
|
||
# Use the help option to see a description of all available command line arguments | ||
python train.py --help | ||
``` | ||
|
||
## Custom Datasets | ||
You can also train on your own dataset by following these steps: | ||
- Create a COCO-style Object Detection JSON annotation file for your dataset. The specification for this can be found [here](http://cocodataset.org/#format-data). Note that we don't use some fields, so the following may be omitted: | ||
- `info` | ||
- `liscense` | ||
- Under `image`: `license, flickr_url, coco_url, date_captured` | ||
- `categories` (we use our own format for categories, see below) | ||
- Create a definition for your dataset under `dataset_base` in `data/config.py` (see the comments in `dataset_base` for an explanation of each field): | ||
```Python | ||
my_custom_dataset = dataset_base.copy({ | ||
'name': 'My Dataset', | ||
|
||
'train_images': 'path_to_training_images', | ||
'train_info': 'path_to_training_annotation', | ||
|
||
'valid_images': 'path_to_validation_images', | ||
'valid_info': 'path_to_validation_annotation', | ||
|
||
'has_gt': True, | ||
'class_names': ('my_class_id_1', 'my_class_id_2', 'my_class_id_3', ...) | ||
}) | ||
``` | ||
- A couple things to note: | ||
- Class IDs in the annotation file should start at 1 and increase sequentially on the order of `class_names`. If this isn't the case for your annotation file (like in COCO), see the field `label_map` in `dataset_base`. | ||
- If you do not want to create a validation split, use the same image path and annotations file for validation. By default (see `python train.py --help`), `train.py` will output validation mAP for the first 5000 images in the dataset every 2 epochs. | ||
- Finally, in `yolact_base_config` in the same file, change the value for `'dataset'` to `'my_custom_dataset'` or whatever you named the config object above. Then you can use any of the training commands in the previous section. | ||
|
||
#### Creating a Custom Dataset from Scratch | ||
See [this nice post by @Amit12690](https://github.com/dbolya/yolact/issues/70#issuecomment-504283008) for tips on how to annotate a custom dataset and prepare it for use with YOLACT. | ||
|
||
|
||
|
||
|
||
# Citation | ||
If you use YOLACT or this code base in your work, please cite | ||
``` | ||
@article{bolya-arxiv2019, | ||
author = {Daniel Bolya and Chong Zhou and Fanyi Xiao and Yong Jae Lee}, | ||
title = {YOLACT: {Real-time} Instance Segmentation}, | ||
journal = {arXiv}, | ||
year = {2019}, | ||
} | ||
``` | ||
|
||
|
||
|
||
# Contact | ||
For questions about our paper or code, please contact [Daniel Bolya](mailto:[email protected]). |
Oops, something went wrong.