add yolact

GothicAi · Aug 23, 2019 · 71bcc43
1 parent 77feabb
commit 71bcc43
Showing 65 changed files with 9,057 additions and 0 deletions.
diff --git a/yolact/.gitignore b/yolact/.gitignore
@@ -0,0 +1,154 @@
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+
+# C extensions
+*.so
+
+# Distribution / packaging
+.Python
+env/
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+*.egg-info/
+.installed.cfg
+*.egg
+
+# PyInstaller
+#  Usually these files are written by a python script from a template
+#  before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*,cover
+.hypothesis/
+
+# Translations
+*.mo
+*.pot
+
+# Django stuff:
+*.log
+local_settings.py
+
+# Flask stuff:
+instance/
+.webassets-cache
+
+# Scrapy stuff:
+.scrapy
+
+# Sphinx documentation
+docs/_build/
+
+# PyBuilder
+target/
+
+# IPython Notebook
+.ipynb_checkpoints
+
+# pyenv
+.python-version
+
+# celery beat schedule file
+celerybeat-schedule
+
+# dotenv
+.env
+
+# virtualenv
+venv/
+ENV/
+
+# Spyder project settings
+.spyderproject
+
+# Rope project settings
+.ropeproject
+
+# atom remote-sync package
+.remote-sync.json
+
+# weights
+weights/
+
+#DS_Store
+.DS_Store
+
+# dev stuff
+eval/
+eval.ipynb
+dev.ipynb
+.vscode/
+
+# not ready
+videos/
+templates/
+data/ssd_dataloader.py
+data/datasets/
+doc/visualize.py
+read_results.py
+ssd300_120000/
+demos/live
+webdemo.py
+test_data_aug.py
+
+# attributes
+
+# pycharm
+.idea/
+
+# temp checkout soln
+data/datasets/
+data/ssd_dataloader.py
+
+# pylint
+.pylintrc
+
+# ssd.pytorch master branch (for merging)
+ssd.pytorch/
+
+# some datasets
+data/VOCdevkit/
+data/coco/images/
+data/coco/annotations/
+ap_data.pkl
+results/
+logs/
+scripts/aws/
+scripts/gt.npy
+scripts/proto.npy
+scripts/info.txt
+test.pkl
+testeval.py
+scripts/aws2/
+status.sh
+train.sh
+img/
+scripts/aws-ohio/
+scripts/aws3/
+data/config_dev.py
+data/coco/
+data/sbd/
diff --git a/yolact/CHANGELOG.md b/yolact/CHANGELOG.md
@@ -0,0 +1,9 @@
+# YOLACT Change Log
+
+This document will detail all changes I make.
+I don't know how I'm going to be versioning things yet, so you get dates for now.
+
+```
+2019.06.27
+  - Sped up save video by ~8 ms per frame.
+```
diff --git a/yolact/LICENSE b/yolact/LICENSE
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2019 Daniel Bolya
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/yolact/ORIGINAL_README.md b/yolact/ORIGINAL_README.md
@@ -0,0 +1,180 @@
+# **Y**ou **O**nly **L**ook **A**t **C**oefficien**T**s
+```
+    ██╗   ██╗ ██████╗ ██╗      █████╗  ██████╗████████╗
+    ╚██╗ ██╔╝██╔═══██╗██║     ██╔══██╗██╔════╝╚══██╔══╝
+     ╚████╔╝ ██║   ██║██║     ███████║██║        ██║   
+      ╚██╔╝  ██║   ██║██║     ██╔══██║██║        ██║   
+       ██║   ╚██████╔╝███████╗██║  ██║╚██████╗   ██║   
+       ╚═╝    ╚═════╝ ╚══════╝╚═╝  ╚═╝ ╚═════╝   ╚═╝ 
+```
+
+A simple, fully convolutional model for real-time instance segmentation. This is the code for [our paper](https://arxiv.org/abs/1904.02689), and for the foreseeable future is still in development.
+
+Here's a look at our current results for our base model (33 fps on a Titan Xp and 29.8 mAP on COCO's `test-dev`):
+
+![Example 0](data/yolact_example_0.png)
+
+![Example 1](data/yolact_example_1.png)
+
+![Example 2](data/yolact_example_2.png)
+
+# Installation
+ - Set up a Python3 environment.
+ - Install [Pytorch](http://pytorch.org/) 1.0.1 (or higher) and TorchVision.
+ - Install some other packages:
+   ```Shell
+   # Cython needs to be installed before pycocotools
+   pip install cython
+   pip install opencv-python pillow pycocotools matplotlib 
+   ```
+ - Clone this repository and enter it:
+   ```Shell
+   git clone https://github.com/dbolya/yolact.git
+   cd yolact
+   ```
+ - If you'd like to train YOLACT, download the COCO dataset and the 2014/2017 annotations. Note that this script will take a while and dump 21 GB of files into `./data/coco`.
+   ```Shell
+   sh data/scripts/COCO.sh
+   ```
+ - If you'd like to evaluate YOLACT on `test-dev`, download `test-dev` with this script.
+   ```Shell
+   sh data/scripts/COCO_test.sh
+   ```
+
+
+# Evaluation
+As of April 5th, 2019 here are our latest models along with their FPS on a Titan Xp and mAP on `test-dev`:
+
+| Image Size | Backbone      | FPS  | mAP  | Weights                                                                                                              | Mirror |
+|:----------:|:-------------:|:----:|:----:|----------------------------------------------------------------------------------------------------------------------|--------|
+| 550        | Resnet50-FPN  | 42.5 | 28.2 | [yolact_resnet50_54_800000.pth](https://drive.google.com/file/d/1yp7ZbbDwvMiFJEq4ptVKTYTI2VeRDXl0/view?usp=sharing)  | [Mirror](https://ucdavis365-my.sharepoint.com/:u:/g/personal/yongjaelee_ucdavis_edu/EUVpxoSXaqNIlssoLKOEoCcB1m0RpzGq_Khp5n1VX3zcUw) |
+| 550        | Darknet53-FPN | 40.0 | 28.7 | [yolact_darknet53_54_800000.pth](https://drive.google.com/file/d/1dukLrTzZQEuhzitGkHaGjphlmRJOjVnP/view?usp=sharing) | [Mirror](https://ucdavis365-my.sharepoint.com/:u:/g/personal/yongjaelee_ucdavis_edu/ERrao26c8llJn25dIyZPhwMBxUp2GdZTKIMUQA3t0djHLw) |
+| 550        | Resnet101-FPN | 33.0 | 29.8 | [yolact_base_54_800000.pth](https://drive.google.com/file/d/1UYy3dMapbH1BnmtZU4WH1zbYgOzzHHf_/view?usp=sharing)      | [Mirror](https://ucdavis365-my.sharepoint.com/:u:/g/personal/yongjaelee_ucdavis_edu/EYRWxBEoKU9DiblrWx2M89MBGFkVVB_drlRd_v5sdT3Hgg) |
+| 700        | Resnet101-FPN | 23.6 | 31.2 | [yolact_im700_54_800000.pth](https://drive.google.com/file/d/1lE4Lz5p25teiXV-6HdTiOJSnS7u7GBzg/view?usp=sharing)     | [Mirror](https://ucdavis365-my.sharepoint.com/:u:/g/personal/yongjaelee_ucdavis_edu/Eagg5RSc5hFEhp7sPtvLNyoBjhlf2feog7t8OQzHKKphjw) |
+
+To evaluate a model, put the corresponding weights file in the `./weights` directory and run one of the following commands.
+## Quantitative Results on COCO
+```Shell
+# Quantitatively evaluate a trained model on the entire validation set. Make sure you have COCO downloaded as above.
+# This got 29.92 validation mask mAP the last time I checked.
+python eval.py --trained_model=weights/yolact_base_54_800000.pth
+
+# Output a COCOEval json to submit to the website or to use the run_coco_eval.py script.
+# This command will create './results/bbox_detections.json' and './results/mask_detections.json' for detection and instance segmentation respectively.
+python eval.py --trained_model=weights/yolact_base_54_800000.pth --output_coco_json
+
+# You can run COCOEval on the files created in the previous command. The performance should match my implementation in eval.py.
+python run_coco_eval.py
+
+# To output a COCO json file for test-dev, make sure you have test-dev downloaded as above and run
+python eval.py --trained_model=weights/yolact_base_54_800000.pth --output_coco_json --dataset=coco2017_testdev_dataset
+```
+## Qualitative Results on COCO
+```Shell
+# Display qualitative results on COCO. From here on I'll use a confidence threshold of 0.3.
+python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --top_k=100 --display
+```
+## Benchmarking on COCO
+```Shell
+# Run just the raw model on the first 1k images of the validation set
+python eval.py --trained_model=weights/yolact_base_54_800000.pth --benchmark --max_images=1000
+```
+## Images
+```Shell
+# Display qualitative results on the specified image.
+python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --top_k=100 --image=my_image.png
+
+# Process an image and save it to another file.
+python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --top_k=100 --image=input_image.png:output_image.png
+
+# Process a whole folder of images.
+python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --top_k=100 --images=path/to/input/folder:path/to/output/folder
+```
+## Video
+```Shell
+# Display a video in real-time. "--video_multiframe" will process that many frames at once for improved performance.
+python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --top_k=100 --video_multiframe=2 --video=my_video.mp4
+
+# Display a webcam feed in real-time. If you have multiple webcams pass the index of the webcam you want instead of 0.
+python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --top_k=100 --video_multiframe=2 --video=0
+
+# Process a video and save it to another file. This is unoptimized.
+python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --top_k=100 --video=input_video.mp4:output_video.mp4
+```
+As you can tell, `eval.py` can do a ton of stuff. Pass the `--help` flag to see everything it can do.
+```Shell
+python eval.py --help
+```
+
+
+# Training
+By default, we train on COCO. Make sure to download the entire dataset using the commands above.
+ - To train, grab an imagenet-pretrained model and put it in `./weights`.
+   - For Resnet101, download `resnet101_reducedfc.pth` from [here](https://drive.google.com/file/d/1tvqFPd4bJtakOlmn-uIA492g2qurRChj/view?usp=sharing).
+   - For Resnet50, download `resnet50-19c8e357.pth` from [here](https://drive.google.com/file/d/1Jy3yCdbatgXa5YYIdTCRrSV0S9V5g1rn/view?usp=sharing).
+   - For Darknet53, download `darknet53.pth` from [here](https://drive.google.com/file/d/17Y431j4sagFpSReuPNoFcj9h7azDTZFf/view?usp=sharing).
+ - Run one of the training commands below.
+   - Note that you can press ctrl+c while training and it will save an `*_interrupt.pth` file at the current iteration.
+   - All weights are saved in the `./weights` directory by default with the file name `<config>_<epoch>_<iter>.pth`.
+```Shell
+# Trains using the base config with a batch size of 8 (the default).
+python train.py --config=yolact_base_config
+
+# Trains yolact_base_config with a batch_size of 5. For the 550px models, 1 batch takes up around 1.5 gigs of VRAM, so specify accordingly.
+python train.py --config=yolact_base_config --batch_size=5
+
+# Resume training yolact_base with a specific weight file and start from the iteration specified in the weight file's name.
+python train.py --config=yolact_base_config --resume=weights/yolact_base_10_32100.pth --start_iter=-1
+
+# Use the help option to see a description of all available command line arguments
+python train.py --help
+```
+
+## Custom Datasets
+You can also train on your own dataset by following these steps:
+ - Create a COCO-style Object Detection JSON annotation file for your dataset. The specification for this can be found [here](http://cocodataset.org/#format-data). Note that we don't use some fields, so the following may be omitted:
+   - `info`
+   - `licenses`
+   - Under `image`: `license, flickr_url, coco_url, date_captured`
+   - `categories` (we use our own format for categories, see below)
+ - Create a definition for your dataset under `dataset_base` in `data/config.py` (see the comments in `dataset_base` for an explanation of each field):
+```Python
+my_custom_dataset = dataset_base.copy({
+    'name': 'My Dataset',
+
+    'train_images': 'path_to_training_images',
+    'train_info':   'path_to_training_annotation',
+
+    'valid_images': 'path_to_validation_images',
+    'valid_info':   'path_to_validation_annotation',
+
+    'has_gt': True,
+    'class_names': ('my_class_id_1', 'my_class_id_2', 'my_class_id_3', ...)
+})
+```
+ - A couple things to note:
+   - Class IDs in the annotation file should start at 1 and increase sequentially in the order of `class_names`. If this isn't the case for your annotation file (like in COCO), see the field `label_map` in `dataset_base`.
+   - If you do not want to create a validation split, use the same image path and annotations file for validation. By default (see `python train.py --help`), `train.py` will output validation mAP for the first 5000 images in the dataset every 2 epochs.
+ - Finally, in `yolact_base_config` in the same file, change the value for `'dataset'` to `'my_custom_dataset'` or whatever you named the config object above. Then you can use any of the training commands in the previous section.
+
+#### Creating a Custom Dataset from Scratch
+See [this nice post by @Amit12690](https://github.com/dbolya/yolact/issues/70#issuecomment-504283008) for tips on how to annotate a custom dataset and prepare it for use with YOLACT.
+
+
+
+
+# Citation
+If you use YOLACT or this code base in your work, please cite
+```
+@article{bolya-arxiv2019,
+  author    = {Daniel Bolya and Chong Zhou and Fanyi Xiao and Yong Jae Lee},
+  title     = {YOLACT: {Real-time} Instance Segmentation},
+  journal   = {arXiv},
+  year      = {2019},
+}
+```
+
+
+
+# Contact
+For questions about our paper or code, please contact [Daniel Bolya](mailto:[email protected]).
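A note on the Custom Datasets step in the README above: class IDs are expected to start at 1 and follow the order of `class_names`, with `label_map` as the escape hatch when they don't. The sketch below is one way to derive such a mapping. It assumes that `label_map` is a dict from each annotation category ID to the 1-based position of that class in `class_names`, which is an inference from the README text, not something confirmed against `data/config.py`. The helper names and the sample categories are hypothetical.

```python
def build_label_map(categories, class_names):
    """Map each annotation category id to its 1-based index in class_names.

    `categories` is a list of {"id": int, "name": str} dicts, as found in a
    COCO-style annotation file (hypothetical input for this sketch).
    """
    name_to_index = {name: i + 1 for i, name in enumerate(class_names)}
    label_map = {}
    for cat in categories:
        if cat["name"] not in name_to_index:
            raise ValueError(f"category {cat['name']!r} is not in class_names")
        label_map[cat["id"]] = name_to_index[cat["name"]]
    return label_map


def needs_label_map(label_map):
    """Return True when the ids are not already 1..N in class_names order."""
    return any(cat_id != index for cat_id, index in label_map.items())


# Hypothetical categories with non-sequential ids, similar in spirit to COCO.
categories = [
    {"id": 3, "name": "cat"},
    {"id": 7, "name": "dog"},
    {"id": 12, "name": "bird"},
]
class_names = ("cat", "dog", "bird")

label_map = build_label_map(categories, class_names)
print(label_map)                   # {3: 1, 7: 2, 12: 3}
print(needs_label_map(label_map))  # True
```

If `needs_label_map` returns `False`, the annotation IDs already match the order of `class_names` and no mapping should be needed. Otherwise the resulting dict is a candidate value for the `label_map` field, under the assumption stated above.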