diff --git a/.github/CODE_OF_CONDUCT.md b/.github/CODE_OF_CONDUCT.md
index efd4305798..92afad1c5a 100644
--- a/.github/CODE_OF_CONDUCT.md
+++ b/.github/CODE_OF_CONDUCT.md
@@ -14,22 +14,22 @@ appearance, race, religion, or sexual identity and orientation.
Examples of behavior that contributes to creating a positive environment
include:
-* Using welcoming and inclusive language
-* Being respectful of differing viewpoints and experiences
-* Gracefully accepting constructive criticism
-* Focusing on what is best for the community
-* Showing empathy towards other community members
+- Using welcoming and inclusive language
+- Being respectful of differing viewpoints and experiences
+- Gracefully accepting constructive criticism
+- Focusing on what is best for the community
+- Showing empathy towards other community members
Examples of unacceptable behavior by participants include:
-* The use of sexualized language or imagery and unwelcome sexual attention or
- advances
-* Trolling, insulting/derogatory comments, and personal or political attacks
-* Public or private harassment
-* Publishing others' private information, such as a physical or electronic
- address, without explicit permission
-* Other conduct which could reasonably be considered inappropriate in a
- professional setting
+- The use of sexualized language or imagery and unwelcome sexual attention or
+ advances
+- Trolling, insulting/derogatory comments, and personal or political attacks
+- Public or private harassment
+- Publishing others' private information, such as a physical or electronic
+ address, without explicit permission
+- Other conduct which could reasonably be considered inappropriate in a
+ professional setting
## Our Responsibilities
@@ -70,7 +70,7 @@ members of the project's leadership.
This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
-[homepage]: https://www.contributor-covenant.org
-
For answers to common questions about this code of conduct, see
https://www.contributor-covenant.org/faq
+
+[homepage]: https://www.contributor-covenant.org
diff --git a/.github/ISSUE_TEMPLATE/error-report.md b/.github/ISSUE_TEMPLATE/error-report.md
index 703a7d44b7..ee65286207 100644
--- a/.github/ISSUE_TEMPLATE/error-report.md
+++ b/.github/ISSUE_TEMPLATE/error-report.md
@@ -4,12 +4,12 @@ about: Create a report to help us improve
title: ''
labels: ''
assignees: ''
-
---
Thanks for your error report and we appreciate it a lot.
**Checklist**
+
1. I have searched related issues but cannot get the expected help.
2. The bug has not been fixed in the latest version.
@@ -17,6 +17,7 @@ Thanks for your error report and we appreciate it a lot.
A clear and concise description of what the bug is.
**Reproduction**
+
1. What command or script did you run?
```
@@ -30,8 +31,8 @@ A placeholder for the command.
1. Please run `python mmdet3d/utils/collect_env.py` to collect necessary environment information and paste it here.
2. You may add additional information that may be helpful for locating the problem, such as
- - How you installed PyTorch [e.g., pip, conda, source]
- - Other environment variables that may be related (such as `$PATH`, `$LD_LIBRARY_PATH`, `$PYTHONPATH`, etc.)
+ - How you installed PyTorch \[e.g., pip, conda, source\]
+ - Other environment variables that may be related (such as `$PATH`, `$LD_LIBRARY_PATH`, `$PYTHONPATH`, etc.)
**Error traceback**
If applicable, paste the error traceback here.
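For context, the environment section of this template asks reporters to run the collection script named above. A minimal sketch of the commands a reporter might use, assuming the mmdetection3d repository root is the working directory and the package is installed (the extra version checks are only illustrative and not part of the template):

```shell
# Collect the environment information requested by the template.
python mmdet3d/utils/collect_env.py

# Optional, illustrative sanity checks on the installed PyTorch / mmdet3d builds.
python -c "import torch; print(torch.__version__, torch.version.cuda)"
python -c "import mmdet3d; print(mmdet3d.__version__)"
```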
diff --git a/.github/ISSUE_TEMPLATE/feature_request.md b/.github/ISSUE_TEMPLATE/feature_request.md
index 33f9d5f235..7bf92e8c91 100644
--- a/.github/ISSUE_TEMPLATE/feature_request.md
+++ b/.github/ISSUE_TEMPLATE/feature_request.md
@@ -4,15 +4,14 @@ about: Suggest an idea for this project
title: ''
labels: ''
assignees: ''
-
---
**Describe the feature**
**Motivation**
A clear and concise description of the motivation of the feature.
-Ex1. It is inconvenient when [....].
-Ex2. There is a recent paper [....], which is very helpful for [....].
+Ex1. It is inconvenient when \[....\].
+Ex2. There is a recent paper \[....\], which is very helpful for \[....\].
**Related resources**
If there is an official code release or third-party implementations, please also provide the information here, which would be very helpful.
diff --git a/.github/ISSUE_TEMPLATE/general_questions.md b/.github/ISSUE_TEMPLATE/general_questions.md
index b5a6451a6c..f02dd63a80 100644
--- a/.github/ISSUE_TEMPLATE/general_questions.md
+++ b/.github/ISSUE_TEMPLATE/general_questions.md
@@ -4,5 +4,4 @@ about: Ask general questions to get help
title: ''
labels: ''
assignees: ''
-
---
diff --git a/.github/ISSUE_TEMPLATE/reimplementation_questions.md b/.github/ISSUE_TEMPLATE/reimplementation_questions.md
index 9c17fb6794..68637b6436 100644
--- a/.github/ISSUE_TEMPLATE/reimplementation_questions.md
+++ b/.github/ISSUE_TEMPLATE/reimplementation_questions.md
@@ -2,25 +2,27 @@
name: Reimplementation Questions
about: Ask about questions during model reimplementation
title: ''
-labels: 'reimplementation'
+labels: reimplementation
assignees: ''
-
---
**Notice**
There are several common situations in the reimplementation issues as below
+
1. Reimplement a model in the model zoo using the provided configs
2. Reimplement a model in the model zoo on another dataset (e.g., custom datasets)
3. Reimplement a custom model but all the components are implemented in MMDetection3D
4. Reimplement a custom model with new modules implemented by yourself
There are several things to do for different cases as below.
+
- For case 1 & 3, please follow the steps in the following sections thus we could help to quick identify the issue.
- For case 2 & 4, please understand that we are not able to do much help here because we usually do not know the full code and the users should be responsible to the code they write.
- One suggestion for case 2 & 4 is that the users should first check whether the bug lies in the self-implemted code or the original code. For example, users can first make sure that the same model runs well on supported datasets. If you still need help, please describe what you have done and what you obtain in the issue, and follow the steps in the following sections and try as clear as possible so that we can better help you.
**Checklist**
+
1. I have searched related issues but cannot get the expected help.
2. The issue has not been fixed in the latest version.
@@ -29,6 +31,7 @@ There are several things to do for different cases as below.
A clear and concise description of the problem you met and what you have done.
**Reproduction**
+
1. What command or script did you run?
```
@@ -48,8 +51,8 @@ A placeholder for the config.
1. Please run `python mmdet3d/utils/collect_env.py` to collect necessary environment information and paste it here.
2. You may add additional information that may be helpful for locating the problem, such as
- - How you installed PyTorch [e.g., pip, conda, source]
- - Other environment variables that may be related (such as `$PATH`, `$LD_LIBRARY_PATH`, `$PYTHONPATH`, etc.)
+ - How you installed PyTorch \[e.g., pip, conda, source\]
+ - Other environment variables that may be related (such as `$PATH`, `$LD_LIBRARY_PATH`, `$PYTHONPATH`, etc.)
**Results**
diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md
index d130aef7d9..3668d833ba 100644
--- a/.github/pull_request_template.md
+++ b/.github/pull_request_template.md
@@ -1,16 +1,20 @@
Thanks for your contribution and we appreciate it a lot. The following instructions will help make your pull request healthier and get feedback more easily. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers.
## Motivation
+
Please describe the motivation of this PR and the goal you want to achieve through this PR.
## Modification
+
Please briefly describe what modification is made in this PR.
## BC-breaking (Optional)
+
Does the modification introduce changes that break the backward compatibility of the downstream repos?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.
## Use cases (Optional)
+
If this PR introduces a new feature, it is better to list some use cases here, and update the documentation.
## Checklist
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index 5bd88c5a27..790bfb1f74 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -1,5 +1,5 @@
repos:
- - repo: https://gitlab.com/pycqa/flake8.git
+ - repo: https://github.com/PyCQA/flake8
rev: 3.8.3
hooks:
- id: flake8
@@ -24,16 +24,19 @@ repos:
args: ["--remove"]
- id: mixed-line-ending
args: ["--fix=lf"]
- - repo: https://github.com/markdownlint/markdownlint
- rev: v0.11.0
- hooks:
- - id: markdownlint
- args: ["-r", "~MD002,~MD013,~MD029,~MD033,~MD034",
- "-t", "allow_different_nesting"]
- repo: https://github.com/codespell-project/codespell
rev: v2.1.0
hooks:
- id: codespell
+ - repo: https://github.com/executablebooks/mdformat
+ rev: 0.7.14
+ hooks:
+ - id: mdformat
+ args: [ "--number" ]
+ additional_dependencies:
+ - mdformat-gfm
+ - mdformat_frontmatter
+ - linkify-it-py
- repo: https://github.com/myint/docformatter
rev: v1.3.1
hooks:
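The hunk above replaces the markdownlint hook with mdformat (plus GFM, front-matter, and linkify plugins), which is what produces the table and list reformatting seen throughout the rest of this diff. A minimal sketch of how a contributor might run the updated hooks locally, assuming `pre-commit` itself is installed; the hook id `mdformat` is taken from the config above:

```shell
# Install the git hooks defined in .pre-commit-config.yaml.
pre-commit install

# Run all configured hooks once over the whole repository.
pre-commit run --all-files

# Or run only the newly added Markdown formatter.
pre-commit run mdformat --all-files
```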
diff --git a/README.md b/README.md
index f50e3b34c9..9fc9127476 100644
--- a/README.md
+++ b/README.md
@@ -24,7 +24,6 @@
[](https://codecov.io/gh/open-mmlab/mmdetection3d)
[](https://github.com/open-mmlab/mmdetection3d/blob/master/LICENSE)
-
**News**: We released the codebase v1.0.0rc2.
Note: We are going through a large refactoring to provide simpler and more unified usage of many modules.
@@ -69,13 +68,13 @@ a part of the OpenMMLab project developed by [MMLab](http://mmlab.ie.cuhk.edu.hk
It trains faster than other codebases. The main results are as below. Details can be found in [benchmark.md](./docs/en/benchmarks.md). We compare the number of samples trained per second (the higher, the better). The models that are not supported by other codebases are marked by `×`.
- | Methods | MMDetection3D | [OpenPCDet](https://github.com/open-mmlab/OpenPCDet) |[votenet](https://github.com/facebookresearch/votenet)| [Det3D](https://github.com/poodarchu/Det3D) |
- |:-------:|:-------------:|:---------:|:-----:|:-----:|
- | VoteNet | 358 | × | 77 | × |
- | PointPillars-car| 141 | × | × | 140 |
- | PointPillars-3class| 107 |44 | × | × |
- | SECOND| 40 |30 | × | × |
- | Part-A2| 17 |14 | × | × |
+ | Methods | MMDetection3D | [OpenPCDet](https://github.com/open-mmlab/OpenPCDet) | [votenet](https://github.com/facebookresearch/votenet) | [Det3D](https://github.com/poodarchu/Det3D) |
+ | :-----------------: | :-----------: | :--------------------------------------------------: | :----------------------------------------------------: | :-----------------------------------------: |
+ | VoteNet | 358 | × | 77 | × |
+ | PointPillars-car | 141 | × | × | 140 |
+ | PointPillars-3class | 107 | 44 | × | × |
+ | SECOND | 40 | 30 | × | × |
+ | Part-A2 | 17 | 14 | × | × |
Like [MMDetection](https://github.com/open-mmlab/mmdetection) and [MMCV](https://github.com/open-mmlab/mmcv), MMDetection3D can also be used as a library to support different projects on top of it.
@@ -214,28 +213,28 @@ Results and models are available in the [model zoo](docs/en/model_zoo.md).
-| | ResNet | ResNeXt | SENet |PointNet++ |DGCNN | HRNet | RegNetX | Res2Net | DLA |
-|--------------------|:--------:|:--------:|:--------:|:---------:|:---------:|:-----:|:--------:|:-----:|:---:|
-| SECOND | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗
-| PointPillars | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗
-| FreeAnchor | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗
-| VoteNet | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗
-| H3DNet | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗
-| 3DSSD | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗
-| Part-A2 | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗
-| MVXNet | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗
-| CenterPoint | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗
-| SSN | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗
-| ImVoteNet | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗
-| FCOS3D | ✓ | ☐ | ☐ | ✗ | ✗ | ☐ | ☐ | ☐ | ✗
-| PointNet++ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗
-| Group-Free-3D | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗
-| ImVoxelNet | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗
-| PAConv | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗
-| DGCNN | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗
-| SMOKE | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓
-| PGD | ✓ | ☐ | ☐ | ✗ | ✗ | ☐ | ☐ | ☐ | ✗
-| MonoFlex | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓
+| | ResNet | ResNeXt | SENet | PointNet++ | DGCNN | HRNet | RegNetX | Res2Net | DLA |
+| ------------- | :----: | :-----: | :---: | :--------: | :---: | :---: | :-----: | :-----: | :-: |
+| SECOND | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗ |
+| PointPillars | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗ |
+| FreeAnchor | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗ |
+| VoteNet | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
+| H3DNet | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
+| 3DSSD | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
+| Part-A2 | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗ |
+| MVXNet | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗ |
+| CenterPoint | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗ |
+| SSN | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗ |
+| ImVoteNet | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
+| FCOS3D | ✓ | ☐ | ☐ | ✗ | ✗ | ☐ | ☐ | ☐ | ✗ |
+| PointNet++ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
+| Group-Free-3D | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
+| ImVoxelNet | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
+| PAConv | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
+| DGCNN | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ |
+| SMOKE | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
+| PGD | ✓ | ☐ | ☐ | ✗ | ✗ | ☐ | ☐ | ☐ | ✗ |
+| MonoFlex | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
**Note:** All of the about **300+ models, methods of 40+ papers** in 2D detection supported by [MMDetection](https://github.com/open-mmlab/mmdetection/blob/master/docs/en/model_zoo.md) can be trained or used in this codebase.
@@ -252,6 +251,7 @@ Please refer to [FAQ](docs/en/faq.md) for frequently asked questions. When updat
## Model deployment
Now MMDeploy has supported some MMDetection3D model deployment. Please refer to [model_deployment.md](docs/en/tutorials/model_deployment.md) for more details.
+
## Citation
If you find this project useful in your research, please consider citing:
diff --git a/README_zh-CN.md b/README_zh-CN.md
index c23867d13a..7fbc95beb8 100644
--- a/README_zh-CN.md
+++ b/README_zh-CN.md
@@ -24,7 +24,6 @@
[](https://codecov.io/gh/open-mmlab/mmdetection3d)
[](https://github.com/open-mmlab/mmdetection3d/blob/master/LICENSE)
-
**新闻**: 我们发布了版本 v1.0.0rc2.
说明:我们正在进行大规模的重构,以提供对许多模块更简单、更统一的使用。
@@ -63,19 +62,19 @@ MMDetection3D 是一个基于 PyTorch 的目标检测开源工具箱, 下一代
- **与 2D 检测器的自然整合**
- [MMDetection](https://github.com/open-mmlab/mmdetection/blob/master/docs/zh_cn/model_zoo.md) 支持的**300+个模型 , 40+的论文算法**, 和相关模块都可以在此代码库中训练或使用。
+ [MMDetection](https://github.com/open-mmlab/mmdetection/blob/master/docs/zh_cn/model_zoo.md) 支持的**300+个模型 , 40+的论文算法**, 和相关模块都可以在此代码库中训练或使用。
- **性能高**
- 训练速度比其他代码库更快。下表可见主要的对比结果。更多的细节可见[基准测评文档](./docs/zh_cn/benchmarks.md)。我们对比了每秒训练的样本数(值越高越好)。其他代码库不支持的模型被标记为 `×`。
+ 训练速度比其他代码库更快。下表可见主要的对比结果。更多的细节可见[基准测评文档](./docs/zh_cn/benchmarks.md)。我们对比了每秒训练的样本数(值越高越好)。其他代码库不支持的模型被标记为 `×`。
- | Methods | MMDetection3D | [OpenPCDet](https://github.com/open-mmlab/OpenPCDet) |[votenet](https://github.com/facebookresearch/votenet)| [Det3D](https://github.com/poodarchu/Det3D) |
- |:-------:|:-------------:|:---------:|:-----:|:-----:|
- | VoteNet | 358 | × | 77 | × |
- | PointPillars-car| 141 | × | × | 140 |
- | PointPillars-3class| 107 |44 | × | × |
- | SECOND| 40 |30 | × | × |
- | Part-A2| 17 |14 | × | × |
+ | Methods | MMDetection3D | [OpenPCDet](https://github.com/open-mmlab/OpenPCDet) | [votenet](https://github.com/facebookresearch/votenet) | [Det3D](https://github.com/poodarchu/Det3D) |
+ | :-----------------: | :-----------: | :--------------------------------------------------: | :----------------------------------------------------: | :-----------------------------------------: |
+ | VoteNet | 358 | × | 77 | × |
+ | PointPillars-car | 141 | × | × | 140 |
+ | PointPillars-3class | 107 | 44 | × | × |
+ | SECOND | 40 | 30 | × | × |
+ | Part-A2 | 17 | 14 | × | × |
和 [MMDetection](https://github.com/open-mmlab/mmdetection),[MMCV](https://github.com/open-mmlab/mmcv) 一样, MMDetection3D 也可以作为一个库去支持各式各样的项目.
@@ -214,29 +213,28 @@ MMDetection3D 是一个基于 PyTorch 的目标检测开源工具箱, 下一代
-| | ResNet | ResNeXt | SENet |PointNet++ |DGCNN | HRNet | RegNetX | Res2Net | DLA |
-|--------------------|:--------:|:--------:|:--------:|:---------:|:---------:|:-----:|:--------:|:-----:|:---:|
-| SECOND | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗
-| PointPillars | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗
-| FreeAnchor | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗
-| VoteNet | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗
-| H3DNet | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗
-| 3DSSD | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗
-| Part-A2 | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗
-| MVXNet | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗
-| CenterPoint | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗
-| SSN | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗
-| ImVoteNet | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗
-| FCOS3D | ✓ | ☐ | ☐ | ✗ | ✗ | ☐ | ☐ | ☐ | ✗
-| PointNet++ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗
-| Group-Free-3D | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗
-| ImVoxelNet | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗
-| PAConv | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗
-| DGCNN | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗
-| SMOKE | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓
-| PGD | ✓ | ☐ | ☐ | ✗ | ✗ | ☐ | ☐ | ☐ | ✗
-| MonoFlex | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓
-
+| | ResNet | ResNeXt | SENet | PointNet++ | DGCNN | HRNet | RegNetX | Res2Net | DLA |
+| ------------- | :----: | :-----: | :---: | :--------: | :---: | :---: | :-----: | :-----: | :-: |
+| SECOND | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗ |
+| PointPillars | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗ |
+| FreeAnchor | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗ |
+| VoteNet | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
+| H3DNet | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
+| 3DSSD | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
+| Part-A2 | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗ |
+| MVXNet | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗ |
+| CenterPoint | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗ |
+| SSN | ☐ | ☐ | ☐ | ✗ | ✗ | ☐ | ✓ | ☐ | ✗ |
+| ImVoteNet | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
+| FCOS3D | ✓ | ☐ | ☐ | ✗ | ✗ | ☐ | ☐ | ☐ | ✗ |
+| PointNet++ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
+| Group-Free-3D | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
+| ImVoxelNet | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
+| PAConv | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
+| DGCNN | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ |
+| SMOKE | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
+| PGD | ✓ | ☐ | ☐ | ✗ | ✗ | ☐ | ☐ | ☐ | ✗ |
+| MonoFlex | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
**注意:** [MMDetection](https://github.com/open-mmlab/mmdetection/blob/master/docs/zh_cn/model_zoo.md) 支持的基于2D检测的**300+个模型 , 40+的论文算法**在 MMDetection3D 中都可以被训练或使用。
diff --git a/configs/3dssd/README.md b/configs/3dssd/README.md
index 8955288125..4feb6d7673 100644
--- a/configs/3dssd/README.md
+++ b/configs/3dssd/README.md
@@ -17,6 +17,7 @@ Currently, there have been many kinds of voxel-based 3D single stage detectors,
We implement 3DSSD and provide the results and checkpoints on KITTI datasets.
Some settings in our implementation are different from the [official implementation](https://github.com/Jia-Research-Lab/3DSSD), which bring marginal differences to the performance on KITTI datasets in our experiments. To simplify and unify the models of our implementation, we skip them in our models. These differences are listed as below:
+
1. We keep the scenes without any object while the official code skips these scenes in training. In the official implementation, only 3229 and 3394 samples are used as training and validation sets, respectively. In our implementation, we keep using 3712 and 3769 samples as training and validation sets, respectively, as those used for all the other models in our implementation on KITTI datasets.
2. We do not modify the decay of `batch normalization` during training.
3. While using [`DataBaseSampler`](https://github.com/open-mmlab/mmdetection3d/blob/master/mmdet3d/datasets/pipelines/dbsampler.py#L80) for data augmentation, the official code uses road planes as reference to place the sampled objects while we do not.
@@ -26,11 +27,11 @@ Some settings in our implementation are different from the [official implementat
### KITTI
-| Backbone |Class| Lr schd | Mem (GB) | Inf time (fps) | mAP |Download |
-| :---------: | :-----: | :------: | :------------: | :----: |:----: | :------: |
-| [PointNet2SAMSG](./3dssd_4x4_kitti-3d-car.py)| Car |72e|4.7||78.58(81.27)1|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/3dssd/3dssd_4x4_kitti-3d-car/3dssd_4x4_kitti-3d-car_20210818_203828-b89c8fc4.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/3dssd/3dssd_4x4_kitti-3d-car/3dssd_4x4_kitti-3d-car_20210818_203828.log.json)|
+| Backbone | Class | Lr schd | Mem (GB) | Inf time (fps) | mAP | Download |
+| :-------------------------------------------: | :---: | :-----: | :------: | :------------: | :----------------------: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [PointNet2SAMSG](./3dssd_4x4_kitti-3d-car.py) | Car | 72e | 4.7 | | 78.58(81.27)1 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/3dssd/3dssd_4x4_kitti-3d-car/3dssd_4x4_kitti-3d-car_20210818_203828-b89c8fc4.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/3dssd/3dssd_4x4_kitti-3d-car/3dssd_4x4_kitti-3d-car_20210818_203828.log.json) |
-[1]: We report two different 3D object detection performance here. 78.58mAP is evaluated by our evaluation code and 81.27mAP is evaluated by the official development kit (so as that used in the paper and official code of 3DSSD ). We found that the commonly used Python implementation of [`rotate_iou`](https://github.com/traveller59/second.pytorch/blob/e42e4a0e17262ab7d180ee96a0a36427f2c20a44/second/core/non_max_suppression/nms_gpu.py#L605) which is used in our KITTI dataset evaluation, is different from the official implementation in [KITTI benchmark](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d).
+\[1\]: We report two different 3D object detection results here. 78.58 mAP is evaluated by our evaluation code and 81.27 mAP is evaluated by the official development kit (the same as that used in the paper and official code of 3DSSD). We found that the commonly used Python implementation of [`rotate_iou`](https://github.com/traveller59/second.pytorch/blob/e42e4a0e17262ab7d180ee96a0a36427f2c20a44/second/core/non_max_suppression/nms_gpu.py#L605), which is used in our KITTI dataset evaluation, is different from the official implementation in the [KITTI benchmark](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d).
## Citation
diff --git a/configs/centerpoint/README.md b/configs/centerpoint/README.md
index 76016cb789..7f7545c2f5 100644
--- a/configs/centerpoint/README.md
+++ b/configs/centerpoint/README.md
@@ -108,23 +108,23 @@ data = dict(
### CenterPoint
-|Backbone| Voxel type (voxel size) |Dcn|Circular nms| Mem (GB) | Inf time (fps) | mAP |NDS| Download |
-| :---------: |:-----: |:-----: | :------: | :------------: | :----: |:----: | :------: |:------: |
-|[SECFPN](./centerpoint_01voxel_second_secfpn_circlenms_4x8_cyclic_20e_nus.py)|voxel (0.1)|✗|✓|4.9| |56.19|64.43|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/centerpoint/centerpoint_01voxel_second_secfpn_circlenms_4x8_cyclic_20e_nus/centerpoint_01voxel_second_secfpn_circlenms_4x8_cyclic_20e_nus_20201001_135205-5db91e00.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/centerpoint/centerpoint_01voxel_second_secfpn_circlenms_4x8_cyclic_20e_nus/centerpoint_01voxel_second_secfpn_circlenms_4x8_cyclic_20e_nus_20201001_135205.log.json)|
-|above w/o circle nms|voxel (0.1)|✗|✗| | |56.56|64.46||
-|[SECFPN](./centerpoint_01voxel_second_secfpn_dcn_circlenms_4x8_cyclic_20e_nus.py)|voxel (0.1)|✓|✓|5.2| |56.34|64.81|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/centerpoint/centerpoint_01voxel_second_secfpn_dcn_circlenms_4x8_cyclic_20e_nus/centerpoint_01voxel_second_secfpn_dcn_circlenms_4x8_cyclic_20e_nus_20201004_075317-26d8176c.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/centerpoint/centerpoint_01voxel_second_secfpn_dcn_circlenms_4x8_cyclic_20e_nus/centerpoint_01voxel_second_secfpn_dcn_circlenms_4x8_cyclic_20e_nus_20201004_075317.log.json)|
-|above w/o circle nms|voxel (0.1)|✓|✗| | |56.60|64.90||
-|[SECFPN](./centerpoint_0075voxel_second_secfpn_circlenms_4x8_cyclic_20e_nus.py)|voxel (0.075)|✗|✓|7.8| |57.34|65.23|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/centerpoint/centerpoint_0075voxel_second_secfpn_circlenms_4x8_cyclic_20e_nus/centerpoint_0075voxel_second_secfpn_circlenms_4x8_cyclic_20e_nus_20200925_230905-358fbe3b.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/centerpoint/centerpoint_0075voxel_second_secfpn_circlenms_4x8_cyclic_20e_nus/centerpoint_0075voxel_second_secfpn_circlenms_4x8_cyclic_20e_nus_20200925_230905.log.json)|
-|above w/o circle nms|voxel (0.075)|✗|✗| | |57.63|65.39| |
-|[SECFPN](./centerpoint_0075voxel_second_secfpn_dcn_circlenms_4x8_cyclic_20e_nus.py)|voxel (0.075)|✓|✓|8.5| |57.27|65.58|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/centerpoint/centerpoint_0075voxel_second_secfpn_dcn_circlenms_4x8_cyclic_20e_nus/centerpoint_0075voxel_second_secfpn_dcn_circlenms_4x8_cyclic_20e_nus_20200930_201619-67c8496f.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/centerpoint/centerpoint_0075voxel_second_secfpn_dcn_circlenms_4x8_cyclic_20e_nus/centerpoint_0075voxel_second_secfpn_dcn_circlenms_4x8_cyclic_20e_nus_20200930_201619.log.json)|
-|above w/o circle nms|voxel (0.075)|✓|✗| | |57.43|65.63||
-|above w/ double flip|voxel (0.075)|✓|✗| | |59.73|67.39||
-|above w/ scale tta|voxel (0.075)|✓|✗| | |60.43|67.65||
-|above w/ circle nms w/o scale tta|voxel (0.075)|✓|✗| | |59.52|67.24||
-|[SECFPN](./centerpoint_02pillar_second_secfpn_circlenms_4x8_cyclic_20e_nus.py)|pillar (0.2)|✗|✓|4.4| |49.07|59.66|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/centerpoint/centerpoint_02pillar_second_secfpn_circlenms_4x8_cyclic_20e_nus/centerpoint_02pillar_second_secfpn_circlenms_4x8_cyclic_20e_nus_20201004_170716-a134a233.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/centerpoint/centerpoint_02pillar_second_secfpn_circlenms_4x8_cyclic_20e_nus/centerpoint_02pillar_second_secfpn_circlenms_4x8_cyclic_20e_nus_20201004_170716.log.json)|
-|above w/o circle nms|pillar (0.2)|✗|✗| | |49.12|59.66||
-|[SECFPN](./centerpoint_02pillar_second_secfpn_dcn_4x8_cyclic_20e_nus.py)|pillar (0.2)|✓|✗| 4.6| |48.8 |59.67 |[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/centerpoint/centerpoint_02pillar_second_secfpn_dcn_4x8_cyclic_20e_nus/centerpoint_02pillar_second_secfpn_dcn_4x8_cyclic_20e_nus_20200930_103722-3bb135f2.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/centerpoint/centerpoint_02pillar_second_secfpn_dcn_4x8_cyclic_20e_nus/centerpoint_02pillar_second_secfpn_dcn_4x8_cyclic_20e_nus_20200930_103722.log.json)|
-|above w/ circle nms|pillar (0.2)|✓|✓| | |48.79|59.65||
+| Backbone | Voxel type (voxel size) | Dcn | Circular nms | Mem (GB) | Inf time (fps) | mAP | NDS | Download |
+| :---------------------------------------------------------------------------------: | :---------------------: | :-: | :----------: | :------: | :------------: | :---: | :---: | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [SECFPN](./centerpoint_01voxel_second_secfpn_circlenms_4x8_cyclic_20e_nus.py) | voxel (0.1) | ✗ | ✓ | 4.9 | | 56.19 | 64.43 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/centerpoint/centerpoint_01voxel_second_secfpn_circlenms_4x8_cyclic_20e_nus/centerpoint_01voxel_second_secfpn_circlenms_4x8_cyclic_20e_nus_20201001_135205-5db91e00.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/centerpoint/centerpoint_01voxel_second_secfpn_circlenms_4x8_cyclic_20e_nus/centerpoint_01voxel_second_secfpn_circlenms_4x8_cyclic_20e_nus_20201001_135205.log.json) |
+| above w/o circle nms | voxel (0.1) | ✗ | ✗ | | | 56.56 | 64.46 | |
+| [SECFPN](./centerpoint_01voxel_second_secfpn_dcn_circlenms_4x8_cyclic_20e_nus.py) | voxel (0.1) | ✓ | ✓ | 5.2 | | 56.34 | 64.81 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/centerpoint/centerpoint_01voxel_second_secfpn_dcn_circlenms_4x8_cyclic_20e_nus/centerpoint_01voxel_second_secfpn_dcn_circlenms_4x8_cyclic_20e_nus_20201004_075317-26d8176c.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/centerpoint/centerpoint_01voxel_second_secfpn_dcn_circlenms_4x8_cyclic_20e_nus/centerpoint_01voxel_second_secfpn_dcn_circlenms_4x8_cyclic_20e_nus_20201004_075317.log.json) |
+| above w/o circle nms | voxel (0.1) | ✓ | ✗ | | | 56.60 | 64.90 | |
+| [SECFPN](./centerpoint_0075voxel_second_secfpn_circlenms_4x8_cyclic_20e_nus.py) | voxel (0.075) | ✗ | ✓ | 7.8 | | 57.34 | 65.23 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/centerpoint/centerpoint_0075voxel_second_secfpn_circlenms_4x8_cyclic_20e_nus/centerpoint_0075voxel_second_secfpn_circlenms_4x8_cyclic_20e_nus_20200925_230905-358fbe3b.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/centerpoint/centerpoint_0075voxel_second_secfpn_circlenms_4x8_cyclic_20e_nus/centerpoint_0075voxel_second_secfpn_circlenms_4x8_cyclic_20e_nus_20200925_230905.log.json) |
+| above w/o circle nms | voxel (0.075) | ✗ | ✗ | | | 57.63 | 65.39 | |
+| [SECFPN](./centerpoint_0075voxel_second_secfpn_dcn_circlenms_4x8_cyclic_20e_nus.py) | voxel (0.075) | ✓ | ✓ | 8.5 | | 57.27 | 65.58 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/centerpoint/centerpoint_0075voxel_second_secfpn_dcn_circlenms_4x8_cyclic_20e_nus/centerpoint_0075voxel_second_secfpn_dcn_circlenms_4x8_cyclic_20e_nus_20200930_201619-67c8496f.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/centerpoint/centerpoint_0075voxel_second_secfpn_dcn_circlenms_4x8_cyclic_20e_nus/centerpoint_0075voxel_second_secfpn_dcn_circlenms_4x8_cyclic_20e_nus_20200930_201619.log.json) |
+| above w/o circle nms | voxel (0.075) | ✓ | ✗ | | | 57.43 | 65.63 | |
+| above w/ double flip | voxel (0.075) | ✓ | ✗ | | | 59.73 | 67.39 | |
+| above w/ scale tta | voxel (0.075) | ✓ | ✗ | | | 60.43 | 67.65 | |
+| above w/ circle nms w/o scale tta | voxel (0.075) | ✓ | ✗ | | | 59.52 | 67.24 | |
+| [SECFPN](./centerpoint_02pillar_second_secfpn_circlenms_4x8_cyclic_20e_nus.py) | pillar (0.2) | ✗ | ✓ | 4.4 | | 49.07 | 59.66 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/centerpoint/centerpoint_02pillar_second_secfpn_circlenms_4x8_cyclic_20e_nus/centerpoint_02pillar_second_secfpn_circlenms_4x8_cyclic_20e_nus_20201004_170716-a134a233.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/centerpoint/centerpoint_02pillar_second_secfpn_circlenms_4x8_cyclic_20e_nus/centerpoint_02pillar_second_secfpn_circlenms_4x8_cyclic_20e_nus_20201004_170716.log.json) |
+| above w/o circle nms | pillar (0.2) | ✗ | ✗ | | | 49.12 | 59.66 | |
+| [SECFPN](./centerpoint_02pillar_second_secfpn_dcn_4x8_cyclic_20e_nus.py) | pillar (0.2) | ✓ | ✗ | 4.6 | | 48.8 | 59.67 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/centerpoint/centerpoint_02pillar_second_secfpn_dcn_4x8_cyclic_20e_nus/centerpoint_02pillar_second_secfpn_dcn_4x8_cyclic_20e_nus_20200930_103722-3bb135f2.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/centerpoint/centerpoint_02pillar_second_secfpn_dcn_4x8_cyclic_20e_nus/centerpoint_02pillar_second_secfpn_dcn_4x8_cyclic_20e_nus_20200930_103722.log.json) |
+| above w/ circle nms | pillar (0.2) | ✓ | ✓ | | | 48.79 | 59.65 | |
## Citation
diff --git a/configs/dgcnn/README.md b/configs/dgcnn/README.md
index 20819f5b25..525543503a 100644
--- a/configs/dgcnn/README.md
+++ b/configs/dgcnn/README.md
@@ -22,22 +22,22 @@ We implement DGCNN and provide the results and checkpoints on S3DIS dataset.
### S3DIS
-| Method | Split | Lr schd | Mem (GB) | Inf time (fps) | mIoU (Val set) | Download |
-| :-------------------------------------------------------------------------: | :----: | :--------: | :------: | :------------: | :------------: | :----------------------: |
-| [DGCNN](./dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class.py) | Area_1 | cosine 100e | 13.1 | | 68.33 | [model](https://download.openmmlab.com/mmdetection3d/v0.17.0_models/dgcnn/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class/area1/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class_20210731_000734-39658f14.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.17.0_models/dgcnn/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class/area1/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class_20210731_000734.log.json) |
-| [DGCNN](./dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class.py) | Area_2 | cosine 100e | 13.1 | | 40.68 | [model](https://download.openmmlab.com/mmdetection3d/v0.17.0_models/dgcnn/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class/area2/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class_20210731_144648-aea9ecb6.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.17.0_models/dgcnn/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class/area2/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class_20210731_144648.log.json) |
-| [DGCNN](./dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class.py) | Area_3 | cosine 100e | 13.1 | | 69.38 | [model](https://download.openmmlab.com/mmdetection3d/v0.17.0_models/dgcnn/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class/area3/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class_20210801_154629-2ff50ee0.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.17.0_models/dgcnn/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class/area3/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class_20210801_154629.log.json) |
-| [DGCNN](./dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class.py) | Area_4 | cosine 100e | 13.1 | | 50.07 | [model](https://download.openmmlab.com/mmdetection3d/v0.17.0_models/dgcnn/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class/area4/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class_20210802_073551-dffab9cd.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.17.0_models/dgcnn/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class/area4/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class_20210802_073551.log.json) |
-| [DGCNN](./dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class.py) | Area_5 | cosine 100e | 13.1 | | 50.59 | [model](https://download.openmmlab.com/mmdetection3d/v0.17.0_models/dgcnn/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class/area5/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class_20210730_235824-f277e0c5.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.17.0_models/dgcnn/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class/area5/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class_20210730_235824.log.json) |
-| [DGCNN](./dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class.py) | Area_6 | cosine 100e | 13.1 | | 77.94 | [model](https://download.openmmlab.com/mmdetection3d/v0.17.0_models/dgcnn/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class/area6/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class_20210802_154317-e3511b32.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.17.0_models/dgcnn/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class/area6/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class_20210802_154317.log.json) |
-| [DGCNN](./dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class.py) | 6-fold | | | | 59.43 | |
+| Method | Split | Lr schd | Mem (GB) | Inf time (fps) | mIoU (Val set) | Download |
+| :-------------------------------------------------------: | :----: | :---------: | :------: | :------------: | :------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [DGCNN](./dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class.py) | Area_1 | cosine 100e | 13.1 | | 68.33 | [model](https://download.openmmlab.com/mmdetection3d/v0.17.0_models/dgcnn/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class/area1/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class_20210731_000734-39658f14.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.17.0_models/dgcnn/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class/area1/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class_20210731_000734.log.json) |
+| [DGCNN](./dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class.py) | Area_2 | cosine 100e | 13.1 | | 40.68 | [model](https://download.openmmlab.com/mmdetection3d/v0.17.0_models/dgcnn/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class/area2/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class_20210731_144648-aea9ecb6.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.17.0_models/dgcnn/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class/area2/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class_20210731_144648.log.json) |
+| [DGCNN](./dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class.py) | Area_3 | cosine 100e | 13.1 | | 69.38 | [model](https://download.openmmlab.com/mmdetection3d/v0.17.0_models/dgcnn/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class/area3/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class_20210801_154629-2ff50ee0.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.17.0_models/dgcnn/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class/area3/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class_20210801_154629.log.json) |
+| [DGCNN](./dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class.py) | Area_4 | cosine 100e | 13.1 | | 50.07 | [model](https://download.openmmlab.com/mmdetection3d/v0.17.0_models/dgcnn/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class/area4/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class_20210802_073551-dffab9cd.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.17.0_models/dgcnn/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class/area4/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class_20210802_073551.log.json) |
+| [DGCNN](./dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class.py) | Area_5 | cosine 100e | 13.1 | | 50.59 | [model](https://download.openmmlab.com/mmdetection3d/v0.17.0_models/dgcnn/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class/area5/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class_20210730_235824-f277e0c5.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.17.0_models/dgcnn/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class/area5/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class_20210730_235824.log.json) |
+| [DGCNN](./dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class.py) | Area_6 | cosine 100e | 13.1 | | 77.94 | [model](https://download.openmmlab.com/mmdetection3d/v0.17.0_models/dgcnn/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class/area6/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class_20210802_154317-e3511b32.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.17.0_models/dgcnn/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class/area6/dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class_20210802_154317.log.json) |
+| [DGCNN](./dgcnn_32x4_cosine_100e_s3dis_seg-3d-13class.py) | 6-fold | | | | 59.43 | |
**Notes:**
-- We use XYZ+Color+Normalized_XYZ as input in all the experiments on S3DIS datasets.
-- `Area_5` Split means training the model on Area_1, 2, 3, 4, 6 and testing on Area_5.
-- `6-fold` Split means the overall result of 6 different splits (Area_1, Area_2, Area_3, Area_4, Area_5 and Area_6 Splits).
-- Users need to modify `train_area` and `test_area` in the S3DIS dataset's [config](./configs/_base_/datasets/s3dis_seg-3d-13class.py) to set the training and testing areas, respectively.
+- We use XYZ+Color+Normalized_XYZ as input in all the experiments on S3DIS datasets.
+- `Area_5` Split means training the model on Area_1, 2, 3, 4, 6 and testing on Area_5.
+- `6-fold` Split means the overall result of 6 different splits (Area_1, Area_2, Area_3, Area_4, Area_5 and Area_6 Splits).
+- Users need to modify `train_area` and `test_area` in the S3DIS dataset's [config](./configs/_base_/datasets/s3dis_seg-3d-13class.py) to set the training and testing areas, respectively.
## Indeterminism
diff --git a/configs/dynamic_voxelization/README.md b/configs/dynamic_voxelization/README.md
index da5fb6cd46..ab2bbc6988 100644
--- a/configs/dynamic_voxelization/README.md
+++ b/configs/dynamic_voxelization/README.md
@@ -20,11 +20,11 @@ We implement Dynamic Voxelization proposed in and provide its results and model
### KITTI
-| Model |Class| Lr schd | Mem (GB) | Inf time (fps) | mAP | Download |
-| :---------: | :-----: |:-----: | :------: | :------------: | :----: | :------: |
-|[SECOND](./dv_second_secfpn_6x8_80e_kitti-3d-car.py)|Car |cyclic 80e|5.5||78.83|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/dynamic_voxelization/dv_second_secfpn_6x8_80e_kitti-3d-car/dv_second_secfpn_6x8_80e_kitti-3d-car_20200620_235228-ac2c1c0c.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/dynamic_voxelization/dv_second_secfpn_6x8_80e_kitti-3d-car/dv_second_secfpn_6x8_80e_kitti-3d-car_20200620_235228.log.json)|
-|[SECOND](./dv_second_secfpn_2x8_cosine_80e_kitti-3d-3class.py)| 3 Class|cosine 80e|5.5||65.27|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/dynamic_voxelization/dv_second_secfpn_2x8_cosine_80e_kitti-3d-3class/dv_second_secfpn_2x8_cosine_80e_kitti-3d-3class_20210831_054106-e742d163.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/dynamic_voxelization/dv_second_secfpn_2x8_cosine_80e_kitti-3d-3class/dv_second_secfpn_2x8_cosine_80e_kitti-3d-3class_20210831_054106.log.json)|
-|[PointPillars](./dv_pointpillars_secfpn_6x8_160e_kitti-3d-car.py)| Car|cyclic 80e|4.7||77.76|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/dynamic_voxelization/dv_pointpillars_secfpn_6x8_160e_kitti-3d-car/dv_pointpillars_secfpn_6x8_160e_kitti-3d-car_20200620_230844-ee7b75c9.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/dynamic_voxelization/dv_pointpillars_secfpn_6x8_160e_kitti-3d-car/dv_pointpillars_secfpn_6x8_160e_kitti-3d-car_20200620_230844.log.json)|
+| Model | Class | Lr schd | Mem (GB) | Inf time (fps) | mAP | Download |
+| :---------------------------------------------------------------: | :-----: | :--------: | :------: | :------------: | :---: | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [SECOND](./dv_second_secfpn_6x8_80e_kitti-3d-car.py) | Car | cyclic 80e | 5.5 | | 78.83 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/dynamic_voxelization/dv_second_secfpn_6x8_80e_kitti-3d-car/dv_second_secfpn_6x8_80e_kitti-3d-car_20200620_235228-ac2c1c0c.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/dynamic_voxelization/dv_second_secfpn_6x8_80e_kitti-3d-car/dv_second_secfpn_6x8_80e_kitti-3d-car_20200620_235228.log.json) |
+| [SECOND](./dv_second_secfpn_2x8_cosine_80e_kitti-3d-3class.py) | 3 Class | cosine 80e | 5.5 | | 65.27 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/dynamic_voxelization/dv_second_secfpn_2x8_cosine_80e_kitti-3d-3class/dv_second_secfpn_2x8_cosine_80e_kitti-3d-3class_20210831_054106-e742d163.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/dynamic_voxelization/dv_second_secfpn_2x8_cosine_80e_kitti-3d-3class/dv_second_secfpn_2x8_cosine_80e_kitti-3d-3class_20210831_054106.log.json) |
+| [PointPillars](./dv_pointpillars_secfpn_6x8_160e_kitti-3d-car.py) | Car | cyclic 80e | 4.7 | | 77.76 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/dynamic_voxelization/dv_pointpillars_secfpn_6x8_160e_kitti-3d-car/dv_pointpillars_secfpn_6x8_160e_kitti-3d-car_20200620_230844-ee7b75c9.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/dynamic_voxelization/dv_pointpillars_secfpn_6x8_160e_kitti-3d-car/dv_pointpillars_secfpn_6x8_160e_kitti-3d-car_20200620_230844.log.json) |
## Citation
diff --git a/configs/fcos3d/README.md b/configs/fcos3d/README.md
index be517ec405..e47a489bc0 100644
--- a/configs/fcos3d/README.md
+++ b/configs/fcos3d/README.md
@@ -50,11 +50,11 @@ We also provide visualization functions to show the monocular 3D detection resul
### NuScenes
-| Backbone | Lr schd | Mem (GB) | Inf time (fps) | mAP | NDS | Download |
-| :---------: | :-----: | :------: | :------------: | :----: |:----: | :------: |
-|[ResNet101 w/ DCN](./fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d.py)|1x|8.69||29.8|37.7|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fcos3d/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d_20210715_235813-4bed5239.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fcos3d/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d_20210715_235813.log.json)|
-|[above w/ finetune](./fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d_finetune.py)|1x|8.69||32.1|39.5|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fcos3d/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d_finetune/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d_finetune_20210717_095645-8d806dc2.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fcos3d/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d_finetune/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d_finetune_20210717_095645.log.json)|
-|above w/ tta|1x|8.69||33.1|40.3||
+| Backbone | Lr schd | Mem (GB) | Inf time (fps) | mAP | NDS | Download |
+| :------------------------------------------------------------------------------------: | :-----: | :------: | :------------: | :--: | :--: | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [ResNet101 w/ DCN](./fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d.py) | 1x | 8.69 | | 29.8 | 37.7 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fcos3d/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d_20210715_235813-4bed5239.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fcos3d/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d_20210715_235813.log.json) |
+| [above w/ finetune](./fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d_finetune.py) | 1x | 8.69 | | 32.1 | 39.5 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fcos3d/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d_finetune/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d_finetune_20210717_095645-8d806dc2.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fcos3d/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d_finetune/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d_finetune_20210717_095645.log.json) |
+| above w/ tta | 1x | 8.69 | | 33.1 | 40.3 | |
## Citation
diff --git a/configs/free_anchor/README.md b/configs/free_anchor/README.md
index 568495fc9a..727a700638 100644
--- a/configs/free_anchor/README.md
+++ b/configs/free_anchor/README.md
@@ -80,16 +80,16 @@ model = dict(
### PointPillars
-| Backbone |FreeAnchor|Lr schd | Mem (GB) | Inf time (fps) | mAP |NDS| Download |
-| :---------: |:-----: |:-----: | :------: | :------------: | :----: |:----: | :------: |
-|[FPN](../pointpillars/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d.py)|✗|2x|17.1||40.0|53.3|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d_20200620_230405-2fa62f3d.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d_20200620_230405.log.json)|
-|[FPN](./hv_pointpillars_fpn_sbn-all_free-anchor_4x8_2x_nus-3d.py)|✓|2x|16.3||43.82|54.86|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/free_anchor/hv_pointpillars_fpn_sbn-all_free-anchor_4x8_2x_nus-3d/hv_pointpillars_fpn_sbn-all_free-anchor_4x8_2x_nus-3d_20210816_163441-ae0897e7.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/free_anchor/hv_pointpillars_fpn_sbn-all_free-anchor_4x8_2x_nus-3d/hv_pointpillars_fpn_sbn-all_free-anchor_4x8_2x_nus-3d_20210816_163441.log.json)|
-|[RegNetX-400MF-FPN](../regnet/hv_pointpillars_regnet-400mf_fpn_sbn-all_4x8_2x_nus-3d.py)|✗|2x|17.3||44.8|56.4|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_fpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_regnet-400mf_fpn_sbn-all_4x8_2x_nus-3d_20200620_230239-c694dce7.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_fpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_regnet-400mf_fpn_sbn-all_4x8_2x_nus-3d_20200620_230239.log.json)|
-|[RegNetX-400MF-FPN](./hv_pointpillars_regnet-400mf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d.py)|✓|2x|17.6||48.3|58.65|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/free_anchor/hv_pointpillars_regnet-400mf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d/hv_pointpillars_regnet-400mf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d_20210827_213939-a2dd3fff.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/free_anchor/hv_pointpillars_regnet-400mf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d/hv_pointpillars_regnet-400mf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d_20210827_213939.log.json)|
-|[RegNetX-1.6GF-FPN](./hv_pointpillars_regnet-1.6gf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d.py)|✓|2x|24.3||52.04|61.49|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/free_anchor/hv_pointpillars_regnet-1.6gf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d/hv_pointpillars_regnet-1.6gf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d_20210828_025608-bfbd506e.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/free_anchor/hv_pointpillars_regnet-1.6gf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d/hv_pointpillars_regnet-1.6gf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d_20210828_025608.log.json)|
-|[RegNetX-1.6GF-FPN](./hv_pointpillars_regnet-1.6gf_fpn_sbn-all_free-anchor_strong-aug_4x8_3x_nus-3d.py)*|✓|3x|24.4||52.69|62.45|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/free_anchor/hv_pointpillars_regnet-1.6gf_fpn_sbn-all_free-anchor_strong-aug_4x8_3x_nus-3d/hv_pointpillars_regnet-1.6gf_fpn_sbn-all_free-anchor_strong-aug_4x8_3x_nus-3d_20210827_184909-14d2dbd1.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/free_anchor/hv_pointpillars_regnet-1.6gf_fpn_sbn-all_free-anchor_strong-aug_4x8_3x_nus-3d/hv_pointpillars_regnet-1.6gf_fpn_sbn-all_free-anchor_strong-aug_4x8_3x_nus-3d_20210827_184909.log.json)|
-|[RegNetX-3.2GF-FPN](./hv_pointpillars_regnet-3.2gf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d.py)|✓|2x|29.4||52.4|61.94|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/free_anchor/hv_pointpillars_regnet-3.2gf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d/hv_pointpillars_regnet-3.2gf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d_20210827_181237-e385c35a.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/free_anchor/hv_pointpillars_regnet-3.2gf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d/hv_pointpillars_regnet-3.2gf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d_20210827_181237.log.json)|
-|[RegNetX-3.2GF-FPN](./hv_pointpillars_regnet-3.2gf_fpn_sbn-all_free-anchor_strong-aug_4x8_3x_nus-3d.py)*|✓|3x|29.2||54.23|63.41|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/free_anchor/hv_pointpillars_regnet-3.2gf_fpn_sbn-all_free-anchor_strong-aug_4x8_3x_nus-3d/hv_pointpillars_regnet-3.2gf_fpn_sbn-all_free-anchor_strong-aug_4x8_3x_nus-3d_20210828_030816-06708918.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/free_anchor/hv_pointpillars_regnet-3.2gf_fpn_sbn-all_free-anchor_strong-aug_4x8_3x_nus-3d/hv_pointpillars_regnet-3.2gf_fpn_sbn-all_free-anchor_strong-aug_4x8_3x_nus-3d_20210828_030816.log.json)|
+| Backbone | FreeAnchor | Lr schd | Mem (GB) | Inf time (fps) | mAP | NDS | Download |
+| :-------------------------------------------------------------------------------------------------------: | :--------: | :-----: | :------: | :------------: | :---: | :---: | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [FPN](../pointpillars/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d.py) | ✗ | 2x | 17.1 | | 40.0 | 53.3 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d_20200620_230405-2fa62f3d.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d_20200620_230405.log.json) |
+| [FPN](./hv_pointpillars_fpn_sbn-all_free-anchor_4x8_2x_nus-3d.py) | ✓ | 2x | 16.3 | | 43.82 | 54.86 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/free_anchor/hv_pointpillars_fpn_sbn-all_free-anchor_4x8_2x_nus-3d/hv_pointpillars_fpn_sbn-all_free-anchor_4x8_2x_nus-3d_20210816_163441-ae0897e7.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/free_anchor/hv_pointpillars_fpn_sbn-all_free-anchor_4x8_2x_nus-3d/hv_pointpillars_fpn_sbn-all_free-anchor_4x8_2x_nus-3d_20210816_163441.log.json) |
+| [RegNetX-400MF-FPN](../regnet/hv_pointpillars_regnet-400mf_fpn_sbn-all_4x8_2x_nus-3d.py) | ✗ | 2x | 17.3 | | 44.8 | 56.4 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_fpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_regnet-400mf_fpn_sbn-all_4x8_2x_nus-3d_20200620_230239-c694dce7.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_fpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_regnet-400mf_fpn_sbn-all_4x8_2x_nus-3d_20200620_230239.log.json) |
+| [RegNetX-400MF-FPN](./hv_pointpillars_regnet-400mf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d.py) | ✓ | 2x | 17.6 | | 48.3 | 58.65 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/free_anchor/hv_pointpillars_regnet-400mf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d/hv_pointpillars_regnet-400mf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d_20210827_213939-a2dd3fff.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/free_anchor/hv_pointpillars_regnet-400mf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d/hv_pointpillars_regnet-400mf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d_20210827_213939.log.json) |
+| [RegNetX-1.6GF-FPN](./hv_pointpillars_regnet-1.6gf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d.py) | ✓ | 2x | 24.3 | | 52.04 | 61.49 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/free_anchor/hv_pointpillars_regnet-1.6gf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d/hv_pointpillars_regnet-1.6gf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d_20210828_025608-bfbd506e.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/free_anchor/hv_pointpillars_regnet-1.6gf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d/hv_pointpillars_regnet-1.6gf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d_20210828_025608.log.json) |
+| [RegNetX-1.6GF-FPN](./hv_pointpillars_regnet-1.6gf_fpn_sbn-all_free-anchor_strong-aug_4x8_3x_nus-3d.py)\* | ✓ | 3x | 24.4 | | 52.69 | 62.45 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/free_anchor/hv_pointpillars_regnet-1.6gf_fpn_sbn-all_free-anchor_strong-aug_4x8_3x_nus-3d/hv_pointpillars_regnet-1.6gf_fpn_sbn-all_free-anchor_strong-aug_4x8_3x_nus-3d_20210827_184909-14d2dbd1.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/free_anchor/hv_pointpillars_regnet-1.6gf_fpn_sbn-all_free-anchor_strong-aug_4x8_3x_nus-3d/hv_pointpillars_regnet-1.6gf_fpn_sbn-all_free-anchor_strong-aug_4x8_3x_nus-3d_20210827_184909.log.json) |
+| [RegNetX-3.2GF-FPN](./hv_pointpillars_regnet-3.2gf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d.py) | ✓ | 2x | 29.4 | | 52.4 | 61.94 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/free_anchor/hv_pointpillars_regnet-3.2gf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d/hv_pointpillars_regnet-3.2gf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d_20210827_181237-e385c35a.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/free_anchor/hv_pointpillars_regnet-3.2gf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d/hv_pointpillars_regnet-3.2gf_fpn_sbn-all_free-anchor_4x8_2x_nus-3d_20210827_181237.log.json) |
+| [RegNetX-3.2GF-FPN](./hv_pointpillars_regnet-3.2gf_fpn_sbn-all_free-anchor_strong-aug_4x8_3x_nus-3d.py)\* | ✓ | 3x | 29.2 | | 54.23 | 63.41 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/free_anchor/hv_pointpillars_regnet-3.2gf_fpn_sbn-all_free-anchor_strong-aug_4x8_3x_nus-3d/hv_pointpillars_regnet-3.2gf_fpn_sbn-all_free-anchor_strong-aug_4x8_3x_nus-3d_20210828_030816-06708918.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/free_anchor/hv_pointpillars_regnet-3.2gf_fpn_sbn-all_free-anchor_strong-aug_4x8_3x_nus-3d/hv_pointpillars_regnet-3.2gf_fpn_sbn-all_free-anchor_strong-aug_4x8_3x_nus-3d_20210828_030816.log.json) |
**Note**: Models marked with `*` are trained with stronger augmentation: vertical flip under bird's-eye view, global translation, and a larger range of global rotation (sketched below).
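For reference, the pipeline fragment below sketches how such augmentation is typically expressed with the standard mmdet3d transforms `RandomFlip3D` and `GlobalRotScaleTrans`; the exact ranges are illustrative assumptions, not values copied from the released `strong-aug` configs.

```python
# Illustrative sketch only: approximate shape of the stronger augmentation
# described above, using standard mmdet3d pipeline transforms. The exact
# numbers are assumptions, not the values used in the released configs.
strong_aug_fragment = [
    dict(
        type='GlobalRotScaleTrans',
        rot_range=[-0.785, 0.785],         # wider global rotation range (illustrative)
        scale_ratio_range=[0.95, 1.05],
        translation_std=[0.5, 0.5, 0.5]),  # global translation noise
    dict(
        type='RandomFlip3D',
        sync_2d=False,
        flip_ratio_bev_horizontal=0.5,
        flip_ratio_bev_vertical=0.5),      # vertical flip under bird's-eye view
]
```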
diff --git a/configs/groupfree3d/README.md b/configs/groupfree3d/README.md
index 5c0a104b09..5b055e7e28 100644
--- a/configs/groupfree3d/README.md
+++ b/configs/groupfree3d/README.md
@@ -20,12 +20,12 @@ We implement Group-Free-3D and provide the result and checkpoints on ScanNet dat
### ScanNet
-| Method | Backbone | Lr schd | Mem (GB) | Inf time (fps) | AP@0.25 |AP@0.5| Download |
-| :------: | :---------: | :-----: | :------: | :------------: | :----: |:----: | :------: |
-| [L6, O256](./groupfree3d_8x4_scannet-3d-18class-L6-O256.py ) | PointNet++ | 3x |6.7||66.32 (65.67*)|47.82 (47.74*)|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/groupfree3d/groupfree3d_8x4_scannet-3d-18class-L6-O256/groupfree3d_8x4_scannet-3d-18class-L6-O256_20210702_145347-3499eb55.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/groupfree3d/groupfree3d_8x4_scannet-3d-18class-L6-O256/groupfree3d_8x4_scannet-3d-18class-L6-O256_20210702_145347.log.json)|
-| [L12, O256](./groupfree3d_8x4_scannet-3d-18class-L12-O256.py ) | PointNet++ | 3x |9.4||66.57 (66.22*)|48.21 (48.95*)|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/groupfree3d/groupfree3d_8x4_scannet-3d-18class-L12-O256/groupfree3d_8x4_scannet-3d-18class-L12-O256_20210702_150907-1c5551ad.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/groupfree3d/groupfree3d_8x4_scannet-3d-18class-L12-O256/groupfree3d_8x4_scannet-3d-18class-L12-O256_20210702_150907.log.json)|
-| [L12, O256](./groupfree3d_8x4_scannet-3d-18class-w2x-L12-O256.py ) | PointNet++w2x | 3x |13.3||68.20 (67.30*)|51.02 (50.44*)|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/groupfree3d/groupfree3d_8x4_scannet-3d-18class-w2x-L12-O256/groupfree3d_8x4_scannet-3d-18class-w2x-L12-O256_20210702_200301-944f0ac0.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/groupfree3d/groupfree3d_8x4_scannet-3d-18class-w2x-L12-O256/groupfree3d_8x4_scannet-3d-18class-w2x-L12-O256_20210702_200301.log.json)|
-| [L12, O512](./groupfree3d_8x4_scannet-3d-18class-w2x-L12-O512.py ) | PointNet++w2x | 3x |18.8||68.22 (68.20*)|52.61 (51.31*)|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/groupfree3d/groupfree3d_8x4_scannet-3d-18class-w2x-L12-O512/groupfree3d_8x4_scannet-3d-18class-w2x-L12-O512_20210702_220204-187b71c7.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/groupfree3d/groupfree3d_8x4_scannet-3d-18class-w2x-L12-O512/groupfree3d_8x4_scannet-3d-18class-w2x-L12-O512_20210702_220204.log.json)|
+| Method | Backbone | Lr schd | Mem (GB) | Inf time (fps) | AP@0.25 | AP@0.5 | Download |
+| :---------------------------------------------------------------: | :-----------: | :-----: | :------: | :------------: | :-------------: | :-------------: | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [L6, O256](./groupfree3d_8x4_scannet-3d-18class-L6-O256.py) | PointNet++ | 3x | 6.7 | | 66.32 (65.67\*) | 47.82 (47.74\*) | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/groupfree3d/groupfree3d_8x4_scannet-3d-18class-L6-O256/groupfree3d_8x4_scannet-3d-18class-L6-O256_20210702_145347-3499eb55.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/groupfree3d/groupfree3d_8x4_scannet-3d-18class-L6-O256/groupfree3d_8x4_scannet-3d-18class-L6-O256_20210702_145347.log.json) |
+| [L12, O256](./groupfree3d_8x4_scannet-3d-18class-L12-O256.py) | PointNet++ | 3x | 9.4 | | 66.57 (66.22\*) | 48.21 (48.95\*) | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/groupfree3d/groupfree3d_8x4_scannet-3d-18class-L12-O256/groupfree3d_8x4_scannet-3d-18class-L12-O256_20210702_150907-1c5551ad.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/groupfree3d/groupfree3d_8x4_scannet-3d-18class-L12-O256/groupfree3d_8x4_scannet-3d-18class-L12-O256_20210702_150907.log.json) |
+| [L12, O256](./groupfree3d_8x4_scannet-3d-18class-w2x-L12-O256.py) | PointNet++w2x | 3x | 13.3 | | 68.20 (67.30\*) | 51.02 (50.44\*) | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/groupfree3d/groupfree3d_8x4_scannet-3d-18class-w2x-L12-O256/groupfree3d_8x4_scannet-3d-18class-w2x-L12-O256_20210702_200301-944f0ac0.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/groupfree3d/groupfree3d_8x4_scannet-3d-18class-w2x-L12-O256/groupfree3d_8x4_scannet-3d-18class-w2x-L12-O256_20210702_200301.log.json) |
+| [L12, O512](./groupfree3d_8x4_scannet-3d-18class-w2x-L12-O512.py) | PointNet++w2x | 3x | 18.8 | | 68.22 (68.20\*) | 52.61 (51.31\*) | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/groupfree3d/groupfree3d_8x4_scannet-3d-18class-w2x-L12-O512/groupfree3d_8x4_scannet-3d-18class-w2x-L12-O512_20210702_220204-187b71c7.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/groupfree3d/groupfree3d_8x4_scannet-3d-18class-w2x-L12-O512/groupfree3d_8x4_scannet-3d-18class-w2x-L12-O512_20210702_220204.log.json) |
**Notes:**
diff --git a/configs/h3dnet/README.md b/configs/h3dnet/README.md
index 86173bfbdf..60cc30f3a2 100644
--- a/configs/h3dnet/README.md
+++ b/configs/h3dnet/README.md
@@ -20,11 +20,11 @@ We implement H3DNet and provide the result and checkpoints on ScanNet datasets.
### ScanNet
-| Backbone | Lr schd | Mem (GB) | Inf time (fps) | AP@0.25 |AP@0.5| Download |
-| :---------: | :-----: | :------: | :------------: | :----: |:----: | :------: |
-| [MultiBackbone](./h3dnet_3x8_scannet-3d-18class.py) | 3x |7.9||66.07|47.68|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/h3dnet/h3dnet_scannet-3d-18class/h3dnet_3x8_scannet-3d-18class_20210824_003149-414bd304.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/h3dnet/h3dnet_scannet-3d-18class/h3dnet_3x8_scannet-3d-18class_20210824_003149.log.json) |
+| Backbone | Lr schd | Mem (GB) | Inf time (fps) | AP@0.25 | AP@0.5 | Download |
+| :-------------------------------------------------: | :-----: | :------: | :------------: | :-----: | :----: | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [MultiBackbone](./h3dnet_3x8_scannet-3d-18class.py) | 3x | 7.9 | | 66.07 | 47.68 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/h3dnet/h3dnet_scannet-3d-18class/h3dnet_3x8_scannet-3d-18class_20210824_003149-414bd304.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/h3dnet/h3dnet_scannet-3d-18class/h3dnet_3x8_scannet-3d-18class_20210824_003149.log.json) |
-**Notice**: If your current mmdetection3d version >= 0.6.0, and you are using the checkpoints downloaded from the above links or using checkpoints trained with mmdetection3d version < 0.6.0, the checkpoints have to be first converted via [tools/model_converters/convert_h3dnet_checkpoints.py](../../tools/model_converters/convert_h3dnet_checkpoints.py):
+**Notice**: If your current mmdetection3d version is >= 0.6.0 and you are using checkpoints that were downloaded from the above links or trained with mmdetection3d version \< 0.6.0, the checkpoints must first be converted via [tools/model_converters/convert_h3dnet_checkpoints.py](../../tools/model_converters/convert_h3dnet_checkpoints.py):
```
python ./tools/model_converters/convert_h3dnet_checkpoints.py ${ORIGINAL_CHECKPOINT_PATH} --out=${NEW_CHECKPOINT_PATH}
diff --git a/configs/imvotenet/README.md b/configs/imvotenet/README.md
index 5e0a821218..a491b9d82d 100644
--- a/configs/imvotenet/README.md
+++ b/configs/imvotenet/README.md
@@ -20,15 +20,15 @@ We implement ImVoteNet and provide the result and checkpoints on SUNRGBD.
### SUNRGBD-2D (Stage 1, image branch pre-train)
-| Backbone | Lr schd | Mem (GB) | Inf time (fps) | AP@0.25 |AP@0.5| Download |
-| :---------: | :-----: | :------: | :------------: | :----: |:----: | :------: |
-| [PointNet++](./imvotenet_faster_rcnn_r50_fpn_2x4_sunrgbd-3d-10class.py) | |2.1| ||62.70|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/imvotenet/imvotenet_faster_rcnn_r50_fpn_2x4_sunrgbd-3d-10class/imvotenet_faster_rcnn_r50_fpn_2x4_sunrgbd-3d-10class_20210819_225618-62eba6ce.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/imvotenet/imvotenet_faster_rcnn_r50_fpn_2x4_sunrgbd-3d-10class/imvotenet_faster_rcnn_r50_fpn_2x4_sunrgbd-3d-10class_20210819_225618.json)|
+| Backbone | Lr schd | Mem (GB) | Inf time (fps) | AP@0.25 | AP@0.5 | Download |
+| :---------------------------------------------------------------------: | :-----: | :------: | :------------: | :-----: | :----: | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [PointNet++](./imvotenet_faster_rcnn_r50_fpn_2x4_sunrgbd-3d-10class.py) | | 2.1 | | | 62.70 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/imvotenet/imvotenet_faster_rcnn_r50_fpn_2x4_sunrgbd-3d-10class/imvotenet_faster_rcnn_r50_fpn_2x4_sunrgbd-3d-10class_20210819_225618-62eba6ce.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/imvotenet/imvotenet_faster_rcnn_r50_fpn_2x4_sunrgbd-3d-10class/imvotenet_faster_rcnn_r50_fpn_2x4_sunrgbd-3d-10class_20210819_225618.json) |
### SUNRGBD-3D (Stage 2)
-| Backbone | Lr schd | Mem (GB) | Inf time (fps) | AP@0.25 |AP@0.5| Download |
-| :---------: | :-----: | :------: | :------------: | :----: |:----: | :------: |
-| [PointNet++](./imvotenet_stage2_16x8_sunrgbd-3d-10class.py) | 3x |9.4| |64.55||[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/imvotenet/imvotenet_stage2_16x8_sunrgbd-3d-10class/imvotenet_stage2_16x8_sunrgbd-3d-10class_20210819_192851-1bcd1b97.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/imvotenet/imvotenet_stage2_16x8_sunrgbd-3d-10class/imvotenet_stage2_16x8_sunrgbd-3d-10class_20210819_192851.log.json)|
+| Backbone | Lr schd | Mem (GB) | Inf time (fps) | AP@0.25 | AP@0.5 | Download |
+| :---------------------------------------------------------: | :-----: | :------: | :------------: | :-----: | :----: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [PointNet++](./imvotenet_stage2_16x8_sunrgbd-3d-10class.py) | 3x | 9.4 | | 64.55 | | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/imvotenet/imvotenet_stage2_16x8_sunrgbd-3d-10class/imvotenet_stage2_16x8_sunrgbd-3d-10class_20210819_192851-1bcd1b97.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/imvotenet/imvotenet_stage2_16x8_sunrgbd-3d-10class/imvotenet_stage2_16x8_sunrgbd-3d-10class_20210819_192851.log.json) |
## Citation
diff --git a/configs/imvoxelnet/README.md b/configs/imvoxelnet/README.md
index 10231413dd..faaddf2942 100644
--- a/configs/imvoxelnet/README.md
+++ b/configs/imvoxelnet/README.md
@@ -22,9 +22,9 @@ Results for SUN RGB-D, ScanNet and nuScenes are currently available in ImVoxelNe
### KITTI
-| Backbone |Class| Lr schd | Mem (GB) | Inf time (fps) | mAP | Download |
-| :---------: | :-----: |:-----: | :------: | :------------: | :----: |:----: |
-| [ResNet-50](./imvoxelnet_kitti-3d-car.py) | Car | 3x | | |17.26|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/imvoxelnet/imvoxelnet_4x8_kitti-3d-car/imvoxelnet_4x8_kitti-3d-car_20210830_003014-3d0ffdf4.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/imvoxelnet/imvoxelnet_4x8_kitti-3d-car/imvoxelnet_4x8_kitti-3d-car_20210830_003014.log.json)|
+| Backbone | Class | Lr schd | Mem (GB) | Inf time (fps) | mAP | Download |
+| :---------------------------------------: | :---: | :-----: | :------: | :------------: | :---: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [ResNet-50](./imvoxelnet_kitti-3d-car.py) | Car | 3x | | | 17.26 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/imvoxelnet/imvoxelnet_4x8_kitti-3d-car/imvoxelnet_4x8_kitti-3d-car_20210830_003014-3d0ffdf4.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/imvoxelnet/imvoxelnet_4x8_kitti-3d-car/imvoxelnet_4x8_kitti-3d-car_20210830_003014.log.json) |
## Citation
diff --git a/configs/monoflex/README.md b/configs/monoflex/README.md
index 938dcd12dd..0f402be24c 100644
--- a/configs/monoflex/README.md
+++ b/configs/monoflex/README.md
@@ -20,17 +20,17 @@ We implement MonoFlex and provide the results and checkpoints on KITTI dataset.
### KITTI
-| Backbone | Lr schd | Mem (GB) | Inf time (fps) | mAP | Download |
-| :---------: | :-----: | :------: | :------------: | :----: | :------: |
-|[DLA34](./monoflex_dla34_pytorch_dlaneck_gn-all_2x4_6x_kitti-mono3d.py)|6x|9.64||21.86|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/monoflex/monoflex_dla34_pytorch_dlaneck_gn-all_2x4_6x_kitti-mono3d_20211228_027553-d46d9bb0.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/monoflex/monoflex_dla34_pytorch_dlaneck_gn-all_2x4_6x_kitti-mono3d_20211228_027553.log.json)
+| Backbone | Lr schd | Mem (GB) | Inf time (fps) | mAP | Download |
+| :---------------------------------------------------------------------: | :-----: | :------: | :------------: | :---: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [DLA34](./monoflex_dla34_pytorch_dlaneck_gn-all_2x4_6x_kitti-mono3d.py) | 6x | 9.64 | | 21.86 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/monoflex/monoflex_dla34_pytorch_dlaneck_gn-all_2x4_6x_kitti-mono3d_20211228_027553-d46d9bb0.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/monoflex/monoflex_dla34_pytorch_dlaneck_gn-all_2x4_6x_kitti-mono3d_20211228_027553.log.json) |
Note: mAP represents Car moderate 3D strict AP11 results.
Detailed performance on KITTI 3D detection (3D/BEV) is as follows, evaluated by the AP11 and AP40 metrics:
-| | Easy | Moderate | Hard |
-|-------------|:-------------:|:--------------:|:-------------:|
-| Car (AP11) | 28.02 / 36.11 | 21.86 / 29.46 | 19.01 / 24.83 |
-| Car (AP40) | 23.22 / 32.74 | 17.18 / 24.02 | 15.13 / 20.67 |
+| | Easy | Moderate | Hard |
+| ---------- | :-----------: | :-----------: | :-----------: |
+| Car (AP11) | 28.02 / 36.11 | 21.86 / 29.46 | 19.01 / 24.83 |
+| Car (AP40) | 23.22 / 32.74 | 17.18 / 24.02 | 15.13 / 20.67 |
Note: mAP represents Car moderate 3D strict AP11 / AP40 results. Because of the limited data for pedestrians and cyclists, the detection performance for these two classes is usually unstable. Therefore, we only list car detection results here. In addition, the AP11 result may fluctuate over a larger range (~1 AP), so AP40 is the recommended reference metric due to its much better stability.
diff --git a/configs/mvxnet/README.md b/configs/mvxnet/README.md
index 02c77203e3..d786efa779 100644
--- a/configs/mvxnet/README.md
+++ b/configs/mvxnet/README.md
@@ -20,9 +20,9 @@ We implement MVX-Net and provide its results and models on KITTI dataset.
### KITTI
-| Backbone |Class| Lr schd | Mem (GB) | Inf time (fps) | mAP | Download |
-| :---------: | :-----: | :------: | :------------: | :----: |:----: | :------: |
-| [SECFPN](./dv_mvx-fpn_second_secfpn_adamw_2x8_80e_kitti-3d-3class.py)|3 Class|cosine 80e|6.7||63.22|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/mvxnet/dv_mvx-fpn_second_secfpn_adamw_2x8_80e_kitti-3d-3class/dv_mvx-fpn_second_secfpn_adamw_2x8_80e_kitti-3d-3class_20210831_060805-83442923.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/mvxnet/dv_mvx-fpn_second_secfpn_adamw_2x8_80e_kitti-3d-3class/dv_mvx-fpn_second_secfpn_adamw_2x8_80e_kitti-3d-3class_20210831_060805.log.json)|
+| Backbone | Class | Lr schd | Mem (GB) | Inf time (fps) | mAP | Download |
+| :-------------------------------------------------------------------: | :-----: | :--------: | :------: | :------------: | :---: | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [SECFPN](./dv_mvx-fpn_second_secfpn_adamw_2x8_80e_kitti-3d-3class.py) | 3 Class | cosine 80e | 6.7 | | 63.22 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/mvxnet/dv_mvx-fpn_second_secfpn_adamw_2x8_80e_kitti-3d-3class/dv_mvx-fpn_second_secfpn_adamw_2x8_80e_kitti-3d-3class_20210831_060805-83442923.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/mvxnet/dv_mvx-fpn_second_secfpn_adamw_2x8_80e_kitti-3d-3class/dv_mvx-fpn_second_secfpn_adamw_2x8_80e_kitti-3d-3class_20210831_060805.log.json) |
## Citation
diff --git a/configs/nuimages/README.md b/configs/nuimages/README.md
index 42267f239d..9106229692 100644
--- a/configs/nuimages/README.md
+++ b/configs/nuimages/README.md
@@ -31,28 +31,29 @@ python -u tools/data_converter/nuimage_converter.py --data-root ${DATA_ROOT} --v
We report Mask R-CNN and Cascade Mask R-CNN results on nuimages.
-|Method | Backbone|Pretraining | Lr schd | Mem (GB) | Box AP | Mask AP |Download |
-| :---------: |:---------: | :---------: | :-----: |:-----: | :------: | :------------: | :----: |
-| Mask R-CNN| [R-50](./mask_rcnn_r50_fpn_1x_nuim.py) |IN|1x|7.4|47.8 |38.4|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_r50_fpn_1x_nuim/mask_rcnn_r50_fpn_1x_nuim_20201008_195238-e99f5182.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_r50_fpn_1x_nuim/mask_rcnn_r50_fpn_1x_nuim_20201008_195238.log.json)|
-| Mask R-CNN| [R-50](./mask_rcnn_r50_fpn_coco-2x_1x_nuim.py) |IN+COCO-2x|1x|7.4|49.7|40.5|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_r50_fpn_coco-2x_1x_nuim/mask_rcnn_r50_fpn_coco-2x_1x_nuim_20201008_195238-b1742a60.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_r50_fpn_coco-2x_1x_nuim/mask_rcnn_r50_fpn_coco-2x_1x_nuim_20201008_195238.log.json)|
-| Mask R-CNN| [R-50-CAFFE](./mask_rcnn_r50_caffe_fpn_1x_nuim.py) |IN|1x|7.0|47.7|38.2|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_r50_caffe_fpn_1x_nuim/) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_r50_caffe_fpn_1x_nuim/)|
-| Mask R-CNN| [R-50-CAFFE](./mask_rcnn_r50_caffe_fpn_coco-3x_1x_nuim.py) |IN+COCO-3x|1x|7.0|49.9|40.8|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_r50_caffe_fpn_coco-3x_1x_nuim/mask_rcnn_r50_caffe_fpn_coco-3x_1x_nuim_20201008_195305-661a992e.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_r50_caffe_fpn_coco-3x_1x_nuim/mask_rcnn_r50_caffe_fpn_coco-3x_1x_nuim_20201008_195305.log.json)|
-| Mask R-CNN| [R-50-CAFFE](./mask_rcnn_r50_caffe_fpn_coco-3x_20e_nuim.py) |IN+COCO-3x|20e|7.0|50.6|41.3|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_r50_caffe_fpn_coco-3x_20e_nuim/mask_rcnn_r50_caffe_fpn_coco-3x_20e_nuim_20201009_125002-5529442c.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_r50_caffe_fpn_coco-3x_20e_nuim/mask_rcnn_r50_caffe_fpn_coco-3x_20e_nuim_20201009_125002.log.json)|
-| Mask R-CNN| [R-101](./mask_rcnn_r101_fpn_1x_nuim.py) |IN|1x|10.9|48.9|39.1|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_r101_fpn_1x_nuim/mask_rcnn_r101_fpn_1x_nuim_20201024_134803-65c7623a.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_r101_fpn_1x_nuim/mask_rcnn_r101_fpn_1x_nuim_20201024_134803.log.json)|
-| Mask R-CNN| [X-101_32x4d](./mask_rcnn_x101_32x4d_fpn_1x_nuim.py) |IN|1x|13.3|50.4|40.5|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_x101_32x4d_fpn_1x_nuim/mask_rcnn_x101_32x4d_fpn_1x_nuim_20201024_135741-b699ab37.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_x101_32x4d_fpn_1x_nuim/mask_rcnn_x101_32x4d_fpn_1x_nuim_20201024_135741.log.json)|
-| Cascade Mask R-CNN| [R-50](./cascade_mask_rcnn_r50_fpn_1x_nuim.py) |IN|1x|8.9|50.8|40.4|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/cascade_mask_rcnn_r50_fpn_1x_nuim/cascade_mask_rcnn_r50_fpn_1x_nuim_20201008_195342-1147c036.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/cascade_mask_rcnn_r50_fpn_1x_nuim/cascade_mask_rcnn_r50_fpn_1x_nuim_20201008_195342.log.json)|
-| Cascade Mask R-CNN| [R-50](./cascade_mask_rcnn_r50_fpn_coco-20e_1x_nuim.py) |IN+COCO-20e|1x|8.9|52.8|42.2|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/cascade_mask_rcnn_r50_fpn_coco-20e_1x_nuim/cascade_mask_rcnn_r50_fpn_coco-20e_1x_nuim_20201009_124158-ad0540e3.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/cascade_mask_rcnn_r50_fpn_coco-20e_1x_nuim/cascade_mask_rcnn_r50_fpn_coco-20e_1x_nuim_20201009_124158.log.json)|
-| Cascade Mask R-CNN| [R-50](./cascade_mask_rcnn_r50_fpn_coco-20e_20e_nuim.py) |IN+COCO-20e|20e|8.9|52.8|42.2|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/cascade_mask_rcnn_r50_fpn_coco-20e_20e_nuim/cascade_mask_rcnn_r50_fpn_coco-20e_20e_nuim_20201009_124951-40963960.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/cascade_mask_rcnn_r50_fpn_coco-20e_20e_nuim/cascade_mask_rcnn_r50_fpn_coco-20e_20e_nuim_20201009_124951.log.json)|
-| Cascade Mask R-CNN| [R-101](./cascade_mask_rcnn_r101_fpn_1x_nuim.py) |IN|1x|12.5|51.5|40.7|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/cascade_mask_rcnn_r101_fpn_1x_nuim/cascade_mask_rcnn_r101_fpn_1x_nuim_20201024_134804-45215b1e.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/cascade_mask_rcnn_r101_fpn_1x_nuim/cascade_mask_rcnn_r101_fpn_1x_nuim_20201024_134804.log.json)|
-| Cascade Mask R-CNN| [X-101_32x4d](./cascade_mask_rcnn_x101_32x4d_fpn_1x_nuim.py) |IN|1x|14.9|52.8|41.6|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/cascade_mask_rcnn_x101_32x4d_fpn_1x_nuim/cascade_mask_rcnn_x101_32x4d_fpn_1x_nuim_20201024_135753-e0e49778.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/cascade_mask_rcnn_x101_32x4d_fpn_1x_nuim/cascade_mask_rcnn_x101_32x4d_fpn_1x_nuim_20201024_135753.log.json)|
-| HTC w/o semantic|[R-50](./htc_without_semantic_r50_fpn_1x_nuim.py) |IN|1x||[model]() | [log]()|
-| HTC|[R-50](./htc_r50_fpn_1x_nuim.py) |IN|1x||[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/)|
-| HTC|[R-50](./htc_r50_fpn_coco-20e_1x_nuim.py) |IN+COCO-20e|1x|11.6|53.8|43.8|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/htc_r50_fpn_coco-20e_1x_nuim/htc_r50_fpn_coco-20e_1x_nuim_20201010_070203-0b53a65e.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/htc_r50_fpn_coco-20e_1x_nuim/htc_r50_fpn_coco-20e_1x_nuim_20201010_070203.log.json)|
-| HTC|[R-50](./htc_r50_fpn_coco-20e_20e_nuim.py) |IN+COCO-20e|20e|11.6|54.8|44.4|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/htc_r50_fpn_coco-20e_20e_nuim/htc_r50_fpn_coco-20e_20e_nuim_20201008_211415-d6c60a2c.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/htc_r50_fpn_coco-20e_20e_nuim/htc_r50_fpn_coco-20e_20e_nuim_20201008_211415.log.json)|
-| HTC|[X-101_64x4d + DCN_c3-c5](./htc_x101_64x4d_fpn_dconv_c3-c5_coco-20e_16x1_20e_nuim.py) |IN+COCO-20e|20e|13.3|57.3|46.4|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/htc_x101_64x4d_fpn_dconv_c3-c5_coco-20e_16x1_20e_nuim/htc_x101_64x4d_fpn_dconv_c3-c5_coco-20e_16x1_20e_nuim_20201008_211222-0b16ac4b.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/htc_x101_64x4d_fpn_dconv_c3-c5_coco-20e_16x1_20e_nuim/htc_x101_64x4d_fpn_dconv_c3-c5_coco-20e_16x1_20e_nuim_20201008_211222.log.json)|
+| Method | Backbone | Pretraining | Lr schd | Mem (GB) | Box AP | Mask AP | Download |
+| :----------------: | :-----------------------------------------------------------------------------------: | :---------: | :-----: | :------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------: | :-----: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| Mask R-CNN | [R-50](./mask_rcnn_r50_fpn_1x_nuim.py) | IN | 1x | 7.4 | 47.8 | 38.4 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_r50_fpn_1x_nuim/mask_rcnn_r50_fpn_1x_nuim_20201008_195238-e99f5182.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_r50_fpn_1x_nuim/mask_rcnn_r50_fpn_1x_nuim_20201008_195238.log.json) |
+| Mask R-CNN | [R-50](./mask_rcnn_r50_fpn_coco-2x_1x_nuim.py) | IN+COCO-2x | 1x | 7.4 | 49.7 | 40.5 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_r50_fpn_coco-2x_1x_nuim/mask_rcnn_r50_fpn_coco-2x_1x_nuim_20201008_195238-b1742a60.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_r50_fpn_coco-2x_1x_nuim/mask_rcnn_r50_fpn_coco-2x_1x_nuim_20201008_195238.log.json) |
+| Mask R-CNN | [R-50-CAFFE](./mask_rcnn_r50_caffe_fpn_1x_nuim.py) | IN | 1x | 7.0 | 47.7 | 38.2 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_r50_caffe_fpn_1x_nuim/) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_r50_caffe_fpn_1x_nuim/) |
+| Mask R-CNN | [R-50-CAFFE](./mask_rcnn_r50_caffe_fpn_coco-3x_1x_nuim.py) | IN+COCO-3x | 1x | 7.0 | 49.9 | 40.8 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_r50_caffe_fpn_coco-3x_1x_nuim/mask_rcnn_r50_caffe_fpn_coco-3x_1x_nuim_20201008_195305-661a992e.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_r50_caffe_fpn_coco-3x_1x_nuim/mask_rcnn_r50_caffe_fpn_coco-3x_1x_nuim_20201008_195305.log.json) |
+| Mask R-CNN | [R-50-CAFFE](./mask_rcnn_r50_caffe_fpn_coco-3x_20e_nuim.py) | IN+COCO-3x | 20e | 7.0 | 50.6 | 41.3 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_r50_caffe_fpn_coco-3x_20e_nuim/mask_rcnn_r50_caffe_fpn_coco-3x_20e_nuim_20201009_125002-5529442c.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_r50_caffe_fpn_coco-3x_20e_nuim/mask_rcnn_r50_caffe_fpn_coco-3x_20e_nuim_20201009_125002.log.json) |
+| Mask R-CNN | [R-101](./mask_rcnn_r101_fpn_1x_nuim.py) | IN | 1x | 10.9 | 48.9 | 39.1 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_r101_fpn_1x_nuim/mask_rcnn_r101_fpn_1x_nuim_20201024_134803-65c7623a.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_r101_fpn_1x_nuim/mask_rcnn_r101_fpn_1x_nuim_20201024_134803.log.json) |
+| Mask R-CNN | [X-101_32x4d](./mask_rcnn_x101_32x4d_fpn_1x_nuim.py) | IN | 1x | 13.3 | 50.4 | 40.5 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_x101_32x4d_fpn_1x_nuim/mask_rcnn_x101_32x4d_fpn_1x_nuim_20201024_135741-b699ab37.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/mask_rcnn_x101_32x4d_fpn_1x_nuim/mask_rcnn_x101_32x4d_fpn_1x_nuim_20201024_135741.log.json) |
+| Cascade Mask R-CNN | [R-50](./cascade_mask_rcnn_r50_fpn_1x_nuim.py) | IN | 1x | 8.9 | 50.8 | 40.4 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/cascade_mask_rcnn_r50_fpn_1x_nuim/cascade_mask_rcnn_r50_fpn_1x_nuim_20201008_195342-1147c036.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/cascade_mask_rcnn_r50_fpn_1x_nuim/cascade_mask_rcnn_r50_fpn_1x_nuim_20201008_195342.log.json) |
+| Cascade Mask R-CNN | [R-50](./cascade_mask_rcnn_r50_fpn_coco-20e_1x_nuim.py) | IN+COCO-20e | 1x | 8.9 | 52.8 | 42.2 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/cascade_mask_rcnn_r50_fpn_coco-20e_1x_nuim/cascade_mask_rcnn_r50_fpn_coco-20e_1x_nuim_20201009_124158-ad0540e3.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/cascade_mask_rcnn_r50_fpn_coco-20e_1x_nuim/cascade_mask_rcnn_r50_fpn_coco-20e_1x_nuim_20201009_124158.log.json) |
+| Cascade Mask R-CNN | [R-50](./cascade_mask_rcnn_r50_fpn_coco-20e_20e_nuim.py) | IN+COCO-20e | 20e | 8.9 | 52.8 | 42.2 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/cascade_mask_rcnn_r50_fpn_coco-20e_20e_nuim/cascade_mask_rcnn_r50_fpn_coco-20e_20e_nuim_20201009_124951-40963960.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/cascade_mask_rcnn_r50_fpn_coco-20e_20e_nuim/cascade_mask_rcnn_r50_fpn_coco-20e_20e_nuim_20201009_124951.log.json) |
+| Cascade Mask R-CNN | [R-101](./cascade_mask_rcnn_r101_fpn_1x_nuim.py) | IN | 1x | 12.5 | 51.5 | 40.7 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/cascade_mask_rcnn_r101_fpn_1x_nuim/cascade_mask_rcnn_r101_fpn_1x_nuim_20201024_134804-45215b1e.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/cascade_mask_rcnn_r101_fpn_1x_nuim/cascade_mask_rcnn_r101_fpn_1x_nuim_20201024_134804.log.json) |
+| Cascade Mask R-CNN | [X-101_32x4d](./cascade_mask_rcnn_x101_32x4d_fpn_1x_nuim.py) | IN | 1x | 14.9 | 52.8 | 41.6 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/cascade_mask_rcnn_x101_32x4d_fpn_1x_nuim/cascade_mask_rcnn_x101_32x4d_fpn_1x_nuim_20201024_135753-e0e49778.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/cascade_mask_rcnn_x101_32x4d_fpn_1x_nuim/cascade_mask_rcnn_x101_32x4d_fpn_1x_nuim_20201024_135753.log.json) |
+| HTC w/o semantic   | [R-50](./htc_without_semantic_r50_fpn_1x_nuim.py)                                      | IN          | 1x      |          |        |         | [model](<>) \| [log](<>) |
+| HTC                | [R-50](./htc_r50_fpn_1x_nuim.py)                                                       | IN          | 1x      |          |        |         | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/) |
+| HTC | [R-50](./htc_r50_fpn_coco-20e_1x_nuim.py) | IN+COCO-20e | 1x | 11.6 | 53.8 | 43.8 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/htc_r50_fpn_coco-20e_1x_nuim/htc_r50_fpn_coco-20e_1x_nuim_20201010_070203-0b53a65e.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/htc_r50_fpn_coco-20e_1x_nuim/htc_r50_fpn_coco-20e_1x_nuim_20201010_070203.log.json) |
+| HTC | [R-50](./htc_r50_fpn_coco-20e_20e_nuim.py) | IN+COCO-20e | 20e | 11.6 | 54.8 | 44.4 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/htc_r50_fpn_coco-20e_20e_nuim/htc_r50_fpn_coco-20e_20e_nuim_20201008_211415-d6c60a2c.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/htc_r50_fpn_coco-20e_20e_nuim/htc_r50_fpn_coco-20e_20e_nuim_20201008_211415.log.json) |
+| HTC | [X-101_64x4d + DCN_c3-c5](./htc_x101_64x4d_fpn_dconv_c3-c5_coco-20e_16x1_20e_nuim.py) | IN+COCO-20e | 20e | 13.3 | 57.3 | 46.4 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/htc_x101_64x4d_fpn_dconv_c3-c5_coco-20e_16x1_20e_nuim/htc_x101_64x4d_fpn_dconv_c3-c5_coco-20e_16x1_20e_nuim_20201008_211222-0b16ac4b.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/nuimages_semseg/htc_x101_64x4d_fpn_dconv_c3-c5_coco-20e_16x1_20e_nuim/htc_x101_64x4d_fpn_dconv_c3-c5_coco-20e_16x1_20e_nuim_20201008_211222.log.json) |
**Note**:
+
1. `IN` means only using an ImageNet pre-trained backbone. `IN+COCO-Nx` and `IN+COCO-Ne` mean the backbone is first pre-trained on ImageNet, and then the detector is pre-trained on the COCO train2017 dataset with `Nx` and `N`-epoch schedules, respectively.
2. All the training hyper-parameters follow the standard schedules on COCO dataset except that the images are resized from
-1280 x 720 to 1920 x 1080 (relative ratio 0.8 to 1.2) since the images are in size 1600 x 900.
+ 1280 x 720 to 1920 x 1080 (relative ratio 0.8 to 1.2) since the images are of size 1600 x 900 (see the resize sketch after this list).
3. The class order in the detectors released in v0.6.0 is different from the order in the configs because of a bug in the conversion script. This bug has been fixed since v0.7.0 and the models trained with the correct class order are also released. If you have used nuImages since v0.6.0, please re-convert the data through the conversion script using the above-mentioned command.
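As a concrete illustration of item 2 above, such a multi-scale resize is normally written with the standard MMDetection `Resize` transform; the snippet below is a sketch under that assumption rather than an excerpt from the released nuImages configs.

```python
# Sketch of the multi-scale resize described in item 2 above (assuming the
# standard MMDetection `Resize` transform; the rest of the pipeline is omitted).
resize_step = dict(
    type='Resize',
    img_scale=[(1280, 720), (1920, 1080)],  # ~0.8x to ~1.2x of the native 1600 x 900
    multiscale_mode='range',                # sample a target scale uniformly in this range
    keep_ratio=True)
```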
diff --git a/configs/paconv/README.md b/configs/paconv/README.md
index 0b2bc7275e..83ab5b0898 100644
--- a/configs/paconv/README.md
+++ b/configs/paconv/README.md
@@ -23,10 +23,10 @@ We implement PAConv and provide the result and checkpoints on S3DIS dataset.
### S3DIS
-| Method | Split | Lr schd | Mem (GB) | Inf time (fps) | mIoU (Val set) | Download |
-| :-------------------------------------------------------------------------: | :----: | :---------: | :------: | :------------: | :------------: | :----------------------: |
-| [PAConv (SSG)](./paconv_ssg_8x8_cosine_150e_s3dis_seg-3d-13class.py) | Area_5 | cosine 150e | 5.8 | | 66.65 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/paconv/paconv_ssg_8x8_cosine_150e_s3dis_seg-3d-13class/paconv_ssg_8x8_cosine_150e_s3dis_seg-3d-13class_20210729_200615-2147b2d1.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/paconv/paconv_ssg_8x8_cosine_150e_s3dis_seg-3d-13class/paconv_ssg_8x8_cosine_150e_s3dis_seg-3d-13class_20210729_200615.log.json) |
-| [PAConv\* (SSG)](./paconv_cuda_ssg_8x8_cosine_200e_s3dis_seg-3d-13class.py) | Area_5 | cosine 200e | 3.8 | | 65.33 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/paconv/paconv_cuda_ssg_8x8_cosine_200e_s3dis_seg-3d-13class/paconv_cuda_ssg_8x8_cosine_200e_s3dis_seg-3d-13class_20210802_171802-e5ea9bb9.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/paconv/paconv_cuda_ssg_8x8_cosine_200e_s3dis_seg-3d-13class/paconv_cuda_ssg_8x8_cosine_200e_s3dis_seg-3d-13class_20210802_171802.log.json) |
+| Method | Split | Lr schd | Mem (GB) | Inf time (fps) | mIoU (Val set) | Download |
+| :-------------------------------------------------------------------------: | :----: | :---------: | :------: | :------------: | :------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [PAConv (SSG)](./paconv_ssg_8x8_cosine_150e_s3dis_seg-3d-13class.py) | Area_5 | cosine 150e | 5.8 | | 66.65 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/paconv/paconv_ssg_8x8_cosine_150e_s3dis_seg-3d-13class/paconv_ssg_8x8_cosine_150e_s3dis_seg-3d-13class_20210729_200615-2147b2d1.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/paconv/paconv_ssg_8x8_cosine_150e_s3dis_seg-3d-13class/paconv_ssg_8x8_cosine_150e_s3dis_seg-3d-13class_20210729_200615.log.json) |
+| [PAConv\* (SSG)](./paconv_cuda_ssg_8x8_cosine_200e_s3dis_seg-3d-13class.py) | Area_5 | cosine 200e | 3.8 | | 65.33 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/paconv/paconv_cuda_ssg_8x8_cosine_200e_s3dis_seg-3d-13class/paconv_cuda_ssg_8x8_cosine_200e_s3dis_seg-3d-13class_20210802_171802-e5ea9bb9.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/paconv/paconv_cuda_ssg_8x8_cosine_200e_s3dis_seg-3d-13class/paconv_cuda_ssg_8x8_cosine_200e_s3dis_seg-3d-13class_20210802_171802.log.json) |
**Notes:**
diff --git a/configs/parta2/README.md b/configs/parta2/README.md
index 816995a33b..b94b8492aa 100644
--- a/configs/parta2/README.md
+++ b/configs/parta2/README.md
@@ -20,10 +20,10 @@ We implement Part-A^2 and provide its results and checkpoints on KITTI dataset.
### KITTI
-| Backbone |Class| Lr schd | Mem (GB) | Inf time (fps) | mAP | Download |
-| :---------: | :-----: |:-----: | :------: | :------------: | :----: |:----: |
-| [SECFPN](./hv_PartA2_secfpn_2x8_cyclic_80e_kitti-3d-3class.py) |3 Class|cyclic 80e|4.1||68.33|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/parta2/hv_PartA2_secfpn_2x8_cyclic_80e_kitti-3d-3class/hv_PartA2_secfpn_2x8_cyclic_80e_kitti-3d-3class_20210831_022017-454a5344.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/parta2/hv_PartA2_secfpn_2x8_cyclic_80e_kitti-3d-3class/hv_PartA2_secfpn_2x8_cyclic_80e_kitti-3d-3class_20210831_022017.log.json)|
-| [SECFPN](./hv_PartA2_secfpn_2x8_cyclic_80e_kitti-3d-car.py) |Car |cyclic 80e|4.0||79.08|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/parta2/hv_PartA2_secfpn_2x8_cyclic_80e_kitti-3d-car/hv_PartA2_secfpn_2x8_cyclic_80e_kitti-3d-car_20210831_022017-cb7ff621.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/parta2/hv_PartA2_secfpn_2x8_cyclic_80e_kitti-3d-car/hv_PartA2_secfpn_2x8_cyclic_80e_kitti-3d-car_20210831_022017.log.json)|
+| Backbone | Class | Lr schd | Mem (GB) | Inf time (fps) | mAP | Download |
+| :------------------------------------------------------------: | :-----: | :--------: | :------: | :------------: | :---: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [SECFPN](./hv_PartA2_secfpn_2x8_cyclic_80e_kitti-3d-3class.py) | 3 Class | cyclic 80e | 4.1 | | 68.33 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/parta2/hv_PartA2_secfpn_2x8_cyclic_80e_kitti-3d-3class/hv_PartA2_secfpn_2x8_cyclic_80e_kitti-3d-3class_20210831_022017-454a5344.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/parta2/hv_PartA2_secfpn_2x8_cyclic_80e_kitti-3d-3class/hv_PartA2_secfpn_2x8_cyclic_80e_kitti-3d-3class_20210831_022017.log.json) |
+| [SECFPN](./hv_PartA2_secfpn_2x8_cyclic_80e_kitti-3d-car.py) | Car | cyclic 80e | 4.0 | | 79.08 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/parta2/hv_PartA2_secfpn_2x8_cyclic_80e_kitti-3d-car/hv_PartA2_secfpn_2x8_cyclic_80e_kitti-3d-car_20210831_022017-cb7ff621.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/parta2/hv_PartA2_secfpn_2x8_cyclic_80e_kitti-3d-car/hv_PartA2_secfpn_2x8_cyclic_80e_kitti-3d-car_20210831_022017.log.json) |
## Citation
diff --git a/configs/pgd/README.md b/configs/pgd/README.md
index 02e5aa5719..f805f53d27 100644
--- a/configs/pgd/README.md
+++ b/configs/pgd/README.md
@@ -26,29 +26,29 @@ A more extensive study based on FCOS3D and PGD is on-going. Please stay tuned.
### KITTI
-| Backbone | Lr schd | Mem (GB) | Inf time (fps) | mAP_11 / mAP_40 | Download |
-| :---------: | :-----: | :------: | :------------: | :----: | :------: |
-|[ResNet101](./pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d.py)|4x|9.07||18.33 / 13.23|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d/pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d_20211022_102608-8a97533b.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d/pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d_20211022_102608.log.json)|
+| Backbone | Lr schd | Mem (GB) | Inf time (fps) | mAP_11 / mAP_40 | Download |
+| :--------------------------------------------------------------: | :-----: | :------: | :------------: | :-------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [ResNet101](./pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d.py) | 4x | 9.07 | | 18.33 / 13.23 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d/pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d_20211022_102608-8a97533b.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d/pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d_20211022_102608.log.json) |
Detailed performance on KITTI 3D detection (3D/BEV) is as follows, evaluated by the AP11 and AP40 metrics:
-| | Easy | Moderate | Hard |
-|-------------|:-------------:|:--------------:|:-------------:|
-| Car (AP11) | 24.09 / 30.11 | 18.33 / 23.46 | 16.90 / 19.33 |
-| Car (AP40) | 19.27 / 26.60 | 13.23 / 18.23 | 10.65 / 15.00 |
+| | Easy | Moderate | Hard |
+| ---------- | :-----------: | :-----------: | :-----------: |
+| Car (AP11) | 24.09 / 30.11 | 18.33 / 23.46 | 16.90 / 19.33 |
+| Car (AP40) | 19.27 / 26.60 | 13.23 / 18.23 | 10.65 / 15.00 |
Note: mAP represents Car moderate 3D strict AP11 / AP40 results. Because of the limited data for pedestrians and cyclists, the detection performance for these two classes is usually unstable. Therefore, we only list car detection results here. In addition, AP40 is the recommended reference metric due to its much better stability.
### NuScenes
-| Backbone | Lr schd | Mem (GB) | mAP | NDS | Download |
-| :---------: | :-----: | :------: | :----: |:----: | :------: |
-|[ResNet101 w/ DCN](./pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d.py)|1x|9.20|31.7|39.3|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_20211116_195350-f4b5eec2.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_20211116_195350.log.json)|
-|[above w/ finetune](./pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_finetune.py)|1x|9.20|34.6|41.1|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_finetune/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_finetune_20211118_093245-fd419681.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_finetune/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_finetune_20211118_093245.log.json)|
-|above w/ tta|1x|9.20|35.5|41.8||
-|[ResNet101 w/ DCN](./pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d.py)|2x|9.20|33.6|40.9|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_20211112_125314-cb677266.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_20211112_125314.log.json)|
-|[above w/ finetune](./pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_finetune.py)|2x|9.20|35.8|42.5|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_finetune/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_finetune_20211114_162135-5ec7c1cd.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_finetune/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_finetune_20211114_162135.log.json)|
-|above w/ tta|2x|9.20|36.8|43.1||
+| Backbone | Lr schd | Mem (GB) | mAP | NDS | Download |
+| :------------------------------------------------------------------------------: | :-----: | :------: | :--: | :--: | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [ResNet101 w/ DCN](./pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d.py) | 1x | 9.20 | 31.7 | 39.3 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_20211116_195350-f4b5eec2.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_20211116_195350.log.json) |
+| [above w/ finetune](./pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_finetune.py) | 1x | 9.20 | 34.6 | 41.1 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_finetune/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_finetune_20211118_093245-fd419681.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_finetune/pgd_r101_caffe_fpn_gn-head_2x16_1x_nus-mono3d_finetune_20211118_093245.log.json) |
+| above w/ tta | 1x | 9.20 | 35.5 | 41.8 | |
+| [ResNet101 w/ DCN](./pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d.py) | 2x | 9.20 | 33.6 | 40.9 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_20211112_125314-cb677266.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_20211112_125314.log.json) |
+| [above w/ finetune](./pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_finetune.py) | 2x | 9.20 | 35.8 | 42.5 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_finetune/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_finetune_20211114_162135-5ec7c1cd.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_finetune/pgd_r101_caffe_fpn_gn-head_2x16_2x_nus-mono3d_finetune_20211114_162135.log.json) |
+| above w/ tta | 2x | 9.20 | 36.8 | 43.1 | |
## Citation
diff --git a/configs/point_rcnn/README.md b/configs/point_rcnn/README.md
index cf59e3f6c5..eddbdc726f 100644
--- a/configs/point_rcnn/README.md
+++ b/configs/point_rcnn/README.md
@@ -20,19 +20,19 @@ We implement PointRCNN and provide the result with checkpoints on KITTI dataset.
### KITTI
-| Backbone |Class| Lr schd | Mem (GB) | Inf time (fps) | mAP | Download |
-| :---------: | :-----: |:-----: | :------: | :------------: | :----: |:----: |
-| [PointNet++](./point_rcnn_2x8_kitti-3d-3classes.py) |3 Class|cyclic 40e|4.6||70.83|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/point_rcnn/point_rcnn_2x8_kitti-3d-3classes_20211208_151344.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/point_rcnn/point_rcnn_2x8_kitti-3d-3classes_20211208_151344.log.json)|
+| Backbone | Class | Lr schd | Mem (GB) | Inf time (fps) | mAP | Download |
+| :-------------------------------------------------: | :-----: | :--------: | :------: | :------------: | :---: | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [PointNet++](./point_rcnn_2x8_kitti-3d-3classes.py) | 3 Class | cyclic 40e | 4.6 | | 70.83 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/point_rcnn/point_rcnn_2x8_kitti-3d-3classes_20211208_151344.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/point_rcnn/point_rcnn_2x8_kitti-3d-3classes_20211208_151344.log.json) |
Note: mAP represents AP11 results on 3 Class under the moderate setting.
Detailed performance on KITTI 3D detection (3D) is as follows, evaluated by the AP11 metric:
-| | Easy | Moderate | Hard |
-|-------------|:-------------:|:--------------:|:------------:|
-| Car | 89.13 | 78.72 | 78.24 |
-| Pedestrian | 65.81 | 59.57 | 52.75 |
-| Cyclist | 93.51 | 74.19 | 70.73 |
+| | Easy | Moderate | Hard |
+| ---------- | :---: | :------: | :---: |
+| Car | 89.13 | 78.72 | 78.24 |
+| Pedestrian | 65.81 | 59.57 | 52.75 |
+| Cyclist | 93.51 | 74.19 | 70.73 |
## Citation
diff --git a/configs/pointnet2/README.md b/configs/pointnet2/README.md
index e91c23f013..c9204eb1ba 100644
--- a/configs/pointnet2/README.md
+++ b/configs/pointnet2/README.md
@@ -22,31 +22,33 @@ We implement PointNet++ and provide the result and checkpoints on ScanNet and S3
### ScanNet
-| Method | Input | Lr schd | Mem (GB) | Inf time (fps) | mIoU (Val set) | mIoU (Test set) | Download |
-| :-------------------------------------------------------------------------------------: | :-------: | :---------: | :------: | :------------: | :------------: | :-------------: | ------------------------ |
-| [PointNet++ (SSG)](./pointnet2_ssg_xyz-only_16x2_cosine_200e_scannet_seg-3d-20class.py) | XYZ | cosine 200e | 1.9 | | 53.91 | | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointnet2/pointnet2_ssg_xyz-only_16x2_cosine_200e_scannet_seg-3d-20class/pointnet2_ssg_xyz-only_16x2_cosine_200e_scannet_seg-3d-20class_20210514_143628-4e341a48.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointnet2/pointnet2_ssg_xyz-only_16x2_cosine_200e_scannet_seg-3d-20class/pointnet2_ssg_xyz-only_16x2_cosine_200e_scannet_seg-3d-20class_20210514_143628.log.json) |
-| [PointNet++ (SSG)](./pointnet2_ssg_16x2_cosine_200e_scannet_seg-3d-20class.py) | XYZ+Color | cosine 200e | 1.9 | | 54.44 | | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointnet2/pointnet2_ssg_16x2_cosine_200e_scannet_seg-3d-20class/pointnet2_ssg_16x2_cosine_200e_scannet_seg-3d-20class_20210514_143644-ee73704a.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointnet2/pointnet2_ssg_16x2_cosine_200e_scannet_seg-3d-20class/pointnet2_ssg_16x2_cosine_200e_scannet_seg-3d-20class_20210514_143644.log.json) |
-| [PointNet++ (MSG)](./pointnet2_msg_xyz-only_16x2_cosine_250e_scannet_seg-3d-20class.py) | XYZ | cosine 250e | 2.4 | | 54.26 | | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointnet2/pointnet2_msg_xyz-only_16x2_cosine_250e_scannet_seg-3d-20class/pointnet2_msg_xyz-only_16x2_cosine_250e_scannet_seg-3d-20class_20210514_143838-b4a3cf89.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointnet2/pointnet2_msg_xyz-only_16x2_cosine_250e_scannet_seg-3d-20class/pointnet2_msg_xyz-only_16x2_cosine_250e_scannet_seg-3d-20class_20210514_143838.log.json) |
-| [PointNet++ (MSG)](./pointnet2_msg_16x2_cosine_250e_scannet_seg-3d-20class.py) | XYZ+Color | cosine 250e | 2.4 | | 55.05 | | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointnet2/pointnet2_msg_16x2_cosine_250e_scannet_seg-3d-20class/pointnet2_msg_16x2_cosine_250e_scannet_seg-3d-20class_20210514_144009-24477ab1.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointnet2/pointnet2_msg_16x2_cosine_250e_scannet_seg-3d-20class/pointnet2_msg_16x2_cosine_250e_scannet_seg-3d-20class_20210514_144009.log.json) |
+| Method | Input | Lr schd | Mem (GB) | Inf time (fps) | mIoU (Val set) | mIoU (Test set) | Download |
+| :-------------------------------------------------------------------------------------: | :-------: | :---------: | :------: | :------------: | :------------: | :-------------: | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| [PointNet++ (SSG)](./pointnet2_ssg_xyz-only_16x2_cosine_200e_scannet_seg-3d-20class.py) | XYZ | cosine 200e | 1.9 | | 53.91 | | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointnet2/pointnet2_ssg_xyz-only_16x2_cosine_200e_scannet_seg-3d-20class/pointnet2_ssg_xyz-only_16x2_cosine_200e_scannet_seg-3d-20class_20210514_143628-4e341a48.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointnet2/pointnet2_ssg_xyz-only_16x2_cosine_200e_scannet_seg-3d-20class/pointnet2_ssg_xyz-only_16x2_cosine_200e_scannet_seg-3d-20class_20210514_143628.log.json) |
+| [PointNet++ (SSG)](./pointnet2_ssg_16x2_cosine_200e_scannet_seg-3d-20class.py) | XYZ+Color | cosine 200e | 1.9 | | 54.44 | | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointnet2/pointnet2_ssg_16x2_cosine_200e_scannet_seg-3d-20class/pointnet2_ssg_16x2_cosine_200e_scannet_seg-3d-20class_20210514_143644-ee73704a.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointnet2/pointnet2_ssg_16x2_cosine_200e_scannet_seg-3d-20class/pointnet2_ssg_16x2_cosine_200e_scannet_seg-3d-20class_20210514_143644.log.json) |
+| [PointNet++ (MSG)](./pointnet2_msg_xyz-only_16x2_cosine_250e_scannet_seg-3d-20class.py) | XYZ | cosine 250e | 2.4 | | 54.26 | | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointnet2/pointnet2_msg_xyz-only_16x2_cosine_250e_scannet_seg-3d-20class/pointnet2_msg_xyz-only_16x2_cosine_250e_scannet_seg-3d-20class_20210514_143838-b4a3cf89.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointnet2/pointnet2_msg_xyz-only_16x2_cosine_250e_scannet_seg-3d-20class/pointnet2_msg_xyz-only_16x2_cosine_250e_scannet_seg-3d-20class_20210514_143838.log.json) |
+| [PointNet++ (MSG)](./pointnet2_msg_16x2_cosine_250e_scannet_seg-3d-20class.py) | XYZ+Color | cosine 250e | 2.4 | | 55.05 | | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointnet2/pointnet2_msg_16x2_cosine_250e_scannet_seg-3d-20class/pointnet2_msg_16x2_cosine_250e_scannet_seg-3d-20class_20210514_144009-24477ab1.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointnet2/pointnet2_msg_16x2_cosine_250e_scannet_seg-3d-20class/pointnet2_msg_16x2_cosine_250e_scannet_seg-3d-20class_20210514_144009.log.json) |
**Notes:**
-- The original PointNet++ paper conducted experiments on the ScanNet V1 dataset, while later point cloud segmentor papers often used ScanNet V2. Following common practice, we report results on the ScanNet V2 dataset.
-- Since ScanNet dataset doesn't provide ground-truth labels for the test set, users can only evaluate test set performance by submitting to its online benchmark [website](http://kaldir.vc.in.tum.de/scannet_benchmark/). However, users are only allowed to submit once every two weeks. Therefore, we currently report val set mIoU. Test set performance may be added in the future.
-- To generate submission file for ScanNet online benchmark, you need to modify the ScanNet dataset's [config](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/_base_/datasets/scannet_seg-3d-20class.py#L126). Change `ann_file=data_root + 'scannet_infos_val.pkl'` to `ann_file=data_root + 'scannet_infos_test.pkl'`, and then simply run:
+- The original PointNet++ paper conducted experiments on the ScanNet V1 dataset, while later point cloud segmentor papers often used ScanNet V2. Following common practice, we report results on the ScanNet V2 dataset.
- ```shell
- python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --format-only --options 'txt_prefix=exps/pointnet2_scannet_results'
- ```
+- Since the ScanNet dataset doesn't provide ground-truth labels for the test set, users can only evaluate test set performance by submitting to its online benchmark [website](http://kaldir.vc.in.tum.de/scannet_benchmark/). However, submissions are only allowed once every two weeks, so we currently report val set mIoU. Test set performance may be added in the future.
- This will save the prediction results as `txt` files in `exps/pointnet2_scannet_results/`. Then, go to this folder and zip all files into `pn2_scannet.zip`. Now you can submit it to the online benchmark and wait for the test set result. More instructions can be found at their official [website](http://kaldir.vc.in.tum.de/scannet_benchmark/documentation#submission-policy).
+- To generate a submission file for the ScanNet online benchmark, you need to modify the ScanNet dataset's [config](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/_base_/datasets/scannet_seg-3d-20class.py#L126). Change `ann_file=data_root + 'scannet_infos_val.pkl'` to `ann_file=data_root + 'scannet_infos_test.pkl'`, and then simply run:
+
+ ```shell
+ python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --format-only --options 'txt_prefix=exps/pointnet2_scannet_results'
+ ```
+
+ This will save the prediction results as `txt` files in `exps/pointnet2_scannet_results/`. Then, go to this folder and zip all files into `pn2_scannet.zip`. Now you can submit it to the online benchmark and wait for the test set result. More instructions can be found at their official [website](http://kaldir.vc.in.tum.de/scannet_benchmark/documentation#submission-policy).
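+
+  For reference, the zipping step described above can also be scripted. The snippet below is only an illustrative sketch: the folder name follows the `txt_prefix` used in the command above, and the archive name matches the one mentioned in this note.
+
+  ```python
+  # Illustrative helper for the zipping step above (not part of the mmdetection3d tools).
+  # The folder name follows the txt_prefix used earlier; adjust it if you changed that value.
+  import shutil
+
+  results_dir = 'exps/pointnet2_scannet_results'
+  # Creates pn2_scannet.zip with the per-scene txt files at the archive root.
+  shutil.make_archive('pn2_scannet', 'zip', root_dir=results_dir)
+  ```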
### S3DIS
-| Method | Split | Lr schd | Mem (GB) | Inf time (fps) | mIoU (Val set) | Download |
-| :-------------------------------------------------------------------------: | :----: | :--------: | :------: | :------------: | :------------: | :----------------------: |
-| [PointNet++ (SSG)](./pointnet2_ssg_16x2_cosine_50e_s3dis_seg-3d-13class.py) | Area_5 | cosine 50e | 3.6 | | 56.93 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointnet2/pointnet2_ssg_16x2_cosine_50e_s3dis_seg-3d-13class/pointnet2_ssg_16x2_cosine_50e_s3dis_seg-3d-13class_20210514_144205-995d0119.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointnet2/pointnet2_ssg_16x2_cosine_50e_s3dis_seg-3d-13class/pointnet2_ssg_16x2_cosine_50e_s3dis_seg-3d-13class_20210514_144205.log.json) |
-| [PointNet++ (MSG)](./pointnet2_msg_16x2_cosine_80e_s3dis_seg-3d-13class.py) | Area_5 | cosine 80e | 3.6 | | 58.04 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointnet2/pointnet2_msg_16x2_cosine_80e_s3dis_seg-3d-13class/pointnet2_msg_16x2_cosine_80e_s3dis_seg-3d-13class_20210514_144307-b2059817.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointnet2/pointnet2_msg_16x2_cosine_80e_s3dis_seg-3d-13class/pointnet2_msg_16x2_cosine_80e_s3dis_seg-3d-13class_20210514_144307.log.json) |
+| Method | Split | Lr schd | Mem (GB) | Inf time (fps) | mIoU (Val set) | Download |
+| :-------------------------------------------------------------------------: | :----: | :--------: | :------: | :------------: | :------------: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [PointNet++ (SSG)](./pointnet2_ssg_16x2_cosine_50e_s3dis_seg-3d-13class.py) | Area_5 | cosine 50e | 3.6 | | 56.93 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointnet2/pointnet2_ssg_16x2_cosine_50e_s3dis_seg-3d-13class/pointnet2_ssg_16x2_cosine_50e_s3dis_seg-3d-13class_20210514_144205-995d0119.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointnet2/pointnet2_ssg_16x2_cosine_50e_s3dis_seg-3d-13class/pointnet2_ssg_16x2_cosine_50e_s3dis_seg-3d-13class_20210514_144205.log.json) |
+| [PointNet++ (MSG)](./pointnet2_msg_16x2_cosine_80e_s3dis_seg-3d-13class.py) | Area_5 | cosine 80e | 3.6 | | 58.04 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointnet2/pointnet2_msg_16x2_cosine_80e_s3dis_seg-3d-13class/pointnet2_msg_16x2_cosine_80e_s3dis_seg-3d-13class_20210514_144307-b2059817.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointnet2/pointnet2_msg_16x2_cosine_80e_s3dis_seg-3d-13class/pointnet2_msg_16x2_cosine_80e_s3dis_seg-3d-13class_20210514_144307.log.json) |
**Notes:**
diff --git a/configs/pointpillars/README.md b/configs/pointpillars/README.md
index 1c9d3a83b0..62090972fe 100644
--- a/configs/pointpillars/README.md
+++ b/configs/pointpillars/README.md
@@ -20,48 +20,48 @@ We implement PointPillars and provide the results and checkpoints on KITTI, nuSc
### KITTI
-| Backbone|Class | Lr schd | Mem (GB) | Inf time (fps) | AP |Download |
-| :---------: | :-----: |:-----: | :------: | :------------: | :----: | :------: |
-| [SECFPN](./hv_pointpillars_secfpn_6x8_160e_kitti-3d-car.py)|Car|cyclic 160e|5.4||77.6|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pointpillars/hv_pointpillars_secfpn_6x8_160e_kitti-3d-car/hv_pointpillars_secfpn_6x8_160e_kitti-3d-car_20220331_134606-d42d15ed.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pointpillars/hv_pointpillars_secfpn_6x8_160e_kitti-3d-car/hv_pointpillars_secfpn_6x8_160e_kitti-3d-car_20220331_134606.log.json)|
-| [SECFPN](./hv_pointpillars_secfpn_6x8_160e_kitti-3d-3class.py)|3 Class|cyclic 160e|5.5||64.07|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pointpillars/hv_pointpillars_secfpn_6x8_160e_kitti-3d-3class/hv_pointpillars_secfpn_6x8_160e_kitti-3d-3class_20220301_150306-37dc2420.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pointpillars/hv_pointpillars_secfpn_6x8_160e_kitti-3d-3class/hv_pointpillars_secfpn_6x8_160e_kitti-3d-3class_20220301_150306.log.json)|
+| Backbone | Class | Lr schd | Mem (GB) | Inf time (fps) | AP | Download |
+| :------------------------------------------------------------: | :-----: | :---------: | :------: | :------------: | :---: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [SECFPN](./hv_pointpillars_secfpn_6x8_160e_kitti-3d-car.py) | Car | cyclic 160e | 5.4 | | 77.6 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pointpillars/hv_pointpillars_secfpn_6x8_160e_kitti-3d-car/hv_pointpillars_secfpn_6x8_160e_kitti-3d-car_20220331_134606-d42d15ed.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pointpillars/hv_pointpillars_secfpn_6x8_160e_kitti-3d-car/hv_pointpillars_secfpn_6x8_160e_kitti-3d-car_20220331_134606.log.json) |
+| [SECFPN](./hv_pointpillars_secfpn_6x8_160e_kitti-3d-3class.py) | 3 Class | cyclic 160e | 5.5 | | 64.07 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pointpillars/hv_pointpillars_secfpn_6x8_160e_kitti-3d-3class/hv_pointpillars_secfpn_6x8_160e_kitti-3d-3class_20220301_150306-37dc2420.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pointpillars/hv_pointpillars_secfpn_6x8_160e_kitti-3d-3class/hv_pointpillars_secfpn_6x8_160e_kitti-3d-3class_20220301_150306.log.json) |
### nuScenes
-| Backbone | Lr schd | Mem (GB) | Inf time (fps) | mAP |NDS| Download |
-| :---------: | :-----: | :------: | :------------: | :----: |:----: | :------: |
-|[SECFPN](./hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d.py)|2x|16.4||34.33|49.1|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d_20210826_225857-f19d00a3.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d_20210826_225857.log.json)|
-|[SECFPN (FP16)](./hv_pointpillars_secfpn_sbn-all_fp16_2x8_2x_nus-3d.py)|2x|8.37||35.19|50.27|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fp16/hv_pointpillars_secfpn_sbn-all_fp16_2x8_2x_nus-3d/hv_pointpillars_secfpn_sbn-all_fp16_2x8_2x_nus-3d_20201020_222626-c3f0483e.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fp16/hv_pointpillars_secfpn_sbn-all_fp16_2x8_2x_nus-3d/hv_pointpillars_secfpn_sbn-all_fp16_2x8_2x_nus-3d_20201020_222626.log.json)|
-|[FPN](./hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d.py)|2x|16.3||39.7|53.2|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pointpillars/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d_20210826_104936-fca299c1.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pointpillars/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d_20210826_104936.log.json)|
-|[FPN (FP16)](./hv_pointpillars_fpn_sbn-all_fp16_2x8_2x_nus-3d.py)|2x|8.40||39.26|53.26|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fp16/hv_pointpillars_fpn_sbn-all_fp16_2x8_2x_nus-3d/hv_pointpillars_fpn_sbn-all_fp16_2x8_2x_nus-3d_20201021_120719-269f9dd6.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fp16/hv_pointpillars_fpn_sbn-all_fp16_2x8_2x_nus-3d/hv_pointpillars_fpn_sbn-all_fp16_2x8_2x_nus-3d_20201021_120719.log.json)|
+| Backbone | Lr schd | Mem (GB) | Inf time (fps) | mAP | NDS | Download |
+| :---------------------------------------------------------------------: | :-----: | :------: | :------------: | :---: | :---: | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [SECFPN](./hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d.py) | 2x | 16.4 | | 34.33 | 49.1 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d_20210826_225857-f19d00a3.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d_20210826_225857.log.json) |
+| [SECFPN (FP16)](./hv_pointpillars_secfpn_sbn-all_fp16_2x8_2x_nus-3d.py) | 2x | 8.37 | | 35.19 | 50.27 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fp16/hv_pointpillars_secfpn_sbn-all_fp16_2x8_2x_nus-3d/hv_pointpillars_secfpn_sbn-all_fp16_2x8_2x_nus-3d_20201020_222626-c3f0483e.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fp16/hv_pointpillars_secfpn_sbn-all_fp16_2x8_2x_nus-3d/hv_pointpillars_secfpn_sbn-all_fp16_2x8_2x_nus-3d_20201020_222626.log.json) |
+| [FPN](./hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d.py) | 2x | 16.3 | | 39.7 | 53.2 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pointpillars/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d_20210826_104936-fca299c1.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pointpillars/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d_20210826_104936.log.json) |
+| [FPN (FP16)](./hv_pointpillars_fpn_sbn-all_fp16_2x8_2x_nus-3d.py) | 2x | 8.40 | | 39.26 | 53.26 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fp16/hv_pointpillars_fpn_sbn-all_fp16_2x8_2x_nus-3d/hv_pointpillars_fpn_sbn-all_fp16_2x8_2x_nus-3d_20201021_120719-269f9dd6.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fp16/hv_pointpillars_fpn_sbn-all_fp16_2x8_2x_nus-3d/hv_pointpillars_fpn_sbn-all_fp16_2x8_2x_nus-3d_20201021_120719.log.json) |
### Lyft
-| Backbone | Lr schd | Mem (GB) | Inf time (fps) | Private Score | Public Score | Download |
-| :---------: | :-----: | :------: | :------------: | :----: |:----: | :------: |
-|[SECFPN](./hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d.py)|2x|12.2||13.8|14.1|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d_20210829_100455-82b81c39.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d_20210829_100455.log.json)|
-|[FPN](./hv_pointpillars_fpn_sbn-all_2x8_2x_lyft-3d.py)|2x|9.2||14.8|15.0|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pointpillars/hv_pointpillars_fpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_fpn_sbn-all_2x8_2x_lyft-3d_20210822_095429-0b3d6196.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pointpillars/hv_pointpillars_fpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_fpn_sbn-all_2x8_2x_lyft-3d_20210822_095429.log.json)|
+| Backbone | Lr schd | Mem (GB) | Inf time (fps) | Private Score | Public Score | Download |
+| :----------------------------------------------------------: | :-----: | :------: | :------------: | :-----------: | :----------: | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [SECFPN](./hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d.py) | 2x | 12.2 | | 13.8 | 14.1 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d_20210829_100455-82b81c39.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d_20210829_100455.log.json) |
+| [FPN](./hv_pointpillars_fpn_sbn-all_2x8_2x_lyft-3d.py) | 2x | 9.2 | | 14.8 | 15.0 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pointpillars/hv_pointpillars_fpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_fpn_sbn-all_2x8_2x_lyft-3d_20210822_095429-0b3d6196.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pointpillars/hv_pointpillars_fpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_fpn_sbn-all_2x8_2x_lyft-3d_20210822_095429.log.json) |
### Waymo
-| Backbone | Load Interval | Class | Lr schd | Mem (GB) | Inf time (fps) | mAP@L1 | mAPH@L1 | mAP@L2 | **mAPH@L2** | Download |
-| :-------: | :-----------: |:-----:| :------:| :------: | :------------: | :----: | :-----: | :-----: | :-----: | :------: |
-| [SECFPN](./hv_pointpillars_secfpn_sbn_2x16_2x_waymoD5-3d-car.py)|5|Car|2x|7.76||70.2|69.6|62.6|62.1|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn_2x16_2x_waymoD5-3d-car/hv_pointpillars_secfpn_sbn_2x16_2x_waymoD5-3d-car_20200901_204315-302fc3e7.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn_2x16_2x_waymoD5-3d-car/hv_pointpillars_secfpn_sbn_2x16_2x_waymoD5-3d-car_20200901_204315.log.json)|
-| [SECFPN](./hv_pointpillars_secfpn_sbn_2x16_2x_waymoD5-3d-3class.py)|5|3 Class|2x|8.12||64.7|57.6|58.4|52.1|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn_2x16_2x_waymoD5-3d-3class/hv_pointpillars_secfpn_sbn_2x16_2x_waymoD5-3d-3class_20200831_204144-d1a706b1.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn_2x16_2x_waymoD5-3d-3class/hv_pointpillars_secfpn_sbn_2x16_2x_waymoD5-3d-3class_20200831_204144.log.json)|
-| above @ Car|||2x|8.12||68.5|67.9|60.1|59.6| |
-| above @ Pedestrian|||2x|8.12||67.8|50.6|59.6|44.3| |
-| above @ Cyclist|||2x|8.12||57.7|54.4|55.5|52.4| |
-| [SECFPN](./hv_pointpillars_secfpn_sbn_2x16_2x_waymo-3d-car.py)|1|Car|2x|7.76||72.1|71.5|63.6|63.1|[log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn_2x16_2x_waymo-3d-car/hv_pointpillars_secfpn_sbn_2x16_2x_waymo-3d-car.log.json)|
-| [SECFPN](./hv_pointpillars_secfpn_sbn_2x16_2x_waymo-3d-3class.py)|1|3 Class|2x|8.12||68.8|63.3|62.6|57.6|[log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn_2x16_2x_waymo-3d-3class/hv_pointpillars_secfpn_sbn_2x16_2x_waymo-3d-3class.log.json)|
-| above @ Car|||2x|8.12||71.6|71.0|63.1|62.5| |
-| above @ Pedestrian|||2x|8.12||70.6|56.7|62.9|50.2| |
-| above @ Cyclist|||2x|8.12||64.4|62.3|61.9|59.9| |
+| Backbone | Load Interval | Class | Lr schd | Mem (GB) | Inf time (fps) | mAP@L1 | mAPH@L1 | mAP@L2 | **mAPH@L2** | Download |
+| :-----------------------------------------------------------------: | :-----------: | :-----: | :-----: | :------: | :------------: | :----: | :-----: | :----: | :---------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [SECFPN](./hv_pointpillars_secfpn_sbn_2x16_2x_waymoD5-3d-car.py) | 5 | Car | 2x | 7.76 | | 70.2 | 69.6 | 62.6 | 62.1 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn_2x16_2x_waymoD5-3d-car/hv_pointpillars_secfpn_sbn_2x16_2x_waymoD5-3d-car_20200901_204315-302fc3e7.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn_2x16_2x_waymoD5-3d-car/hv_pointpillars_secfpn_sbn_2x16_2x_waymoD5-3d-car_20200901_204315.log.json) |
+| [SECFPN](./hv_pointpillars_secfpn_sbn_2x16_2x_waymoD5-3d-3class.py) | 5 | 3 Class | 2x | 8.12 | | 64.7 | 57.6 | 58.4 | 52.1 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn_2x16_2x_waymoD5-3d-3class/hv_pointpillars_secfpn_sbn_2x16_2x_waymoD5-3d-3class_20200831_204144-d1a706b1.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn_2x16_2x_waymoD5-3d-3class/hv_pointpillars_secfpn_sbn_2x16_2x_waymoD5-3d-3class_20200831_204144.log.json) |
+| above @ Car | | | 2x | 8.12 | | 68.5 | 67.9 | 60.1 | 59.6 | |
+| above @ Pedestrian | | | 2x | 8.12 | | 67.8 | 50.6 | 59.6 | 44.3 | |
+| above @ Cyclist | | | 2x | 8.12 | | 57.7 | 54.4 | 55.5 | 52.4 | |
+| [SECFPN](./hv_pointpillars_secfpn_sbn_2x16_2x_waymo-3d-car.py) | 1 | Car | 2x | 7.76 | | 72.1 | 71.5 | 63.6 | 63.1 | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn_2x16_2x_waymo-3d-car/hv_pointpillars_secfpn_sbn_2x16_2x_waymo-3d-car.log.json) |
+| [SECFPN](./hv_pointpillars_secfpn_sbn_2x16_2x_waymo-3d-3class.py) | 1 | 3 Class | 2x | 8.12 | | 68.8 | 63.3 | 62.6 | 57.6 | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn_2x16_2x_waymo-3d-3class/hv_pointpillars_secfpn_sbn_2x16_2x_waymo-3d-3class.log.json) |
+| above @ Car | | | 2x | 8.12 | | 71.6 | 71.0 | 63.1 | 62.5 | |
+| above @ Pedestrian | | | 2x | 8.12 | | 70.6 | 56.7 | 62.9 | 50.2 | |
+| above @ Cyclist | | | 2x | 8.12 | | 64.4 | 62.3 | 61.9 | 59.9 | |
#### Note:
- **Metric**: For models trained with 3 classes, the average APH@L2 (mAPH@L2) over all categories is reported and used to rank the model. For models trained with only 1 class, the APH@L2 is reported and used to rank the model.
- **Data Split**: Here we provide several baselines for the Waymo dataset, among which D5 means that we divide the dataset into 5 folds and only use one fold for efficient experiments. Using the complete dataset can boost performance considerably, especially for the detection of cyclists and pedestrians, where an improvement of more than 5 mAP or mAPH can be expected.
- **Implementation Details**: We basically follow the implementation in the [paper](https://arxiv.org/pdf/1912.04838.pdf) in terms of the network architecture (having a
-stride of 1 for the first convolutional block). Different settings of voxelization, data augmentation and hyper parameters make these baselines outperform those in the paper by about 7 mAP for car and 4 mAP for pedestrian with only a subset of the whole dataset. All of these results are achieved without bells-and-whistles, e.g. ensemble, multi-scale training and test augmentation.
+  stride of 1 for the first convolutional block). Different settings of voxelization, data augmentation and hyperparameters make these baselines outperform those in the paper by about 7 mAP for cars and 4 mAP for pedestrians with only a subset of the whole dataset. All of these results are achieved without bells and whistles, e.g. ensembling, multi-scale training and test-time augmentation.
- **License Agreement**: To comply with the [license agreement of the Waymo dataset](https://waymo.com/open/terms/), the pre-trained models on the Waymo dataset are not released. We still release the training logs as a reference to ease future research.
- `FP16` means Mixed Precision (FP16) is adopted in training. With mixed precision training, we can train PointPillars on the nuScenes dataset on 8 Titan XP GPUs with a batch size of 2, which would otherwise cause an OOM error. The loss scale for PointPillars on the nuScenes dataset is specifically tuned to prevent the loss from becoming NaN. We find a loss scale of 32 more stable than 512, though 32 can still occasionally produce NaN.
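+
+- As a rough illustration of the loss-scale note above: the FP16 configs set the scale through an mmdet-style `fp16` field. The snippet below is only a sketch of that convention (the field name is assumed); the config files in this repo are the authoritative reference.
+
+  ```python
+  # Sketch of the mixed-precision setting discussed above (assumed mmdet-style config field).
+  # A fixed loss scale of 32 was found more stable than 512 for PointPillars on nuScenes,
+  # though NaN losses can still occur occasionally.
+  fp16 = dict(loss_scale=32.)
+  ```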
diff --git a/configs/regnet/README.md b/configs/regnet/README.md
index f981123937..f15b94fe2d 100644
--- a/configs/regnet/README.md
+++ b/configs/regnet/README.md
@@ -21,6 +21,7 @@ The pre-trained models are converted from [model zoo of pycls](https://github.co
## Usage
To use a regnet model, there are two steps:
+
1. Convert the model to ResNet-style supported by MMDetection
2. Modify backbone and neck in config accordingly
@@ -34,8 +35,8 @@ ResNet-style checkpoints used in MMDetection.
```bash
python -u tools/model_converters/regnet2mmdet.py ${PRETRAIN_PATH} ${STORE_PATH}
```
-This script convert model from `PRETRAIN_PATH` and store the converted model in `STORE_PATH`.
+This script converts the model from `PRETRAIN_PATH` and stores the converted model in `STORE_PATH`.
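+
+As an optional sanity check, the converted checkpoint can be inspected with plain PyTorch to confirm that it now uses ResNet-style parameter names. The paths below are hypothetical placeholders; substitute the `STORE_PATH` you passed to the script.
+
+```python
+# Hypothetical paths for illustration only.
+import torch
+
+ckpt = torch.load('work_dirs/regnetx_400mf_converted.pth', map_location='cpu')
+state = ckpt.get('state_dict', ckpt)  # the weights may or may not be nested under 'state_dict'
+print(sorted(state.keys())[:5])  # expect ResNet-style parameter names after conversion
+```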
### Modify config
@@ -50,22 +51,22 @@ For other pre-trained models or self-implemented regnet models, the users are re
### nuScenes
-| Backbone | Lr schd | Mem (GB) | Inf time (fps) | mAP |NDS| Download |
-| :---------: | :-----: | :------: | :------------: | :----: |:----: | :------: |
-|[SECFPN](../pointpillars/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d.py)|2x|16.4||35.17|49.7|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d_20200620_230725-0817d270.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d_20200620_230725.log.json)|
-|[RegNetX-400MF-SECFPN](./hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d.py)| 2x |16.4||41.2|55.2|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d_20200620_230334-53044f32.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d_20200620_230334.log.json)|
-|[FPN](../pointpillars/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d.py)|2x|17.1||40.0|53.3|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d_20200620_230405-2fa62f3d.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d_20200620_230405.log.json)|
-|[RegNetX-400MF-FPN](./hv_pointpillars_regnet-400mf_fpn_sbn-all_4x8_2x_nus-3d.py)|2x|17.3||44.8|56.4|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_fpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_regnet-400mf_fpn_sbn-all_4x8_2x_nus-3d_20200620_230239-c694dce7.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_fpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_regnet-400mf_fpn_sbn-all_4x8_2x_nus-3d_20200620_230239.log.json)|
-|[RegNetX-1.6gF-FPN](./hv_pointpillars_regnet-1.6gf_fpn_sbn-all_4x8_2x_nus-3d.py)|2x|24.0||48.2|59.3|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-1.6gf_fpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_regnet-1.6gf_fpn_sbn-all_4x8_2x_nus-3d_20200629_050311-dcd4e090.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-1.6gf_fpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_regnet-1.6gf_fpn_sbn-all_4x8_2x_nus-3d_20200629_050311.log.json)|
+| Backbone | Lr schd | Mem (GB) | Inf time (fps) | mAP | NDS | Download |
+| :------------------------------------------------------------------------------------: | :-----: | :------: | :------------: | :---: | :--: | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [SECFPN](../pointpillars/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d.py) | 2x | 16.4 | | 35.17 | 49.7 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d_20200620_230725-0817d270.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d_20200620_230725.log.json) |
+| [RegNetX-400MF-SECFPN](./hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d.py) | 2x | 16.4 | | 41.2 | 55.2 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d_20200620_230334-53044f32.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d_20200620_230334.log.json) |
+| [FPN](../pointpillars/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d.py) | 2x | 17.1 | | 40.0 | 53.3 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d_20200620_230405-2fa62f3d.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_fpn_sbn-all_4x8_2x_nus-3d_20200620_230405.log.json) |
+| [RegNetX-400MF-FPN](./hv_pointpillars_regnet-400mf_fpn_sbn-all_4x8_2x_nus-3d.py) | 2x | 17.3 | | 44.8 | 56.4 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_fpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_regnet-400mf_fpn_sbn-all_4x8_2x_nus-3d_20200620_230239-c694dce7.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_fpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_regnet-400mf_fpn_sbn-all_4x8_2x_nus-3d_20200620_230239.log.json) |
+| [RegNetX-1.6gF-FPN](./hv_pointpillars_regnet-1.6gf_fpn_sbn-all_4x8_2x_nus-3d.py) | 2x | 24.0 | | 48.2 | 59.3 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-1.6gf_fpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_regnet-1.6gf_fpn_sbn-all_4x8_2x_nus-3d_20200629_050311-dcd4e090.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-1.6gf_fpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_regnet-1.6gf_fpn_sbn-all_4x8_2x_nus-3d_20200629_050311.log.json) |
### Lyft
-| Backbone | Lr schd | Mem (GB) | Inf time (fps) | Private Score | Public Score | Download |
-| :---------: | :-----: | :------: | :------------: | :----: |:----: | :------: |
-|[SECFPN](../pointpillars/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d.py)|2x|12.2||13.9|14.1|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d_20210517_204807-2518e3de.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d_20210517_204807.log.json)|
-|[RegNetX-400MF-SECFPN](./hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_lyft-3d.py)| 2x |15.9||14.9|15.1|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_secfpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_regnet-400mf_secfpn_sbn-all_2x8_2x_lyft-3d_20210524_092151-42513826.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_secfpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_regnet-400mf_secfpn_sbn-all_2x8_2x_lyft-3d_20210524_092151.log.json)|
-|[FPN](../pointpillars/hv_pointpillars_fpn_sbn-all_2x8_2x_lyft-3d.py)|2x|9.2||14.9|15.1|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_fpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_fpn_sbn-all_2x8_2x_lyft-3d_20210517_202818-fc6904c3.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_fpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_fpn_sbn-all_2x8_2x_lyft-3d_20210517_202818.log.json)|
-|[RegNetX-400MF-FPN](./hv_pointpillars_regnet-400mf_fpn_sbn-all_4x8_2x_lyft-3d.py)|2x|13.0||16.0|16.1|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_fpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_regnet-400mf_fpn_sbn-all_2x8_2x_lyft-3d_20210521_115618-823dcf18.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_fpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_regnet-400mf_fpn_sbn-all_2x8_2x_lyft-3d_20210521_115618.log.json)|
+| Backbone | Lr schd | Mem (GB) | Inf time (fps) | Private Score | Public Score | Download |
+| :-------------------------------------------------------------------------------------: | :-----: | :------: | :------------: | :-----------: | :----------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [SECFPN](../pointpillars/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d.py) | 2x | 12.2 | | 13.9 | 14.1 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d_20210517_204807-2518e3de.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d_20210517_204807.log.json) |
+| [RegNetX-400MF-SECFPN](./hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_lyft-3d.py) | 2x | 15.9 | | 14.9 | 15.1 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_secfpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_regnet-400mf_secfpn_sbn-all_2x8_2x_lyft-3d_20210524_092151-42513826.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_secfpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_regnet-400mf_secfpn_sbn-all_2x8_2x_lyft-3d_20210524_092151.log.json) |
+| [FPN](../pointpillars/hv_pointpillars_fpn_sbn-all_2x8_2x_lyft-3d.py) | 2x | 9.2 | | 14.9 | 15.1 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_fpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_fpn_sbn-all_2x8_2x_lyft-3d_20210517_202818-fc6904c3.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_fpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_fpn_sbn-all_2x8_2x_lyft-3d_20210517_202818.log.json) |
+| [RegNetX-400MF-FPN](./hv_pointpillars_regnet-400mf_fpn_sbn-all_4x8_2x_lyft-3d.py) | 2x | 13.0 | | 16.0 | 16.1 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_fpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_regnet-400mf_fpn_sbn-all_2x8_2x_lyft-3d_20210521_115618-823dcf18.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_fpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_regnet-400mf_fpn_sbn-all_2x8_2x_lyft-3d_20210521_115618.log.json) |
## Citation
diff --git a/configs/second/README.md b/configs/second/README.md
index fdbdbd0bf6..1aa9650126 100644
--- a/configs/second/README.md
+++ b/configs/second/README.md
@@ -20,21 +20,21 @@ We implement SECOND and provide the results and checkpoints on KITTI dataset.
### KITTI
-| Backbone |Class| Lr schd | Mem (GB) | Inf time (fps) | mAP |Download |
-| :---------: | :-----: | :------: | :------------: | :----: |:----: | :------: |
-| [SECFPN](./hv_second_secfpn_6x8_80e_kitti-3d-car.py)| Car |cyclic 80e|5.4||79.07|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/second/hv_second_secfpn_6x8_80e_kitti-3d-car/hv_second_secfpn_6x8_80e_kitti-3d-car_20200620_230238-393f000c.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/second/hv_second_secfpn_6x8_80e_kitti-3d-car/hv_second_secfpn_6x8_80e_kitti-3d-car_20200620_230238.log.json)|
-| [SECFPN (FP16)](./hv_second_secfpn_fp16_6x8_80e_kitti-3d-car.py)| Car |cyclic 80e|2.9||78.72|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fp16/hv_second_secfpn_fp16_6x8_80e_kitti-3d-car/hv_second_secfpn_fp16_6x8_80e_kitti-3d-car_20200924_211301-1f5ad833.pth)| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fp16/hv_second_secfpn_fp16_6x8_80e_kitti-3d-car/hv_second_secfpn_fp16_6x8_80e_kitti-3d-car_20200924_211301.log.json)|
-| [SECFPN](./hv_second_secfpn_6x8_80e_kitti-3d-3class.py)| 3 Class |cyclic 80e|5.4||65.74|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/second/hv_second_secfpn_6x8_80e_kitti-3d-3class/hv_second_secfpn_6x8_80e_kitti-3d-3class_20210831_022017-ae782e87.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/second/hv_second_secfpn_6x8_80e_kitti-3d-3class/hv_second_secfpn_6x8_80e_kitti-3d-3class_20210831_022017log.json)|
-| [SECFPN (FP16)](./hv_second_secfpn_fp16_6x8_80e_kitti-3d-3class.py)| 3 Class |cyclic 80e|2.9||67.4|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fp16/hv_second_secfpn_fp16_6x8_80e_kitti-3d-3class/hv_second_secfpn_fp16_6x8_80e_kitti-3d-3class_20200925_110059-05f67bdf.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fp16/hv_second_secfpn_fp16_6x8_80e_kitti-3d-3class/hv_second_secfpn_fp16_6x8_80e_kitti-3d-3class_20200925_110059.log.json)|
+| Backbone | Class | Lr schd | Mem (GB) | Inf time (fps) | mAP | Download |
+| :-----------------------------------------------------------------: | :-----: | :--------: | :------: | :------------: | :---: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [SECFPN](./hv_second_secfpn_6x8_80e_kitti-3d-car.py) | Car | cyclic 80e | 5.4 | | 79.07 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/second/hv_second_secfpn_6x8_80e_kitti-3d-car/hv_second_secfpn_6x8_80e_kitti-3d-car_20200620_230238-393f000c.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/second/hv_second_secfpn_6x8_80e_kitti-3d-car/hv_second_secfpn_6x8_80e_kitti-3d-car_20200620_230238.log.json) |
+| [SECFPN (FP16)](./hv_second_secfpn_fp16_6x8_80e_kitti-3d-car.py) | Car | cyclic 80e | 2.9 | | 78.72 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fp16/hv_second_secfpn_fp16_6x8_80e_kitti-3d-car/hv_second_secfpn_fp16_6x8_80e_kitti-3d-car_20200924_211301-1f5ad833.pth)\| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fp16/hv_second_secfpn_fp16_6x8_80e_kitti-3d-car/hv_second_secfpn_fp16_6x8_80e_kitti-3d-car_20200924_211301.log.json) |
+| [SECFPN](./hv_second_secfpn_6x8_80e_kitti-3d-3class.py) | 3 Class | cyclic 80e | 5.4 | | 65.74 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/second/hv_second_secfpn_6x8_80e_kitti-3d-3class/hv_second_secfpn_6x8_80e_kitti-3d-3class_20210831_022017-ae782e87.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/second/hv_second_secfpn_6x8_80e_kitti-3d-3class/hv_second_secfpn_6x8_80e_kitti-3d-3class_20210831_022017log.json) |
+| [SECFPN (FP16)](./hv_second_secfpn_fp16_6x8_80e_kitti-3d-3class.py) | 3 Class | cyclic 80e | 2.9 | | 67.4 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fp16/hv_second_secfpn_fp16_6x8_80e_kitti-3d-3class/hv_second_secfpn_fp16_6x8_80e_kitti-3d-3class_20200925_110059-05f67bdf.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/fp16/hv_second_secfpn_fp16_6x8_80e_kitti-3d-3class/hv_second_secfpn_fp16_6x8_80e_kitti-3d-3class_20200925_110059.log.json) |
### Waymo
-| Backbone | Load Interval | Class | Lr schd | Mem (GB) | Inf time (fps) | mAP@L1 | mAPH@L1 | mAP@L2 | **mAPH@L2** | Download |
-| :-------: | :-----------: |:-----:| :------:| :------: | :------------: | :----: | :-----: | :-----: | :-----: | :------: |
-| [SECFPN](./hv_second_secfpn_sbn_2x16_2x_waymoD5-3d-3class.py)|5|3 Class|2x|8.12||65.3|61.7|58.9|55.7|[log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/second/hv_second_secfpn_sbn_4x8_2x_waymoD5-3d-3class/hv_second_secfpn_sbn_4x8_2x_waymoD5-3d-3class_20201115_112448.log.json)|
-| above @ Car|||2x|8.12||67.1|66.6|58.7|58.2| |
-| above @ Pedestrian|||2x|8.12||68.1|59.1|59.5|51.5| |
-| above @ Cyclist|||2x|8.12||60.7|59.5|58.4|57.3| |
+| Backbone | Load Interval | Class | Lr schd | Mem (GB) | Inf time (fps) | mAP@L1 | mAPH@L1 | mAP@L2 | **mAPH@L2** | Download |
+| :-----------------------------------------------------------: | :-----------: | :-----: | :-----: | :------: | :------------: | :----: | :-----: | :----: | :---------: | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [SECFPN](./hv_second_secfpn_sbn_2x16_2x_waymoD5-3d-3class.py) | 5 | 3 Class | 2x | 8.12 | | 65.3 | 61.7 | 58.9 | 55.7 | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/second/hv_second_secfpn_sbn_4x8_2x_waymoD5-3d-3class/hv_second_secfpn_sbn_4x8_2x_waymoD5-3d-3class_20201115_112448.log.json) |
+| above @ Car | | | 2x | 8.12 | | 67.1 | 66.6 | 58.7 | 58.2 | |
+| above @ Pedestrian | | | 2x | 8.12 | | 68.1 | 59.1 | 59.5 | 51.5 | |
+| above @ Cyclist | | | 2x | 8.12 | | 60.7 | 59.5 | 58.4 | 57.3 | |
Note:
diff --git a/configs/smoke/README.md b/configs/smoke/README.md
index 9c3a4bf3d8..8d91314d14 100644
--- a/configs/smoke/README.md
+++ b/configs/smoke/README.md
@@ -20,19 +20,19 @@ We implement SMOKE and provide the results and checkpoints on KITTI dataset.
### KITTI
-| Backbone | Lr schd | Mem (GB) | Inf time (fps) | mAP | Download |
-| :---------: | :-----: | :------: | :------------: | :----: | :------: |
-|[DLA34](./smoke_dla34_pytorch_dlaneck_gn-all_8x4_6x_kitti-mono3d.py)|6x|9.64||13.85|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/smoke/smoke_dla34_pytorch_dlaneck_gn-all_8x4_6x_kitti-mono3d_20210929_015553-d46d9bb0.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/smoke/smoke_dla34_pytorch_dlaneck_gn-all_8x4_6x_kitti-mono3d_20210929_015553.log.json)
+| Backbone | Lr schd | Mem (GB) | Inf time (fps) | mAP | Download |
+| :------------------------------------------------------------------: | :-----: | :------: | :------------: | :---: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [DLA34](./smoke_dla34_pytorch_dlaneck_gn-all_8x4_6x_kitti-mono3d.py) | 6x | 9.64 | | 13.85 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/smoke/smoke_dla34_pytorch_dlaneck_gn-all_8x4_6x_kitti-mono3d_20210929_015553-d46d9bb0.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/smoke/smoke_dla34_pytorch_dlaneck_gn-all_8x4_6x_kitti-mono3d_20210929_015553.log.json) |
Note: mAP represents Car moderate 3D strict AP11 results.
Detailed performance on KITTI 3D detection (3D/BEV) is as follows, evaluated by the AP11 metric:
-| | Easy | Moderate | Hard |
-|-------------|:-------------:|:--------------:|:------------:|
-| Car | 16.92 / 22.97 | 13.85 / 18.32 | 11.90 / 15.88|
-| Pedestrian | 11.13 / 12.61| 11.10 / 11.32 | 10.67 / 11.14|
-| Cyclist | 0.99 / 1.47 | 0.54 / 0.65 | 0.55 / 0.67 |
+| | Easy | Moderate | Hard |
+| ---------- | :-----------: | :-----------: | :-----------: |
+| Car | 16.92 / 22.97 | 13.85 / 18.32 | 11.90 / 15.88 |
+| Pedestrian | 11.13 / 12.61 | 11.10 / 11.32 | 10.67 / 11.14 |
+| Cyclist | 0.99 / 1.47 | 0.54 / 0.65 | 0.55 / 0.67 |
## Citation
diff --git a/configs/ssn/README.md b/configs/ssn/README.md
index 2327e5a7fe..dad03f8655 100644
--- a/configs/ssn/README.md
+++ b/configs/ssn/README.md
@@ -20,20 +20,20 @@ We implement PointPillars with Shape-aware grouping heads used in the SSN and pr
### NuScenes
-| Backbone | Lr schd | Mem (GB) | Inf time (fps) | mAP | NDS | Download |
-| :---------: | :-----: | :------: | :------------: | :----: |:----: | :------: |
-|[SECFPN](../pointpillars/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d.py)|2x|16.4||35.17|49.76|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d_20200620_230725-0817d270.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d_20200620_230725.log.json)|
-|[SSN](./hv_ssn_secfpn_sbn-all_2x16_2x_nus-3d.py)|2x|3.6||40.91|54.44|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/ssn/hv_ssn_secfpn_sbn-all_2x16_2x_nus-3d/hv_ssn_secfpn_sbn-all_2x16_2x_nus-3d_20210830_101351-51915986.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/ssn/hv_ssn_secfpn_sbn-all_2x16_2x_nus-3d/hv_ssn_secfpn_sbn-all_2x16_2x_nus-3d_20210830_101351.log.json)|
-[RegNetX-400MF-SECFPN](../regnet/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d.py)|2x|16.4||41.15|55.20|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d_20200620_230334-53044f32.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d_20200620_230334.log.json)|
-|[RegNetX-400MF-SSN](./hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_nus-3d.py)|2x|5.1||46.65|58.24|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/ssn/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_nus-3d/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_nus-3d_20210829_210615-361e5e04.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/ssn/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_nus-3d/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_nus-3d_20210829_210615.log.json)|
+| Backbone | Lr schd | Mem (GB) | Inf time (fps) | mAP | NDS | Download |
+| :--------------------------------------------------------------------------------------------: | :-----: | :------: | :------------: | :---: | :---: | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [SECFPN](../pointpillars/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d.py) | 2x | 16.4 | | 35.17 | 49.76 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d_20200620_230725-0817d270.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d_20200620_230725.log.json) |
+| [SSN](./hv_ssn_secfpn_sbn-all_2x16_2x_nus-3d.py) | 2x | 3.6 | | 40.91 | 54.44 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/ssn/hv_ssn_secfpn_sbn-all_2x16_2x_nus-3d/hv_ssn_secfpn_sbn-all_2x16_2x_nus-3d_20210830_101351-51915986.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/ssn/hv_ssn_secfpn_sbn-all_2x16_2x_nus-3d/hv_ssn_secfpn_sbn-all_2x16_2x_nus-3d_20210830_101351.log.json) |
+| [RegNetX-400MF-SECFPN](../regnet/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d.py) | 2x | 16.4 | | 41.15 | 55.20 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d_20200620_230334-53044f32.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/regnet/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d/hv_pointpillars_regnet-400mf_secfpn_sbn-all_4x8_2x_nus-3d_20200620_230334.log.json) |
+| [RegNetX-400MF-SSN](./hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_nus-3d.py) | 2x | 5.1 | | 46.65 | 58.24 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/ssn/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_nus-3d/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_nus-3d_20210829_210615-361e5e04.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/ssn/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_nus-3d/hv_ssn_regnet-400mf_secfpn_sbn-all_2x16_2x_nus-3d_20210829_210615.log.json) |
### Lyft
-| Backbone | Lr schd | Mem (GB) | Inf time (fps) | Private Score | Public Score | Download |
-| :---------: | :-----: | :------: | :------------: | :----: |:----: | :------: |
-|[SECFPN](../pointpillars/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d.py)|2x|12.2||13.9|14.1|[model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d_20210517_204807-2518e3de.pth) | [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d_20210517_204807.log.json)|
-|[SSN](./hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d.py)|2x|8.5||17.5|17.5|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/ssn/hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d/hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d_20210822_134731-46841b41.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/ssn/hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d/hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d_20210822_134731.log.json)|
-|[RegNetX-400MF-SSN](./hv_ssn_regnet-400mf_secfpn_sbn-all_1x16_2x_lyft-3d.py)|2x|7.4||17.9|18|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/ssn/hv_ssn_regnet-400mf_secfpn_sbn-all_1x16_2x_lyft-3d/hv_ssn_regnet-400mf_secfpn_sbn-all_1x16_2x_lyft-3d_20210829_122825-d93475a1.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/ssn/hv_ssn_regnet-400mf_secfpn_sbn-all_1x16_2x_lyft-3d/hv_ssn_regnet-400mf_secfpn_sbn-all_1x16_2x_lyft-3d_20210829_122825.log.json)|
+| Backbone | Lr schd | Mem (GB) | Inf time (fps) | Private Score | Public Score | Download |
+| :--------------------------------------------------------------------------: | :-----: | :------: | :------------: | :-----------: | :----------: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [SECFPN](../pointpillars/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d.py) | 2x | 12.2 | | 13.9 | 14.1 | [model](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d_20210517_204807-2518e3de.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d/hv_pointpillars_secfpn_sbn-all_2x8_2x_lyft-3d_20210517_204807.log.json) |
+| [SSN](./hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d.py) | 2x | 8.5 | | 17.5 | 17.5 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/ssn/hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d/hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d_20210822_134731-46841b41.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/ssn/hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d/hv_ssn_secfpn_sbn-all_2x16_2x_lyft-3d_20210822_134731.log.json) |
+| [RegNetX-400MF-SSN](./hv_ssn_regnet-400mf_secfpn_sbn-all_1x16_2x_lyft-3d.py) | 2x | 7.4 | | 17.9 | 18 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/ssn/hv_ssn_regnet-400mf_secfpn_sbn-all_1x16_2x_lyft-3d/hv_ssn_regnet-400mf_secfpn_sbn-all_1x16_2x_lyft-3d_20210829_122825-d93475a1.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/ssn/hv_ssn_regnet-400mf_secfpn_sbn-all_1x16_2x_lyft-3d/hv_ssn_regnet-400mf_secfpn_sbn-all_1x16_2x_lyft-3d_20210829_122825.log.json) |
Note:
diff --git a/configs/votenet/README.md b/configs/votenet/README.md
index 694e552cca..d74486f033 100644
--- a/configs/votenet/README.md
+++ b/configs/votenet/README.md
@@ -20,17 +20,17 @@ We implement VoteNet and provide the result and checkpoints on ScanNet and SUNRG
### ScanNet
-| Backbone | Lr schd | Mem (GB) | Inf time (fps) | AP@0.25 |AP@0.5| Download |
-| :---------: | :-----: | :------: | :------------: | :----: |:----: | :------: |
-| [PointNet++](./votenet_8x8_scannet-3d-18class.py) | 3x |4.1||62.34|40.82|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/votenet/votenet_8x8_scannet-3d-18class/votenet_8x8_scannet-3d-18class_20210823_234503-cf8134fa.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/votenet/votenet_8x8_scannet-3d-18class/votenet_8x8_scannet-3d-18class_20210823_234503.log.json)|
+| Backbone | Lr schd | Mem (GB) | Inf time (fps) | AP@0.25 | AP@0.5 | Download |
+| :-----------------------------------------------: | :-----: | :------: | :------------: | :-----: | :----: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [PointNet++](./votenet_8x8_scannet-3d-18class.py) | 3x | 4.1 | | 62.34 | 40.82 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/votenet/votenet_8x8_scannet-3d-18class/votenet_8x8_scannet-3d-18class_20210823_234503-cf8134fa.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/votenet/votenet_8x8_scannet-3d-18class/votenet_8x8_scannet-3d-18class_20210823_234503.log.json) |
### SUNRGBD
-| Backbone | Lr schd | Mem (GB) | Inf time (fps) | AP@0.25 |AP@0.5| Download |
-| :---------: | :-----: | :------: | :------------: | :----: |:----: | :------: |
-| [PointNet++](./votenet_16x8_sunrgbd-3d-10class.py) | 3x |8.1||59.78|35.77|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/votenet/votenet_16x8_sunrgbd-3d-10class/votenet_16x8_sunrgbd-3d-10class_20210820_162823-bf11f014.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/votenet/votenet_16x8_sunrgbd-3d-10class/votenet_16x8_sunrgbd-3d-10class_20210820_162823.log.json)|
+| Backbone | Lr schd | Mem (GB) | Inf time (fps) | AP@0.25 | AP@0.5 | Download |
+| :------------------------------------------------: | :-----: | :------: | :------------: | :-----: | :----: | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [PointNet++](./votenet_16x8_sunrgbd-3d-10class.py) | 3x | 8.1 | | 59.78 | 35.77 | [model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/votenet/votenet_16x8_sunrgbd-3d-10class/votenet_16x8_sunrgbd-3d-10class_20210820_162823-bf11f014.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/votenet/votenet_16x8_sunrgbd-3d-10class/votenet_16x8_sunrgbd-3d-10class_20210820_162823.log.json) |
-**Notice**: If your current mmdetection3d version >= 0.6.0, and you are using the checkpoints downloaded from the above links or using checkpoints trained with mmdetection3d version < 0.6.0, the checkpoints have to be first converted via [tools/model_converters/convert_votenet_checkpoints.py](../../tools/model_converters/convert_votenet_checkpoints.py):
+**Notice**: If your current mmdetection3d version >= 0.6.0 and you are using checkpoints downloaded from the above links or checkpoints trained with mmdetection3d version \< 0.6.0, the checkpoints have to be first converted via [tools/model_converters/convert_votenet_checkpoints.py](../../tools/model_converters/convert_votenet_checkpoints.py):
```
python ./tools/model_converters/convert_votenet_checkpoints.py ${ORIGINAL_CHECKPOINT_PATH} --out=${NEW_CHECKPOINT_PATH}
@@ -50,9 +50,9 @@ Adding IoU loss (simply = 1-IoU) boosts VoteNet's performance. To use IoU loss,
iou_loss=dict(type='AxisAlignedIoULoss', reduction='sum', loss_weight=10.0 / 3.0)
```
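For reference, below is a hedged sketch of how this term plugs into a config. It assumes, as in the linked `votenet_iouloss_8x8_scannet-3d-18class.py`, that the loss term sits under `model.bbox_head`; that file remains the authoritative reference.

```python
# A minimal sketch, assuming the IoU loss term is attached under
# model.bbox_head as in the linked config; not a verbatim copy of that file.
_base_ = ['./votenet_8x8_scannet-3d-18class.py']

model = dict(
    bbox_head=dict(
        iou_loss=dict(
            type='AxisAlignedIoULoss', reduction='sum', loss_weight=10.0 / 3.0)))
```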
-| Backbone | Lr schd | Mem (GB) | Inf time (fps) | AP@0.25 |AP@0.5| Download |
-| :---------: | :-----: | :------: | :------------: | :----: |:----: | :------: |
-| [PointNet++](./votenet_iouloss_8x8_scannet-3d-18class.py) | 3x |4.1||63.81|44.21|/|
+| Backbone | Lr schd | Mem (GB) | Inf time (fps) | AP@0.25 | AP@0.5 | Download |
+| :-------------------------------------------------------: | :-----: | :------: | :------------: | :-----: | :----: | :------: |
+| [PointNet++](./votenet_iouloss_8x8_scannet-3d-18class.py) | 3x | 4.1 | | 63.81 | 44.21 | / |
For now, we only support calculating IoU loss for axis-aligned bounding boxes since the CUDA op for general 3D IoU calculation does not implement the backward method. Therefore, IoU loss can currently only be used on the ScanNet dataset.
diff --git a/data/s3dis/README.md b/data/s3dis/README.md
index db9f1d4de1..61d75e1535 100644
--- a/data/s3dis/README.md
+++ b/data/s3dis/README.md
@@ -2,7 +2,7 @@
We follow the procedure in [pointnet](https://github.com/charlesq34/pointnet).
-1. Download S3DIS data by filling this [Google form](https://docs.google.com/forms/d/e/1FAIpQLScDimvNMCGhy_rmBA2gHfDu3naktRm6A8BPwAWWDv-Uhm6Shw/viewform?c=0&w=1). Download the ```Stanford3dDataset_v1.2_Aligned_Version.zip``` file and unzip it. Link or move the folder to this level of directory.
+1. Download S3DIS data by filling this [Google form](https://docs.google.com/forms/d/e/1FAIpQLScDimvNMCGhy_rmBA2gHfDu3naktRm6A8BPwAWWDv-Uhm6Shw/viewform?c=0&w=1). Download the `Stanford3dDataset_v1.2_Aligned_Version.zip` file and unzip it. Link or move the folder to this level of directory.
2. In this directory, extract point clouds and annotations by running `python collect_indoor3d_data.py`.
diff --git a/docs/en/1_exist_data_model.md b/docs/en/1_exist_data_model.md
index c4f777f99b..96039c93b2 100644
--- a/docs/en/1_exist_data_model.md
+++ b/docs/en/1_exist_data_model.md
@@ -32,6 +32,7 @@ python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [-
For now, CPU testing is only supported for SMOKE.
Optional arguments:
+
- `RESULT_FILE`: Filename of the output results in pickle format. If not specified, the results will not be saved to a file.
- `EVAL_METRICS`: Items to be evaluated on the results. Allowed values depend on the dataset. Typically we default to using the official metrics for evaluation on different datasets, so it can simply be set to `mAP` as a placeholder for detection tasks, which applies to nuScenes, Lyft, ScanNet and SUNRGBD. For KITTI, if we only want to evaluate the 2D detection performance, we can simply set the metric to `img_bbox` (unstable, stay tuned). For Waymo, we provide both KITTI-style evaluation (unstable) and the Waymo-style official protocol, corresponding to metrics `kitti` and `waymo` respectively. We recommend using the default official metric for stable performance and fair comparison with other methods. Similarly, the metric can be set to `mIoU` for segmentation tasks, which applies to S3DIS and ScanNet.
- `--show`: If specified, detection results will be plotted in silent mode. It is only applicable to single GPU testing and is used for debugging and visualization. This should be used with `--show-dir`.
@@ -182,6 +183,7 @@ Optional arguments are:
- `--options 'Key=value'`: Override some settings in the used config.
Difference between `resume-from` and `load-from`:
+
- `resume-from` loads both the model weights and optimizer status, and the epoch is also inherited from the specified checkpoint. It is usually used for resuming the training process that is interrupted accidentally.
- `load-from` only loads the model weights and the training epoch starts from 0. It is usually used for finetuning.
@@ -217,7 +219,6 @@ NNODES=2 NODE_RANK=1 PORT=$MASTER_PORT MASTER_ADDR=$MASTER_ADDR ./tools/dist_tra
Usually it is slow if you do not have high speed networking like InfiniBand.
-
### Launch multiple jobs on a single machine
If you launch multiple jobs on a single machine, e.g., 2 jobs of 4-GPU training on a machine with 8 GPUs,
diff --git a/docs/en/benchmarks.md b/docs/en/benchmarks.md
index bc3b2b37df..8c71b4059f 100644
--- a/docs/en/benchmarks.md
+++ b/docs/en/benchmarks.md
@@ -1,4 +1,3 @@
-
# Benchmarks
Here we benchmark the training and testing speed of models in MMDetection3D,
@@ -6,130 +5,130 @@ with some other open source 3D detection codebases.
## Settings
-* Hardwares: 8 NVIDIA Tesla V100 (32G) GPUs, Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
-* Software: Python 3.7, CUDA 10.1, cuDNN 7.6.5, PyTorch 1.3, numba 0.48.0.
-* Model: Since all the other codebases implements different models, we compare the corresponding models including SECOND, PointPillars, Part-A2, and VoteNet with them separately.
-* Metrics: We use the average throughput in iterations of the entire training run and skip the first 50 iterations of each epoch to skip GPU warmup time.
+- Hardware: 8 NVIDIA Tesla V100 (32G) GPUs, Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
+- Software: Python 3.7, CUDA 10.1, cuDNN 7.6.5, PyTorch 1.3, numba 0.48.0.
+- Model: Since all the other codebases implement different models, we compare the corresponding models including SECOND, PointPillars, Part-A2, and VoteNet with them separately.
+- Metrics: We use the average throughput in iterations of the entire training run and skip the first 50 iterations of each epoch to exclude GPU warmup time (see the sketch below).
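The sketch below is a rough, self-contained illustration of this metric (it is not code from any of the benchmarked repositories): it averages samples/s over an epoch after dropping the first 50 warmup iterations.

```python
# A minimal sketch of the metric described above; iter_times are per-iteration
# wall-clock durations in seconds and batch_size is samples per iteration.
def epoch_throughput(iter_times, batch_size, warmup_iters=50):
    measured = iter_times[warmup_iters:]
    return len(measured) * batch_size / sum(measured)

# e.g. epoch_throughput([0.5] * 500, 16) -> 32.0 samples/s
```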
## Main Results
We compare the training speed (samples/s) with other codebases that implement similar models. The results are shown below; the greater the number in the table, the faster the training process. Models that are not supported by other codebases are marked with `×`.
-| Methods | MMDetection3D | OpenPCDet |votenet| Det3D |
-|:-------:|:-------------:|:---------:|:-----:|:-----:|
-| VoteNet | 358 | × | 77 | × |
-| PointPillars-car| 141 | × | × | 140 |
-| PointPillars-3class| 107 |44 | × | × |
-| SECOND| 40 |30 | × | × |
-| Part-A2| 17 |14 | × | × |
+| Methods | MMDetection3D | OpenPCDet | votenet | Det3D |
+| :-----------------: | :-----------: | :-------: | :-----: | :---: |
+| VoteNet | 358 | × | 77 | × |
+| PointPillars-car | 141 | × | × | 140 |
+| PointPillars-3class | 107 | 44 | × | × |
+| SECOND | 40 | 30 | × | × |
+| Part-A2 | 17 | 14 | × | × |
## Details of Comparison
### Modification for Calculating Speed
-* __MMDetection3D__: We try to use as similar settings as those of other codebases as possible using [benchmark configs](https://github.com/open-mmlab/MMDetection3D/blob/master/configs/benchmark).
+- __MMDetection3D__: We try to use settings as similar as possible to those of the other codebases, using [benchmark configs](https://github.com/open-mmlab/MMDetection3D/blob/master/configs/benchmark).
+
+- __Det3D__: For comparison with Det3D, we use the commit [519251e](https://github.com/poodarchu/Det3D/tree/519251e72a5c1fdd58972eabeac67808676b9bb7).
-* __Det3D__: For comparison with Det3D, we use the commit [519251e](https://github.com/poodarchu/Det3D/tree/519251e72a5c1fdd58972eabeac67808676b9bb7).
+- __OpenPCDet__: For comparison with OpenPCDet, we use the commit [b32fbddb](https://github.com/open-mmlab/OpenPCDet/tree/b32fbddbe06183507bad433ed99b407cbc2175c2).
-* __OpenPCDet__: For comparison with OpenPCDet, we use the commit [b32fbddb](https://github.com/open-mmlab/OpenPCDet/tree/b32fbddbe06183507bad433ed99b407cbc2175c2).
+ For training speed, we add code to record the running time in the file `./tools/train_utils/train_utils.py`. We calculate the speed of each epoch, and report the average speed of all the epochs.
- For training speed, we add code to record the running time in the file `./tools/train_utils/train_utils.py`. We calculate the speed of each epoch, and report the average speed of all the epochs.
-
+
(diff to make it use the same method for benchmarking speed - click to expand)
- ```diff
- diff --git a/tools/train_utils/train_utils.py b/tools/train_utils/train_utils.py
- index 91f21dd..021359d 100644
- --- a/tools/train_utils/train_utils.py
- +++ b/tools/train_utils/train_utils.py
- @@ -2,6 +2,7 @@ import torch
- import os
- import glob
- import tqdm
- +import datetime
- from torch.nn.utils import clip_grad_norm_
-
-
- @@ -13,7 +14,10 @@ def train_one_epoch(model, optimizer, train_loader, model_func, lr_scheduler, ac
- if rank == 0:
- pbar = tqdm.tqdm(total=total_it_each_epoch, leave=leave_pbar, desc='train', dynamic_ncols=True)
-
- + start_time = None
- for cur_it in range(total_it_each_epoch):
- + if cur_it > 49 and start_time is None:
- + start_time = datetime.datetime.now()
- try:
- batch = next(dataloader_iter)
- except StopIteration:
- @@ -55,9 +59,11 @@ def train_one_epoch(model, optimizer, train_loader, model_func, lr_scheduler, ac
- tb_log.add_scalar('learning_rate', cur_lr, accumulated_iter)
- for key, val in tb_dict.items():
- tb_log.add_scalar('train_' + key, val, accumulated_iter)
- + endtime = datetime.datetime.now()
- + speed = (endtime - start_time).seconds / (total_it_each_epoch - 50)
- if rank == 0:
- pbar.close()
- - return accumulated_iter
- + return accumulated_iter, speed
-
-
- def train_model(model, optimizer, train_loader, model_func, lr_scheduler, optim_cfg,
- @@ -65,6 +71,7 @@ def train_model(model, optimizer, train_loader, model_func, lr_scheduler, optim_
- lr_warmup_scheduler=None, ckpt_save_interval=1, max_ckpt_save_num=50,
- merge_all_iters_to_one_epoch=False):
- accumulated_iter = start_iter
- + speeds = []
- with tqdm.trange(start_epoch, total_epochs, desc='epochs', dynamic_ncols=True, leave=(rank == 0)) as tbar:
- total_it_each_epoch = len(train_loader)
- if merge_all_iters_to_one_epoch:
- @@ -82,7 +89,7 @@ def train_model(model, optimizer, train_loader, model_func, lr_scheduler, optim_
- cur_scheduler = lr_warmup_scheduler
- else:
- cur_scheduler = lr_scheduler
- - accumulated_iter = train_one_epoch(
- + accumulated_iter, speed = train_one_epoch(
- model, optimizer, train_loader, model_func,
- lr_scheduler=cur_scheduler,
- accumulated_iter=accumulated_iter, optim_cfg=optim_cfg,
- @@ -91,7 +98,7 @@ def train_model(model, optimizer, train_loader, model_func, lr_scheduler, optim_
- total_it_each_epoch=total_it_each_epoch,
- dataloader_iter=dataloader_iter
- )
- -
- + speeds.append(speed)
- # save trained model
- trained_epoch = cur_epoch + 1
- if trained_epoch % ckpt_save_interval == 0 and rank == 0:
- @@ -107,6 +114,8 @@ def train_model(model, optimizer, train_loader, model_func, lr_scheduler, optim_
- save_checkpoint(
- checkpoint_state(model, optimizer, trained_epoch, accumulated_iter), filename=ckpt_name,
- )
- + print(speed)
- + print(f'*******{sum(speeds) / len(speeds)}******')
-
-
- def model_state_to_cpu(model_state):
- ```
-
-
+ ```diff
+ diff --git a/tools/train_utils/train_utils.py b/tools/train_utils/train_utils.py
+ index 91f21dd..021359d 100644
+ --- a/tools/train_utils/train_utils.py
+ +++ b/tools/train_utils/train_utils.py
+ @@ -2,6 +2,7 @@ import torch
+ import os
+ import glob
+ import tqdm
+ +import datetime
+ from torch.nn.utils import clip_grad_norm_
+
+
+ @@ -13,7 +14,10 @@ def train_one_epoch(model, optimizer, train_loader, model_func, lr_scheduler, ac
+ if rank == 0:
+ pbar = tqdm.tqdm(total=total_it_each_epoch, leave=leave_pbar, desc='train', dynamic_ncols=True)
+
+ + start_time = None
+ for cur_it in range(total_it_each_epoch):
+ + if cur_it > 49 and start_time is None:
+ + start_time = datetime.datetime.now()
+ try:
+ batch = next(dataloader_iter)
+ except StopIteration:
+ @@ -55,9 +59,11 @@ def train_one_epoch(model, optimizer, train_loader, model_func, lr_scheduler, ac
+ tb_log.add_scalar('learning_rate', cur_lr, accumulated_iter)
+ for key, val in tb_dict.items():
+ tb_log.add_scalar('train_' + key, val, accumulated_iter)
+ + endtime = datetime.datetime.now()
+ + speed = (endtime - start_time).seconds / (total_it_each_epoch - 50)
+ if rank == 0:
+ pbar.close()
+ - return accumulated_iter
+ + return accumulated_iter, speed
+
+
+ def train_model(model, optimizer, train_loader, model_func, lr_scheduler, optim_cfg,
+ @@ -65,6 +71,7 @@ def train_model(model, optimizer, train_loader, model_func, lr_scheduler, optim_
+ lr_warmup_scheduler=None, ckpt_save_interval=1, max_ckpt_save_num=50,
+ merge_all_iters_to_one_epoch=False):
+ accumulated_iter = start_iter
+ + speeds = []
+ with tqdm.trange(start_epoch, total_epochs, desc='epochs', dynamic_ncols=True, leave=(rank == 0)) as tbar:
+ total_it_each_epoch = len(train_loader)
+ if merge_all_iters_to_one_epoch:
+ @@ -82,7 +89,7 @@ def train_model(model, optimizer, train_loader, model_func, lr_scheduler, optim_
+ cur_scheduler = lr_warmup_scheduler
+ else:
+ cur_scheduler = lr_scheduler
+ - accumulated_iter = train_one_epoch(
+ + accumulated_iter, speed = train_one_epoch(
+ model, optimizer, train_loader, model_func,
+ lr_scheduler=cur_scheduler,
+ accumulated_iter=accumulated_iter, optim_cfg=optim_cfg,
+ @@ -91,7 +98,7 @@ def train_model(model, optimizer, train_loader, model_func, lr_scheduler, optim_
+ total_it_each_epoch=total_it_each_epoch,
+ dataloader_iter=dataloader_iter
+ )
+ -
+ + speeds.append(speed)
+ # save trained model
+ trained_epoch = cur_epoch + 1
+ if trained_epoch % ckpt_save_interval == 0 and rank == 0:
+ @@ -107,6 +114,8 @@ def train_model(model, optimizer, train_loader, model_func, lr_scheduler, optim_
+ save_checkpoint(
+ checkpoint_state(model, optimizer, trained_epoch, accumulated_iter), filename=ckpt_name,
+ )
+ + print(speed)
+ + print(f'*******{sum(speeds) / len(speeds)}******')
+
+
+ def model_state_to_cpu(model_state):
+ ```
+
+
### VoteNet
-* __MMDetection3D__: With release v0.1.0, run
+- __MMDetection3D__: With release v0.1.0, run
```bash
./tools/dist_train.sh configs/votenet/votenet_16x8_sunrgbd-3d-10class.py 8 --no-validate
```
-* __votenet__: At commit [2f6d6d3](https://github.com/facebookresearch/votenet/tree/2f6d6d36ff98d96901182e935afe48ccee82d566), run
+- __votenet__: At commit [2f6d6d3](https://github.com/facebookresearch/votenet/tree/2f6d6d36ff98d96901182e935afe48ccee82d566), run
```bash
python train.py --dataset sunrgbd --batch_size 16
```
-
Then benchmark the test speed by running
```bash
@@ -199,13 +198,13 @@ We compare the training speed (samples/s) with other codebases if they implement
### PointPillars-car
-* __MMDetection3D__: With release v0.1.0, run
+- __MMDetection3D__: With release v0.1.0, run
```bash
./tools/dist_train.sh configs/benchmark/hv_pointpillars_secfpn_3x8_100e_det3d_kitti-3d-car.py 8 --no-validate
```
-* __Det3D__: At commit [519251e](https://github.com/poodarchu/Det3D/tree/519251e72a5c1fdd58972eabeac67808676b9bb7), use `kitti_point_pillars_mghead_syncbn.py` and run
+- __Det3D__: At commit [519251e](https://github.com/poodarchu/Det3D/tree/519251e72a5c1fdd58972eabeac67808676b9bb7), use `kitti_point_pillars_mghead_syncbn.py` and run
```bash
./tools/scripts/train.sh --launcher=slurm --gpus=8
@@ -241,13 +240,13 @@ We compare the training speed (samples/s) with other codebases if they implement
### PointPillars-3class
-* __MMDetection3D__: With release v0.1.0, run
+- __MMDetection3D__: With release v0.1.0, run
```bash
./tools/dist_train.sh configs/benchmark/hv_pointpillars_secfpn_4x8_80e_pcdet_kitti-3d-3class.py 8 --no-validate
```
-* __OpenPCDet__: At commit [b32fbddb](https://github.com/open-mmlab/OpenPCDet/tree/b32fbddbe06183507bad433ed99b407cbc2175c2), run
+- __OpenPCDet__: At commit [b32fbddb](https://github.com/open-mmlab/OpenPCDet/tree/b32fbddbe06183507bad433ed99b407cbc2175c2), run
```bash
cd tools
@@ -258,13 +257,13 @@ We compare the training speed (samples/s) with other codebases if they implement
For SECOND, we mean the [SECONDv1.5](https://github.com/traveller59/second.pytorch/blob/master/second/configs/all.fhd.config) that was first implemented in [second.Pytorch](https://github.com/traveller59/second.pytorch). Det3D's implementation of SECOND uses its self-implemented Multi-Group Head, so its speed is not comparable with that of other codebases.
-* __MMDetection3D__: With release v0.1.0, run
+- __MMDetection3D__: With release v0.1.0, run
```bash
./tools/dist_train.sh configs/benchmark/hv_second_secfpn_4x8_80e_pcdet_kitti-3d-3class.py 8 --no-validate
```
-* __OpenPCDet__: At commit [b32fbddb](https://github.com/open-mmlab/OpenPCDet/tree/b32fbddbe06183507bad433ed99b407cbc2175c2), run
+- __OpenPCDet__: At commit [b32fbddb](https://github.com/open-mmlab/OpenPCDet/tree/b32fbddbe06183507bad433ed99b407cbc2175c2), run
```bash
cd tools
@@ -273,13 +272,13 @@ For SECOND, we mean the [SECONDv1.5](https://github.com/traveller59/second.pytor
### Part-A2
-* __MMDetection3D__: With release v0.1.0, run
+- __MMDetection3D__: With release v0.1.0, run
```bash
./tools/dist_train.sh configs/benchmark/hv_PartA2_secfpn_4x8_cyclic_80e_pcdet_kitti-3d-3class.py 8 --no-validate
```
-* __OpenPCDet__: At commit [b32fbddb](https://github.com/open-mmlab/OpenPCDet/tree/b32fbddbe06183507bad433ed99b407cbc2175c2), train the model by running
+- __OpenPCDet__: At commit [b32fbddb](https://github.com/open-mmlab/OpenPCDet/tree/b32fbddbe06183507bad433ed99b407cbc2175c2), train the model by running
```bash
cd tools
diff --git a/docs/en/changelog.md b/docs/en/changelog.md
index 3ad266fd62..7f575df6f1 100644
--- a/docs/en/changelog.md
+++ b/docs/en/changelog.md
@@ -50,29 +50,28 @@ A total of 11 developers contributed to this release.
- To fix the imprecise timestamp and optimize its saving method, we reformat the point cloud data during Waymo data conversion. The data conversion time is also optimized significantly by supporting parallel processing. Please re-generate KITTI format Waymo data if necessary. See more details in the [compatibility documentation](https://github.com/open-mmlab/mmdetection3d/blob/master/docs/en/compatibility.md).
- We update some of the model checkpoints after the refactor of coordinate systems. Please stay tuned for the release of the remaining model checkpoints.
-| | Fully Updated | Partially Updated | In Progress | No Influcence |
-|--------------------|:-------------:|:--------:| :-----------: | :-----------: |
-| SECOND | | ✓ | | |
-| PointPillars | | ✓ | | |
-| FreeAnchor | ✓ | | | |
-| VoteNet | ✓ | | | |
-| H3DNet | ✓ | | | |
-| 3DSSD | | ✓ | | |
-| Part-A2 | ✓ | | | |
-| MVXNet | ✓ | | | |
-| CenterPoint | | |✓ | |
-| SSN | ✓ | | | |
-| ImVoteNet | ✓ | | | |
-| FCOS3D | | | |✓ |
-| PointNet++ | | | |✓ |
-| Group-Free-3D | | | |✓ |
-| ImVoxelNet | ✓ | | | |
-| PAConv | | | |✓ |
-| DGCNN | | | |✓ |
-| SMOKE | | | |✓ |
-| PGD | | | |✓ |
-| MonoFlex | | | |✓ |
-
+| | Fully Updated | Partially Updated | In Progress | No Influence |
+| ------------- | :-----------: | :---------------: | :---------: | :-----------: |
+| SECOND | | ✓ | | |
+| PointPillars | | ✓ | | |
+| FreeAnchor | ✓ | | | |
+| VoteNet | ✓ | | | |
+| H3DNet | ✓ | | | |
+| 3DSSD | | ✓ | | |
+| Part-A2 | ✓ | | | |
+| MVXNet | ✓ | | | |
+| CenterPoint | | | ✓ | |
+| SSN | ✓ | | | |
+| ImVoteNet | ✓ | | | |
+| FCOS3D | | | | ✓ |
+| PointNet++ | | | | ✓ |
+| Group-Free-3D | | | | ✓ |
+| ImVoxelNet | ✓ | | | |
+| PAConv | | | | ✓ |
+| DGCNN | | | | ✓ |
+| SMOKE | | | | ✓ |
+| PGD | | | | ✓ |
+| MonoFlex | | | | ✓ |
#### Highlights
@@ -414,7 +413,6 @@ A total of 12 developers contributed to this release.
@yinchimaoliang, @gopi231091, @filaPro, @ZwwWayne, @ZCMax, @hjin2902, @wHao-Wu, @Wuziyi616, @xiliu8006, @THU17cyz, @DCNSW, @Tai-Wang
-
### v0.15.0 (1/7/2021)
#### Compatibility
@@ -449,7 +447,6 @@ In order to fix the problem that the priority of EvalHook is too low, all hook p
- Add documentation for vision-only 3D detection (#669)
- Refine docs for Quick Run and Useful Tools (#686)
-
#### Bug Fixes
- Fix the bug of [BackgroundPointsFilter](https://github.com/open-mmlab/mmdetection3d/blob/master/mmdet3d/datasets/pipelines/transforms_3d.py) using the bottom center of ground truth (#609)
@@ -458,10 +455,10 @@ In order to fix the problem that the priority of EvalHook is too low, all hook p
- Fix test commands in docs and make some refinements (#635)
- Fix wrong config paths in unit tests (#641)
-
### v0.14.0 (1/6/2021)
#### Highlights
+
- Support the point cloud segmentation method [PointNet++](https://arxiv.org/abs/1706.02413)
#### New Features
@@ -482,16 +479,17 @@ In order to fix the problem that the priority of EvalHook is too low, all hook p
- Remove a useless parameter `label_weight` from segmentation datasets including `Custom3DSegDataset`, `ScanNetSegDataset` and `S3DISSegDataset` (#607)
#### Bug Fixes
+
- Fix a corrupted lidar data file in Lyft dataset in [data_preparation](https://github.com/open-mmlab/mmdetection3d/tree/master/docs/data_preparation.md) (#546)
- Fix evaluation bugs in nuScenes and Lyft dataset (#549)
- Fix converting points between coordinates with specific transformation matrix in the [coord_3d_mode.py](https://github.com/open-mmlab/mmdetection3d/blob/master/mmdet3d/core/bbox/structures/coord_3d_mode.py) (#556)
- Support PointPillars models on Lyft dataset (#578)
- Fix the bug of demo with pre-trained VoteNet model on ScanNet (#600)
-
### v0.13.0 (1/5/2021)
#### Highlights
+
- Support a monocular 3D detection method [FCOS3D](https://arxiv.org/abs/2104.10956)
- Support ScanNet and S3DIS semantic segmentation dataset
- Enhancement of visualization tools for dataset browsing and demos, including support of visualization for multi-modality data and point cloud segmentation.
@@ -746,7 +744,7 @@ In order to fix the problem that the priority of EvalHook is too low, all hook p
- Support Batch Inference (#95, #103, #116): MMDetection3D v0.6.0 migrates to support batch inference based on MMDetection >= v2.4.0. This change influences all the test APIs in MMDetection3D and downstream codebases.
- Start to use collect environment function from MMCV (#113): MMDetection3D v0.6.0 migrates to use `collect_env` function in MMCV.
-`get_compiler_version` and `get_compiling_cuda_version` compiled in `mmdet3d.ops.utils` are removed. Please import these two functions from `mmcv.ops`.
+ `get_compiler_version` and `get_compiling_cuda_version` compiled in `mmdet3d.ops.utils` are removed. Please import these two functions from `mmcv.ops`.
#### New Features
diff --git a/docs/en/compatibility.md b/docs/en/compatibility.md
index fc468c53a4..6b11fc8cd0 100644
--- a/docs/en/compatibility.md
+++ b/docs/en/compatibility.md
@@ -10,46 +10,45 @@ In this version we did a major code refactoring that boosted the performance of
Meanwhile, we also fixed the imprecise timestamp saving issue in the waymo dataset conversion. This change introduces the following backward compatibility breaks:
- The point cloud .bin files of waymo dataset need to be regenerated.
-In the .bin files each point occupies 6 `float32` and the meaning of the last `float32` now changed from **imprecise timestamps** to **range frame offset**.
-The **range frame offset** for each point is calculated as`ri * h * w + row * w + col` if the point is from the **TOP** lidar or `-1` otherwise.
-The `h`, `w` denote the height and width of the TOP lidar's range frame.
-The `ri`, `row`, `col` denote the return index, the row and the column of the range frame where each point locates.
-Following tables show the difference across the change:
+ In the .bin files each point occupies 6 `float32` and the meaning of the last `float32` now changed from **imprecise timestamps** to **range frame offset**.
+ The **range frame offset** for each point is calculated as `ri * h * w + row * w + col` if the point is from the **TOP** lidar, or `-1` otherwise (a small encode/decode sketch follows this list).
+ The `h` and `w` denote the height and width of the TOP lidar's range frame.
+ The `ri`, `row` and `col` denote the return index, row and column of the range frame where each point is located.
+ The following tables show the difference across the change:
Before
| Element offset (float32) | 0 | 1 | 2 | 3 | 4 | 5 |
-|--------------------------|:---:|:---:|:---:|:---------:|:----------:|:-----------------------:|
+| ------------------------ | :-: | :-: | :-: | :-------: | :--------: | :---------------------: |
| Bytes offset | 0 | 4 | 8 | 12 | 16 | 20 |
| Meaning | x | y | z | intensity | elongation | **imprecise timestamp** |
After
| Element offset (float32) | 0 | 1 | 2 | 3 | 4 | 5 |
-|--------------------------|:---:|:---:|:---:|:---------:|:----------:|:----------------------:|
+| ------------------------ | :-: | :-: | :-: | :-------: | :--------: | :--------------------: |
| Bytes offset | 0 | 4 | 8 | 12 | 16 | 20 |
| Meaning | x | y | z | intensity | elongation | **range frame offset** |
- The objects' point cloud .bin files in the GT-database of waymo dataset need to be regenerated because we also dumped the range frame offset for each point into it.
-Following tables show the difference across the change:
+ The following tables show the difference across the change:
Before
| Element offset (float32) | 0 | 1 | 2 | 3 | 4 |
-|--------------------------|:---:|:---:|:---:|:---------:|:----------:|
+| ------------------------ | :-: | :-: | :-: | :-------: | :--------: |
| Bytes offset | 0 | 4 | 8 | 12 | 16 |
| Meaning | x | y | z | intensity | elongation |
After
| Element offset (float32) | 0 | 1 | 2 | 3 | 4 | 5 |
-|--------------------------|:---:|:---:|:---:|:---------:|:----------:|:----------------------:|
+| ------------------------ | :-: | :-: | :-: | :-------: | :--------: | :--------------------: |
| Bytes offset | 0 | 4 | 8 | 12 | 16 | 20 |
| Meaning | x | y | z | intensity | elongation | **range frame offset** |
- Any configuration that uses waymo dataset with GT Augmentation should change the `db_sampler.points_loader.load_dim` from `5` to `6`.
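As a small sanity check of the formula above, the encoding and its inverse can be written as follows; this helper is only an illustration and is not part of the official converter.

```python
# A minimal sketch of the range frame offset described above; h and w are the
# height and width of the TOP lidar's range frame. Points not from the TOP
# lidar store -1.
def encode_range_frame_offset(ri, row, col, h, w, from_top_lidar=True):
    return ri * h * w + row * w + col if from_top_lidar else -1

def decode_range_frame_offset(offset, h, w):
    if offset < 0:
        return None  # not a TOP lidar point
    ri, rem = divmod(int(offset), h * w)
    row, col = divmod(rem, w)
    return ri, row, col
```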
-
## v1.0.0rc0
### Coordinate system refactoring
@@ -63,6 +62,7 @@ In this version, we did a major code refactoring which improved the consistency
#### ***NOTICE!!***
Since definitions of box representation have changed, the annotation data of most datasets require updating:
+
- SUN RGB-D: Yaw angles in the annotation should be reversed.
- KITTI: For LiDAR boxes in GT databases, (x_size, y_size, z_size, yaw) out of (x, y, z, x_size, y_size, z_size) should be converted from the old LiDAR coordinate system to the new one. The training/validation data annotations should be left unchanged since they are under the Camera coordinate system, which is unmodified after the refactoring.
- Waymo: Same as KITTI.
@@ -88,7 +88,6 @@ Functions only involving points are generally unaffected except if they rely on
- Data augmentation utils in [data_augment_utils.py](https://github.com/open-mmlab/mmdetection3d/blob/v1.0.0rc0/mmdet3d/datasets/pipelines/data_augment_utils.py) now follow the rules of a right-handed system.
- We do not need the yaw hacking in KITTI anymore after refining [`get_direction_target`](https://github.com/open-mmlab/mmdetection3d/blob/v1.0.0rc0/mmdet3d/models/dense_heads/train_mixins.py). Interested users may refer to PR [#677](https://github.com/open-mmlab/mmdetection3d/pull/677) .
-
## 0.16.0
### Returned values of `QueryAndGroup` operation
@@ -168,4 +167,4 @@ Please refer to the SUNRGBD [README.md](https://github.com/open-mmlab/mmdetectio
### VoteNet and H3DNet model structure update
-In MMDetection 0.6.0, we updated the model structures of VoteNet and H3DNet, therefore model checkpoints generated by MMDetection < 0.6.0 should be first converted to a format compatible with the latest structures via [convert_votenet_checkpoints.py](https://github.com/open-mmlab/mmdetection3d/blob/master/tools/model_converters/convert_votenet_checkpoints.py) and [convert_h3dnet_checkpoints.py](https://github.com/open-mmlab/mmdetection3d/blob/master/tools/model_converters/convert_h3dnet_checkpoints.py) . For more details, please refer to the VoteNet [README.md](https://github.com/open-mmlab/mmdetection3d/tree/master/configs/votenet/README.md/) and H3DNet [README.md](https://github.com/open-mmlab/mmdetection3d/tree/master/configs/h3dnet/README.md/).
+In MMDetection3D 0.6.0, we updated the model structures of VoteNet and H3DNet; therefore, model checkpoints generated by MMDetection3D \< 0.6.0 should first be converted to a format compatible with the latest structures via [convert_votenet_checkpoints.py](https://github.com/open-mmlab/mmdetection3d/blob/master/tools/model_converters/convert_votenet_checkpoints.py) and [convert_h3dnet_checkpoints.py](https://github.com/open-mmlab/mmdetection3d/blob/master/tools/model_converters/convert_h3dnet_checkpoints.py). For more details, please refer to the VoteNet [README.md](https://github.com/open-mmlab/mmdetection3d/tree/master/configs/votenet/README.md/) and H3DNet [README.md](https://github.com/open-mmlab/mmdetection3d/tree/master/configs/h3dnet/README.md/).
diff --git a/docs/en/datasets/kitti_det.md b/docs/en/datasets/kitti_det.md
index 22cb9db107..c0eaac9b61 100644
--- a/docs/en/datasets/kitti_det.md
+++ b/docs/en/datasets/kitti_det.md
@@ -88,27 +88,27 @@ kitti
- `kitti_gt_database/xxxxx.bin`: point cloud data included in each 3D bounding box of the training dataset
- `kitti_infos_train.pkl`: training dataset infos, each frame info contains following details:
- - info['point_cloud']: {'num_features': 4, 'velodyne_path': velodyne_path}.
- - info['annos']: {
- - location: x,y,z are bottom center in referenced camera coordinate system (in meters), an Nx3 array
- - dimensions: height, width, length (in meters), an Nx3 array
- - rotation_y: rotation ry around Y-axis in camera coordinates [-pi..pi], an N array
- - name: ground truth name array, an N array
- - difficulty: kitti difficulty, Easy, Moderate, Hard
- - group_ids: used for multi-part object
- }
- - (optional) info['calib']: {
- - P0: camera0 projection matrix after rectification, an 3x4 array
- - P1: camera1 projection matrix after rectification, an 3x4 array
- - P2: camera2 projection matrix after rectification, an 3x4 array
- - P3: camera3 projection matrix after rectification, an 3x4 array
- - R0_rect: rectifying rotation matrix, an 4x4 array
- - Tr_velo_to_cam: transformation from Velodyne coordinate to camera coordinate, an 4x4 array
- - Tr_imu_to_velo: transformation from IMU coordinate to Velodyne coordinate, an 4x4 array
- }
- - (optional) info['image']:{'image_idx': idx, 'image_path': image_path, 'image_shape', image_shape}.
-
-**Note:** the info['annos'] is in the referenced camera coordinate system. More details please refer to [this](http://www.cvlibs.net/publications/Geiger2013IJRR.pdf)
+ - info\['point_cloud'\]: {'num_features': 4, 'velodyne_path': velodyne_path}.
+ - info\['annos'\]: {
+ - location: x,y,z are bottom center in referenced camera coordinate system (in meters), an Nx3 array
+ - dimensions: height, width, length (in meters), an Nx3 array
+ - rotation_y: rotation ry around Y-axis in camera coordinates \[-pi..pi\], an N array
+ - name: ground truth name array, an N array
+ - difficulty: kitti difficulty, Easy, Moderate, Hard
+ - group_ids: used for multi-part object
+ }
+ - (optional) info\['calib'\]: {
+ - P0: camera0 projection matrix after rectification, a 3x4 array
+ - P1: camera1 projection matrix after rectification, a 3x4 array
+ - P2: camera2 projection matrix after rectification, a 3x4 array
+ - P3: camera3 projection matrix after rectification, a 3x4 array
+ - R0_rect: rectifying rotation matrix, a 4x4 array
+ - Tr_velo_to_cam: transformation from Velodyne coordinate to camera coordinate, a 4x4 array
+ - Tr_imu_to_velo: transformation from IMU coordinate to Velodyne coordinate, a 4x4 array
+ }
+ - (optional) info\['image'\]: {'image_idx': idx, 'image_path': image_path, 'image_shape': image_shape}.
+
+**Note:** info\['annos'\] is in the referenced camera coordinate system. For more details, please refer to [this](http://www.cvlibs.net/publications/Geiger2013IJRR.pdf)
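As a quick way to verify these fields after data preparation, the hedged sketch below inspects one entry; it is not a script from the repo, and the data root `data/kitti` is an assumption that should be adjusted to your setup.

```python
# A minimal sketch: inspect one entry of kitti_infos_train.pkl using the keys
# listed above. The info file is a plain pickle, so pickle.load is enough.
import pickle

with open('data/kitti/kitti_infos_train.pkl', 'rb') as f:
    infos = pickle.load(f)

first = infos[0]
print(first['point_cloud']['velodyne_path'])  # relative path of the lidar .bin file
print(first['annos']['name'])                 # ground truth class names, an N array
print(first['annos']['location'].shape)       # (N, 3) bottom centers in camera coords
```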
The core functions to get kitti_infos_xxx.pkl and kitti_infos_xxx_mono3d.coco.json are [get_kitti_image_info](https://github.com/open-mmlab/mmdetection3d/blob/7873c8f62b99314f35079f369d1dab8d63f8a3ce/tools/data_converter/kitti_data_utils.py#L140) and [get_2d_boxes](https://github.com/open-mmlab/mmdetection3d/blob/7873c8f62b99314f35079f369d1dab8d63f8a3ce/tools/data_converter/kitti_converter.py#L378). Please refer to [kitti_converter.py](https://github.com/open-mmlab/mmdetection3d/blob/7873c8f62b99314f35079f369d1dab8d63f8a3ce/tools/data_converter/kitti_converter.py) for more details.
@@ -150,9 +150,9 @@ train_pipeline = [
```
- Data augmentation:
- - `ObjectNoise`: apply noise to each GT objects in the scene.
- - `RandomFlip3D`: randomly flip input point cloud horizontally or vertically.
- - `GlobalRotScaleTrans`: rotate input point cloud.
+ - `ObjectNoise`: apply noise to each GT object in the scene.
+ - `RandomFlip3D`: randomly flip input point cloud horizontally or vertically.
+ - `GlobalRotScaleTrans`: rotate input point cloud.
## Evaluation
diff --git a/docs/en/datasets/lyft_det.md b/docs/en/datasets/lyft_det.md
index 686f346e88..3bc1927847 100644
--- a/docs/en/datasets/lyft_det.md
+++ b/docs/en/datasets/lyft_det.md
@@ -90,21 +90,21 @@ Next, we will elaborate on the difference compared to nuScenes in terms of the d
- without `lyft_database/xxxxx.bin`: This folder and `.bin` files are not extracted on the Lyft dataset due to the negligible effect of ground-truth sampling in the experiments.
- `lyft_infos_train.pkl`: training dataset infos, each frame info has two keys: `metadata` and `infos`.
-`metadata` contains the basic information for the dataset itself, such as `{'version': 'v1.01-train'}`, while `infos` contains the detailed information the same as nuScenes except for the following details:
- - info['sweeps']: Sweeps information.
- - info['sweeps'][i]['type']: The sweep data type, e.g., `'lidar'`.
- Lyft has different LiDAR settings for some samples, but we always take only the points collected by the top LiDAR for the consistency of data distribution.
- - info['gt_names']: There are 9 categories on the Lyft dataset, and the imbalance of annotations for different categories is even more significant than nuScenes.
- - without info['gt_velocity']: There is no velocity measurement on Lyft.
- - info['num_lidar_pts']: Set to -1 by default.
- - info['num_radar_pts']: Set to 0 by default.
- - without info['valid_flag']: This flag does recorded due to invalid `num_lidar_pts` and `num_radar_pts`.
+ `metadata` contains the basic information for the dataset itself, such as `{'version': 'v1.01-train'}`, while `infos` contains detailed information that is the same as for nuScenes except for the following details:
+ - info\['sweeps'\]: Sweeps information.
+ - info\['sweeps'\]\[i\]\['type'\]: The sweep data type, e.g., `'lidar'`.
+ Lyft has different LiDAR settings for some samples, but we always take only the points collected by the top LiDAR for the consistency of data distribution.
+ - info\['gt_names'\]: There are 9 categories on the Lyft dataset, and the imbalance of annotations for different categories is even more significant than nuScenes.
+ - without info\['gt_velocity'\]: There is no velocity measurement on Lyft.
+ - info\['num_lidar_pts'\]: Set to -1 by default.
+ - info\['num_radar_pts'\]: Set to 0 by default.
+ - without info\['valid_flag'\]: This flag is not recorded due to the invalid `num_lidar_pts` and `num_radar_pts`.
- `nuscenes_infos_train_mono3d.coco.json`: training dataset coco-style info. This file only contains 2D information, without the information required by 3D detection, such as camera intrinsics.
- - info['images']: A list containing all the image info.
- - only containing `'file_name'`, `'id'`, `'width'`, `'height'`.
- - info['annotations']: A list containing all the annotation info.
- - only containing `'file_name'`, `'image_id'`, `'area'`, `'category_name'`, `'category_id'`, `'bbox'`, `'is_crowd'`, `'segmentation'`, `'id'`, where `'is_crowd'`, `'segmentation'` are set to `0` and `[]` by default.
- There is no attribute annotation on Lyft.
+ - info\['images'\]: A list containing all the image info.
+ - only containing `'file_name'`, `'id'`, `'width'`, `'height'`.
+ - info\['annotations'\]: A list containing all the annotation info.
+ - only containing `'file_name'`, `'image_id'`, `'area'`, `'category_name'`, `'category_id'`, `'bbox'`, `'is_crowd'`, `'segmentation'`, `'id'`, where `'is_crowd'`, `'segmentation'` are set to `0` and `[]` by default.
+ There is no attribute annotation on Lyft.
Here we only explain the data recorded in the training info files. The same applies to the testing set.
diff --git a/docs/en/datasets/nuscenes_det.md b/docs/en/datasets/nuscenes_det.md
index 76995c7edd..60e1935124 100644
--- a/docs/en/datasets/nuscenes_det.md
+++ b/docs/en/datasets/nuscenes_det.md
@@ -62,63 +62,63 @@ Next, we will elaborate on the details recorded in these info files.
- `nuscenes_database/xxxxx.bin`: point cloud data included in each 3D bounding box of the training dataset
- `nuscenes_infos_train.pkl`: training dataset info, each frame info has two keys: `metadata` and `infos`.
-`metadata` contains the basic information for the dataset itself, such as `{'version': 'v1.0-trainval'}`, while `infos` contains the detailed information as follows:
- - info['lidar_path']: The file path of the lidar point cloud data.
- - info['token']: Sample data token.
- - info['sweeps']: Sweeps information (`sweeps` in the nuScenes refer to the intermediate frames without annotations, while `samples` refer to those key frames with annotations).
- - info['sweeps'][i]['data_path']: The data path of i-th sweep.
- - info['sweeps'][i]['type']: The sweep data type, e.g., `'lidar'`.
- - info['sweeps'][i]['sample_data_token']: The sweep sample data token.
- - info['sweeps'][i]['sensor2ego_translation']: The translation from the current sensor (for collecting the sweep data) to ego vehicle. (1x3 list)
- - info['sweeps'][i]['sensor2ego_rotation']: The rotation from the current sensor (for collecting the sweep data) to ego vehicle. (1x4 list in the quaternion format)
- - info['sweeps'][i]['ego2global_translation']: The translation from the ego vehicle to global coordinates. (1x3 list)
- - info['sweeps'][i]['ego2global_rotation']: The rotation from the ego vehicle to global coordinates. (1x4 list in the quaternion format)
- - info['sweeps'][i]['timestamp']: Timestamp of the sweep data.
- - info['sweeps'][i]['sensor2lidar_translation']: The translation from the current sensor (for collecting the sweep data) to lidar. (1x3 list)
- - info['sweeps'][i]['sensor2lidar_rotation']: The rotation from the current sensor (for collecting the sweep data) to lidar. (1x4 list in the quaternion format)
- - info['cams']: Cameras calibration information. It contains six keys corresponding to each camera: `'CAM_FRONT'`, `'CAM_FRONT_RIGHT'`, `'CAM_FRONT_LEFT'`, `'CAM_BACK'`, `'CAM_BACK_LEFT'`, `'CAM_BACK_RIGHT'`.
+ `metadata` contains the basic information for the dataset itself, such as `{'version': 'v1.0-trainval'}`, while `infos` contains the detailed information as follows:
+ - info\['lidar_path'\]: The file path of the lidar point cloud data.
+ - info\['token'\]: Sample data token.
+ - info\['sweeps'\]: Sweeps information (`sweeps` in the nuScenes refer to the intermediate frames without annotations, while `samples` refer to those key frames with annotations).
+ - info\['sweeps'\]\[i\]\['data_path'\]: The data path of i-th sweep.
+ - info\['sweeps'\]\[i\]\['type'\]: The sweep data type, e.g., `'lidar'`.
+ - info\['sweeps'\]\[i\]\['sample_data_token'\]: The sweep sample data token.
+ - info\['sweeps'\]\[i\]\['sensor2ego_translation'\]: The translation from the current sensor (for collecting the sweep data) to ego vehicle. (1x3 list)
+ - info\['sweeps'\]\[i\]\['sensor2ego_rotation'\]: The rotation from the current sensor (for collecting the sweep data) to ego vehicle. (1x4 list in the quaternion format)
+ - info\['sweeps'\]\[i\]\['ego2global_translation'\]: The translation from the ego vehicle to global coordinates. (1x3 list)
+ - info\['sweeps'\]\[i\]\['ego2global_rotation'\]: The rotation from the ego vehicle to global coordinates. (1x4 list in the quaternion format)
+ - info\['sweeps'\]\[i\]\['timestamp'\]: Timestamp of the sweep data.
+ - info\['sweeps'\]\[i\]\['sensor2lidar_translation'\]: The translation from the current sensor (for collecting the sweep data) to lidar. (1x3 list)
+ - info\['sweeps'\]\[i\]\['sensor2lidar_rotation'\]: The rotation from the current sensor (for collecting the sweep data) to lidar. (1x4 list in the quaternion format)
+ - info\['cams'\]: Cameras calibration information. It contains six keys corresponding to each camera: `'CAM_FRONT'`, `'CAM_FRONT_RIGHT'`, `'CAM_FRONT_LEFT'`, `'CAM_BACK'`, `'CAM_BACK_LEFT'`, `'CAM_BACK_RIGHT'`.
Each dictionary contains detailed information following the above way for each sweep data (has the same keys for each information as above). In addition, each camera has a key `'cam_intrinsic'` for recording the intrinsic parameters when projecting 3D points to each image plane.
- - info['lidar2ego_translation']: The translation from lidar to ego vehicle. (1x3 list)
- - info['lidar2ego_rotation']: The rotation from lidar to ego vehicle. (1x4 list in the quaternion format)
- - info['ego2global_translation']: The translation from the ego vehicle to global coordinates. (1x3 list)
- - info['ego2global_rotation']: The rotation from the ego vehicle to global coordinates. (1x4 list in the quaternion format)
- - info['timestamp']: Timestamp of the sample data.
- - info['gt_boxes']: 7-DoF annotations of 3D bounding boxes, an Nx7 array.
- - info['gt_names']: Categories of 3D bounding boxes, an 1xN array.
- - info['gt_velocity']: Velocities of 3D bounding boxes (no vertical measurements due to inaccuracy), an Nx2 array.
- - info['num_lidar_pts']: Number of lidar points included in each 3D bounding box.
- - info['num_radar_pts']: Number of radar points included in each 3D bounding box.
- - info['valid_flag']: Whether each bounding box is valid. In general, we only take the 3D boxes that include at least one lidar or radar point as valid boxes.
+ - info\['lidar2ego_translation'\]: The translation from lidar to ego vehicle. (1x3 list)
+ - info\['lidar2ego_rotation'\]: The rotation from lidar to ego vehicle. (1x4 list in the quaternion format)
+ - info\['ego2global_translation'\]: The translation from the ego vehicle to global coordinates. (1x3 list)
+ - info\['ego2global_rotation'\]: The rotation from the ego vehicle to global coordinates. (1x4 list in the quaternion format)
+ - info\['timestamp'\]: Timestamp of the sample data.
+ - info\['gt_boxes'\]: 7-DoF annotations of 3D bounding boxes, an Nx7 array.
+ - info\['gt_names'\]: Categories of 3D bounding boxes, a 1xN array.
+ - info\['gt_velocity'\]: Velocities of 3D bounding boxes (no vertical measurements due to inaccuracy), an Nx2 array.
+ - info\['num_lidar_pts'\]: Number of lidar points included in each 3D bounding box.
+ - info\['num_radar_pts'\]: Number of radar points included in each 3D bounding box.
+ - info\['valid_flag'\]: Whether each bounding box is valid. In general, we only take the 3D boxes that include at least one lidar or radar point as valid boxes.
- `nuscenes_infos_train_mono3d.coco.json`: training dataset coco-style info. This file organizes image-based data into three categories (keys): `'categories'`, `'images'`, `'annotations'`.
- - info['categories']: A list containing all the category names. Each element follows the dictionary format and consists of two keys: `'id'` and `'name'`.
- - info['images']: A list containing all the image info.
- - info['images'][i]['file_name']: The file name of the i-th image.
- - info['images'][i]['id']: Sample data token of the i-th image.
- - info['images'][i]['token']: Sample token corresponding to this frame.
- - info['images'][i]['cam2ego_rotation']: The rotation from the camera to ego vehicle. (1x4 list in the quaternion format)
- - info['images'][i]['cam2ego_translation']: The translation from the camera to ego vehicle. (1x3 list)
- - info['images'][i]['ego2global_rotation'']: The rotation from the ego vehicle to global coordinates. (1x4 list in the quaternion format)
- - info['images'][i]['ego2global_translation']: The translation from the ego vehicle to global coordinates. (1x3 list)
- - info['images'][i]['cam_intrinsic']: Camera intrinsic matrix. (3x3 list)
- - info['images'][i]['width']: Image width, 1600 by default in nuScenes.
- - info['images'][i]['height']: Image height, 900 by default in nuScenes.
- - info['annotations']: A list containing all the annotation info.
- - info['annotations'][i]['file_name']: The file name of the corresponding image.
- - info['annotations'][i]['image_id']: The image id (token) of the corresponding image.
- - info['annotations'][i]['area']: Area of the 2D bounding box.
- - info['annotations'][i]['category_name']: Category name.
- - info['annotations'][i]['category_id']: Category id.
- - info['annotations'][i]['bbox']: 2D bounding box annotation (exterior rectangle of the projected 3D box), 1x4 list following [x1, y1, x2-x1, y2-y1].
- x1/y1 are minimum coordinates along horizontal/vertical direction of the image.
- - info['annotations'][i]['iscrowd']: Whether the region is crowded. Defaults to 0.
- - info['annotations'][i]['bbox_cam3d']: 3D bounding box (gravity) center location (3), size (3), (global) yaw angle (1), 1x7 list.
- - info['annotations'][i]['velo_cam3d']: Velocities of 3D bounding boxes (no vertical measurements due to inaccuracy), an Nx2 array.
- - info['annotations'][i]['center2d']: Projected 3D-center containing 2.5D information: projected center location on the image (2) and depth (1), 1x3 list.
- - info['annotations'][i]['attribute_name']: Attribute name.
- - info['annotations'][i]['attribute_id']: Attribute id.
- We maintain a default attribute collection and mapping for attribute classification.
- Please refer to [here](https://github.com/open-mmlab/mmdetection3d/blob/master/mmdet3d/datasets/nuscenes_mono_dataset.py#L53) for more details.
- - info['annotations'][i]['id']: Annotation id. Defaults to `i`.
+ - info\['categories'\]: A list containing all the category names. Each element follows the dictionary format and consists of two keys: `'id'` and `'name'`.
+ - info\['images'\]: A list containing all the image info.
+ - info\['images'\]\[i\]\['file_name'\]: The file name of the i-th image.
+ - info\['images'\]\[i\]\['id'\]: Sample data token of the i-th image.
+ - info\['images'\]\[i\]\['token'\]: Sample token corresponding to this frame.
+ - info\['images'\]\[i\]\['cam2ego_rotation'\]: The rotation from the camera to ego vehicle. (1x4 list in the quaternion format)
+ - info\['images'\]\[i\]\['cam2ego_translation'\]: The translation from the camera to ego vehicle. (1x3 list)
+ - info\['images'\]\[i\]\['ego2global_rotation'\]: The rotation from the ego vehicle to global coordinates. (1x4 list in the quaternion format)
+ - info\['images'\]\[i\]\['ego2global_translation'\]: The translation from the ego vehicle to global coordinates. (1x3 list)
+ - info\['images'\]\[i\]\['cam_intrinsic'\]: Camera intrinsic matrix. (3x3 list)
+ - info\['images'\]\[i\]\['width'\]: Image width, 1600 by default in nuScenes.
+ - info\['images'\]\[i\]\['height'\]: Image height, 900 by default in nuScenes.
+ - info\['annotations'\]: A list containing all the annotation info.
+ - info\['annotations'\]\[i\]\['file_name'\]: The file name of the corresponding image.
+ - info\['annotations'\]\[i\]\['image_id'\]: The image id (token) of the corresponding image.
+ - info\['annotations'\]\[i\]\['area'\]: Area of the 2D bounding box.
+ - info\['annotations'\]\[i\]\['category_name'\]: Category name.
+ - info\['annotations'\]\[i\]\['category_id'\]: Category id.
+ - info\['annotations'\]\[i\]\['bbox'\]: 2D bounding box annotation (exterior rectangle of the projected 3D box), 1x4 list following \[x1, y1, x2-x1, y2-y1\].
+ x1/y1 are minimum coordinates along horizontal/vertical direction of the image.
+ - info\['annotations'\]\[i\]\['iscrowd'\]: Whether the region is crowded. Defaults to 0.
+ - info\['annotations'\]\[i\]\['bbox_cam3d'\]: 3D bounding box (gravity) center location (3), size (3), (global) yaw angle (1), 1x7 list.
+ - info\['annotations'\]\[i\]\['velo_cam3d'\]: Velocities of 3D bounding boxes (no vertical measurements due to inaccuracy), an Nx2 array.
+ - info\['annotations'\]\[i\]\['center2d'\]: Projected 3D-center containing 2.5D information: projected center location on the image (2) and depth (1), 1x3 list.
+ - info\['annotations'\]\[i\]\['attribute_name'\]: Attribute name.
+ - info\['annotations'\]\[i\]\['attribute_id'\]: Attribute id.
+ We maintain a default attribute collection and mapping for attribute classification.
+ Please refer to [here](https://github.com/open-mmlab/mmdetection3d/blob/master/mmdet3d/datasets/nuscenes_mono_dataset.py#L53) for more details.
+ - info\['annotations'\]\[i\]\['id'\]: Annotation id. Defaults to `i`.
Here we only explain the data recorded in the training info files. The same applies to the validation and testing sets.
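A hedged sketch of poking at this structure is shown below; the data root `data/nuscenes` is an assumption and should be adjusted to your setup.

```python
# A minimal sketch: the info file is a dict with 'metadata' and 'infos',
# matching the layout described above.
import pickle

with open('data/nuscenes/nuscenes_infos_train.pkl', 'rb') as f:
    data = pickle.load(f)

print(data['metadata'])            # e.g. {'version': 'v1.0-trainval'}
sample = data['infos'][0]
print(sample['lidar_path'])        # file path of the lidar point cloud
print(sample['gt_boxes'].shape)    # (N, 7) 7-DoF box annotations
print(sorted(sample['cams']))      # the six camera names, e.g. 'CAM_FRONT'
```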
@@ -194,10 +194,11 @@ train_pipeline = [
```
It follows the general pipeline of 2D detection but differs in some details:
+
- It uses monocular pipelines to load images, which includes additional required information like camera intrinsics.
- It needs to load 3D annotations.
- Some data augmentation techniques need to be adjusted, such as `RandomFlip3D`.
-Currently we do not support more augmentation methods, because how to transfer and apply other techniques is still under explored.
+ Currently we do not support more augmentation methods, because how to transfer and apply other techniques is still being explored.
## Evaluation
diff --git a/docs/en/datasets/s3dis_sem_seg.md b/docs/en/datasets/s3dis_sem_seg.md
index 5ff4d7e555..d11162f615 100644
--- a/docs/en/datasets/s3dis_sem_seg.md
+++ b/docs/en/datasets/s3dis_sem_seg.md
@@ -36,10 +36,12 @@ mmdetection3d
Under folder `Stanford3dDataset_v1.2_Aligned_Version`, the rooms are split into 6 areas. We use 5 areas for training and 1 for evaluation (typically `Area_5`). Under the directory of each area, there are folders in which raw point cloud data and relevant annotations are saved. For instance, under folder `Area_1/office_1` the files are as below:
- `office_1.txt`: A txt file storing coordinates and colors of each point in the raw point cloud data.
+
- `Annotations/`: This folder contains txt files for different object instances. Each txt file represents one instance, e.g.
- - `chair_1.txt`: A txt file storing raw point cloud data of one chair in this room.
- If we concat all the txt files under `Annotations/`, we will get the same point cloud as denoted by `office_1.txt`.
+ - `chair_1.txt`: A txt file storing raw point cloud data of one chair in this room.
+
+ If we concatenate all the txt files under `Annotations/`, we will get the same point cloud as denoted by `office_1.txt`.
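A hedged sanity-check sketch of this statement follows; the paths assume the working directory is `Stanford3dDataset_v1.2_Aligned_Version`, and the snippet is not part of `collect_indoor3d_data.py`.

```python
# A minimal sketch: concatenating the per-instance files under Annotations/
# should yield the same set of points as office_1.txt (row order may differ).
import glob
import numpy as np

room = np.loadtxt('Area_1/office_1/office_1.txt')
parts = [np.loadtxt(f) for f in glob.glob('Area_1/office_1/Annotations/*.txt')]
merged = np.concatenate(parts, axis=0)
print(room.shape, merged.shape)  # expect the same number of rows
```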
Export S3DIS data by running `python collect_indoor3d_data.py`. The main steps include:
@@ -138,16 +140,16 @@ s3dis
```
- `points/xxxxx.bin`: The exported point cloud data.
-- `instance_mask/xxxxx.bin`: The instance label for each point, value range: [0, ${NUM_INSTANCES}], 0: unannotated.
-- `semantic_mask/xxxxx.bin`: The semantic label for each point, value range: [0, 12].
+- `instance_mask/xxxxx.bin`: The instance label for each point, value range: \[0, ${NUM_INSTANCES}\], 0: unannotated.
+- `semantic_mask/xxxxx.bin`: The semantic label for each point, value range: \[0, 12\].
- `s3dis_infos_Area_1.pkl`: Area 1 data infos, the detailed info of each room is as follows:
- - info['point_cloud']: {'num_features': 6, 'lidar_idx': sample_idx}.
- - info['pts_path']: The path of `points/xxxxx.bin`.
- - info['pts_instance_mask_path']: The path of `instance_mask/xxxxx.bin`.
- - info['pts_semantic_mask_path']: The path of `semantic_mask/xxxxx.bin`.
+ - info\['point_cloud'\]: {'num_features': 6, 'lidar_idx': sample_idx}.
+ - info\['pts_path'\]: The path of `points/xxxxx.bin`.
+ - info\['pts_instance_mask_path'\]: The path of `instance_mask/xxxxx.bin`.
+ - info\['pts_semantic_mask_path'\]: The path of `semantic_mask/xxxxx.bin`.
- `seg_info`: The generated infos to support semantic segmentation model training.
- - `Area_1_label_weight.npy`: Weighting factor for each semantic class. Since the number of points in different classes varies greatly, it's a common practice to use label re-weighting to get a better performance.
- - `Area_1_resampled_scene_idxs.npy`: Re-sampling index for each scene. Different rooms will be sampled multiple times according to their number of points to balance training data.
+ - `Area_1_label_weight.npy`: Weighting factor for each semantic class. Since the number of points in different classes varies greatly, it's a common practice to use label re-weighting to get a better performance.
+ - `Area_1_resampled_scene_idxs.npy`: Re-sampling index for each scene. Different rooms will be sampled multiple times according to their number of points to balance training data.
## Training pipeline
@@ -200,13 +202,13 @@ train_pipeline = [
]
```
-- `PointSegClassMapping`: Only the valid category ids will be mapped to class label ids like [0, 13) during training. Other class ids will be converted to `ignore_index` which equals to `13`.
+- `PointSegClassMapping`: Only the valid category ids will be mapped to class label ids in \[0, 13) during training. Other class ids will be converted to `ignore_index`, which equals `13` (a small sketch of this mapping follows the list below).
- `IndoorPatchPointSample`: Crop a patch containing a fixed number of points from input point cloud. `block_size` indicates the size of the cropped block, typically `1.0` for S3DIS.
- `NormalizePointsColor`: Normalize the RGB color values of input point cloud by dividing `255`.
- Data augmentation:
- - `GlobalRotScaleTrans`: randomly rotate and scale input point cloud.
- - `RandomJitterPoints`: randomly jitter point cloud by adding different noise vector to each point.
- - `RandomDropPointsColor`: set the colors of point cloud to all zeros by a probability `drop_ratio`.
+  - `GlobalRotScaleTrans`: randomly rotate and scale the input point cloud.
+  - `RandomJitterPoints`: randomly jitter the point cloud by adding a different noise vector to each point.
+  - `RandomDropPointsColor`: set the colors of the point cloud to all zeros with a probability `drop_ratio`.
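+
+For intuition, the two color-related transforms above amount to something like the following NumPy sketch (a simplified illustration assuming an `(N, 6)` point layout, not the actual transform implementations):
+
+```python
+# Simplified illustration of NormalizePointsColor and RandomDropPointsColor.
+import numpy as np
+
+
+def normalize_points_color(points):
+    # points: (N, 6) array of x, y, z, r, g, b; rescale colors from [0, 255] to [0, 1].
+    points = points.copy()
+    points[:, 3:6] /= 255.0
+    return points
+
+
+def random_drop_points_color(points, drop_ratio=0.2):
+    # With probability `drop_ratio`, set all colors of the point cloud to zero.
+    points = points.copy()
+    if np.random.rand() < drop_ratio:
+        points[:, 3:6] = 0.0
+    return points
+
+
+pts = np.random.rand(4096, 6).astype(np.float32)
+pts[:, 3:6] *= 255  # fake RGB values
+pts = random_drop_points_color(normalize_points_color(pts))
+```
+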
## Metrics
diff --git a/docs/en/datasets/scannet_det.md b/docs/en/datasets/scannet_det.md
index 6e35dfd9d4..540c8ca7f3 100644
--- a/docs/en/datasets/scannet_det.md
+++ b/docs/en/datasets/scannet_det.md
@@ -135,7 +135,7 @@ By exporting ScanNet RGB data, for each scene we load a set of RGB images with c
python extract_posed_images.py
```
-Each of 1201 train, 312 validation and 100 test scenes contains a single `.sens` file. For instance, for scene `0001_01` we have `data/scannet/scans/scene0001_01/0001_01.sens`. For this scene all images and poses are extracted to `data/scannet/posed_images/scene0001_01`. Specifically, there will be 300 image files xxxxx.jpg, 300 camera pose files xxxxx.txt and a single `intrinsic.txt` file. Typically, single scene contains several thousand images. By default, we extract only 300 of them with resulting space occupation of <100 Gb. To extract more images, use `--max-images-per-scene` parameter.
+Each of 1201 train, 312 validation and 100 test scenes contains a single `.sens` file. For instance, for scene `0001_01` we have `data/scannet/scans/scene0001_01/0001_01.sens`. For this scene all images and poses are extracted to `data/scannet/posed_images/scene0001_01`. Specifically, there will be 300 image files xxxxx.jpg, 300 camera pose files xxxxx.txt and a single `intrinsic.txt` file. Typically, a single scene contains several thousand images, and by default we extract only 300 of them, occupying \<100 GB of disk space. To extract more images, use the `--max-images-per-scene` parameter.
### Create dataset
@@ -222,29 +222,28 @@ scannet
```
- `points/xxxxx.bin`: The `axis-unaligned` point cloud data after downsample. Since ScanNet 3D detection task takes axis-aligned point clouds as input, while ScanNet 3D semantic segmentation task takes unaligned points, we choose to store unaligned points and their axis-align transform matrix. Note: the points would be axis-aligned in pre-processing pipeline [`GlobalAlignment`](https://github.com/open-mmlab/mmdetection3d/blob/9f0b01caf6aefed861ef4c3eb197c09362d26b32/mmdet3d/datasets/pipelines/transforms_3d.py#L423) of 3D detection task.
-- `instance_mask/xxxxx.bin`: The instance label for each point, value range: [0, NUM_INSTANCES], 0: unannotated.
-- `semantic_mask/xxxxx.bin`: The semantic label for each point, value range: [1, 40], i.e. `nyu40id` standard. Note: the `nyu40id` ID will be mapped to train ID in train pipeline `PointSegClassMapping`.
+- `instance_mask/xxxxx.bin`: The instance label for each point, value range: \[0, NUM_INSTANCES\], 0: unannotated.
+- `semantic_mask/xxxxx.bin`: The semantic label for each point, value range: \[1, 40\], i.e. `nyu40id` standard. Note: the `nyu40id` ID will be mapped to train ID in train pipeline `PointSegClassMapping`.
- `posed_images/scenexxxx_xx`: The set of `.jpg` images with `.txt` 4x4 poses and the single `.txt` file with camera intrinsic matrix.
- `scannet_infos_train.pkl`: The train data infos, the detailed info of each scan is as follows:
- - info['point_cloud']: {'num_features': 6, 'lidar_idx': sample_idx}.
- - info['pts_path']: The path of `points/xxxxx.bin`.
- - info['pts_instance_mask_path']: The path of `instance_mask/xxxxx.bin`.
- - info['pts_semantic_mask_path']: The path of `semantic_mask/xxxxx.bin`.
- - info['annos']: The annotations of each scan.
- - annotations['gt_num']: The number of ground truths.
- - annotations['name']: The semantic name of all ground truths, e.g. `chair`.
- - annotations['location']: The gravity center of the axis-aligned 3D bounding boxes in depth coordinate system. Shape: [K, 3], K is the number of ground truths.
- - annotations['dimensions']: The dimensions of the axis-aligned 3D bounding boxes in depth coordinate system, i.e. (x_size, y_size, z_size), shape: [K, 3].
- - annotations['gt_boxes_upright_depth']: The axis-aligned 3D bounding boxes in depth coordinate system, each bounding box is (x, y, z, x_size, y_size, z_size), shape: [K, 6].
- - annotations['unaligned_location']: The gravity center of the axis-unaligned 3D bounding boxes in depth coordinate system.
- - annotations['unaligned_dimensions']: The dimensions of the axis-unaligned 3D bounding boxes in depth coordinate system.
- - annotations['unaligned_gt_boxes_upright_depth']: The axis-unaligned 3D bounding boxes in depth coordinate system.
- - annotations['index']: The index of all ground truths, i.e. [0, K).
- - annotations['class']: The train class ID of the bounding boxes, value range: [0, 18), shape: [K, ].
+ - info\['point_cloud'\]: {'num_features': 6, 'lidar_idx': sample_idx}.
+ - info\['pts_path'\]: The path of `points/xxxxx.bin`.
+ - info\['pts_instance_mask_path'\]: The path of `instance_mask/xxxxx.bin`.
+ - info\['pts_semantic_mask_path'\]: The path of `semantic_mask/xxxxx.bin`.
+ - info\['annos'\]: The annotations of each scan.
+ - annotations\['gt_num'\]: The number of ground truths.
+ - annotations\['name'\]: The semantic name of all ground truths, e.g. `chair`.
+ - annotations\['location'\]: The gravity center of the axis-aligned 3D bounding boxes in depth coordinate system. Shape: \[K, 3\], K is the number of ground truths.
+ - annotations\['dimensions'\]: The dimensions of the axis-aligned 3D bounding boxes in depth coordinate system, i.e. (x_size, y_size, z_size), shape: \[K, 3\].
+ - annotations\['gt_boxes_upright_depth'\]: The axis-aligned 3D bounding boxes in depth coordinate system, each bounding box is (x, y, z, x_size, y_size, z_size), shape: \[K, 6\].
+ - annotations\['unaligned_location'\]: The gravity center of the axis-unaligned 3D bounding boxes in depth coordinate system.
+ - annotations\['unaligned_dimensions'\]: The dimensions of the axis-unaligned 3D bounding boxes in depth coordinate system.
+ - annotations\['unaligned_gt_boxes_upright_depth'\]: The axis-unaligned 3D bounding boxes in depth coordinate system.
+ - annotations\['index'\]: The index of all ground truths, i.e. \[0, K).
+ - annotations\['class'\]: The train class ID of the bounding boxes, value range: \[0, 18), shape: \[K, \].
- `scannet_infos_val.pkl`: The val data infos, which shares the same format as `scannet_infos_train.pkl`.
- `scannet_infos_test.pkl`: The test data infos, which almost shares the same format as `scannet_infos_train.pkl` except for the lack of annotation.
-
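+
+For reference, the info files described above are plain pickled lists and can be inspected directly; a rough sketch (assuming the first scan has annotations) is:
+
+```python
+# Sketch: peek at the annotations stored in scannet_infos_train.pkl.
+import pickle
+
+with open('scannet_infos_train.pkl', 'rb') as f:
+    infos = pickle.load(f)
+
+info = infos[0]
+annos = info['annos']  # assumes this scan has at least one ground truth
+print(info['pts_path'])  # e.g. points/xxxxx.bin
+print(annos['gt_num'], annos['name'])  # number and semantic names of ground truths
+boxes = annos['gt_boxes_upright_depth']  # (K, 6): x, y, z, x_size, y_size, z_size, axis-aligned
+labels = annos['class']  # (K,): train class IDs in [0, 18)
+print(boxes.shape, labels.shape)
+```
+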
## Training pipeline
A typical training pipeline of ScanNet for 3D detection is as follows.
@@ -291,11 +290,11 @@ train_pipeline = [
```
- `GlobalAlignment`: The input point cloud is axis-aligned using the axis-align matrix.
-- `PointSegClassMapping`: Only the valid category IDs will be mapped to class label IDs like [0, 18) during training.
+- `PointSegClassMapping`: Only the valid category IDs will be mapped to class label IDs like \[0, 18) during training.
- Data augmentation:
- - `PointSample`: downsample the input point cloud.
- - `RandomFlip3D`: randomly flip the input point cloud horizontally or vertically.
- - `GlobalRotScaleTrans`: rotate the input point cloud, usually in the range of [-5, 5] (degrees) for ScanNet; then scale the input point cloud, usually by 1.0 for ScanNet (which means no scaling); finally translate the input point cloud, usually by 0 for ScanNet (which means no translation).
+ - `PointSample`: downsample the input point cloud.
+ - `RandomFlip3D`: randomly flip the input point cloud horizontally or vertically.
+ - `GlobalRotScaleTrans`: rotate the input point cloud, usually in the range of \[-5, 5\] (degrees) for ScanNet; then scale the input point cloud, usually by 1.0 for ScanNet (which means no scaling); finally translate the input point cloud, usually by 0 for ScanNet (which means no translation).
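+
+As a rough illustration of the `GlobalRotScaleTrans` settings described above (a simplified sketch that only touches the points; the real transform also updates the boxes and uses the config's own parameterization):
+
+```python
+# Simplified sketch of GlobalRotScaleTrans with ScanNet-style settings.
+import numpy as np
+
+
+def global_rot_scale_trans(points, rot_range_deg=(-5, 5), scale=1.0, trans_std=0.0):
+    points = points.copy()
+    angle = np.deg2rad(np.random.uniform(*rot_range_deg))
+    rot = np.array([[np.cos(angle), -np.sin(angle), 0.0],
+                    [np.sin(angle), np.cos(angle), 0.0],
+                    [0.0, 0.0, 1.0]])
+    points[:, :3] = points[:, :3] @ rot.T  # rotate around the gravity (z) axis
+    points[:, :3] *= scale  # scale of 1.0 means no scaling
+    points[:, :3] += trans_std * np.random.randn(3)  # std of 0 means no translation
+    return points
+
+
+pts = np.random.rand(1000, 6).astype(np.float32)
+pts = global_rot_scale_trans(pts)
+```
+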
## Metrics
diff --git a/docs/en/datasets/scannet_sem_seg.md b/docs/en/datasets/scannet_sem_seg.md
index 3c130ea0b2..edb13948f6 100644
--- a/docs/en/datasets/scannet_sem_seg.md
+++ b/docs/en/datasets/scannet_sem_seg.md
@@ -67,8 +67,8 @@ scannet
```
- `seg_info`: The generated infos to support semantic segmentation model training.
- - `train_label_weight.npy`: Weighting factor for each semantic class. Since the number of points in different classes varies greatly, it's a common practice to use label re-weighting to get a better performance.
- - `train_resampled_scene_idxs.npy`: Re-sampling index for each scene. Different rooms will be sampled multiple times according to their number of points to balance training data.
+ - `train_label_weight.npy`: Weighting factor for each semantic class. Since the number of points in different classes varies greatly, it's a common practice to use label re-weighting to get a better performance.
+ - `train_resampled_scene_idxs.npy`: Re-sampling index for each scene. Different rooms will be sampled multiple times according to their number of points to balance training data.
## Training pipeline
@@ -108,7 +108,7 @@ train_pipeline = [
]
```
-- `PointSegClassMapping`: Only the valid category ids will be mapped to class label ids like [0, 20) during training. Other class ids will be converted to `ignore_index` which equals to `20`.
+- `PointSegClassMapping`: Only the valid category ids will be mapped to class label ids like \[0, 20) during training. Other class ids will be converted to `ignore_index`, which equals `20`.
- `IndoorPatchPointSample`: Crop a patch containing a fixed number of points from input point cloud. `block_size` indicates the size of the cropped block, typically `1.5` for ScanNet.
- `NormalizePointsColor`: Normalize the RGB color values of input point cloud by dividing `255`.
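+
+To make the ID mapping above concrete, here is a tiny sketch of the lookup that `PointSegClassMapping` effectively performs (the set of valid category IDs below is an illustrative subset, not the real 20-class ScanNet list):
+
+```python
+# Sketch: map raw nyu40 IDs to train IDs in [0, 20); everything else becomes ignore_index.
+import numpy as np
+
+valid_cat_ids = (1, 2, 3, 4, 5)  # illustrative subset; the real config lists 20 valid IDs
+ignore_index = 20
+
+lookup = np.full(41, ignore_index, dtype=np.int64)  # nyu40 IDs lie in [1, 40]
+for train_id, cat_id in enumerate(valid_cat_ids):
+    lookup[cat_id] = train_id
+
+semantic_mask = np.array([1, 3, 7, 40, 2])  # raw labels as stored in semantic_mask/xxxxx.bin
+print(lookup[semantic_mask])  # train IDs: 0, 2, 20, 20, 1
+```
+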
diff --git a/docs/en/datasets/sunrgbd_det.md b/docs/en/datasets/sunrgbd_det.md
index ccb4511b70..16aa914480 100644
--- a/docs/en/datasets/sunrgbd_det.md
+++ b/docs/en/datasets/sunrgbd_det.md
@@ -116,7 +116,7 @@ Under each following folder there are overall 5285 train files and 5050 val file
- `label_v1`: Detection annotation data in `.txt` (version 1)
- `seg_label`: Segmentation annotation data in `.txt`
- Currently, we use v1 data for training and testing, so the version 2 labels are unused.
+Currently, we use v1 data for training and testing, so the version 2 labels are unused.
### Create dataset
@@ -240,25 +240,24 @@ sunrgbd
- `points/0xxxxx.bin`: The point cloud data after downsample.
- `sunrgbd_infos_train.pkl`: The train data infos, the detailed info of each scene is as follows:
- - info['point_cloud']: `·`{'num_features': 6, 'lidar_idx': sample_idx}`, where `sample_idx` is the index of the scene.
- - info['pts_path']: The path of `points/0xxxxx.bin`.
- - info['image']: The image path and metainfo:
- - image['image_idx']: The index of the image.
- - image['image_shape']: The shape of the image tensor.
- - image['image_path']: The path of the image.
- - info['annos']: The annotations of each scene.
- - annotations['gt_num']: The number of ground truths.
- - annotations['name']: The semantic name of all ground truths, e.g. `chair`.
- - annotations['location']: The gravity center of the 3D bounding boxes in depth coordinate system. Shape: [K, 3], K is the number of ground truths.
- - annotations['dimensions']: The dimensions of the 3D bounding boxes in depth coordinate system, i.e. `(x_size, y_size, z_size)`, shape: [K, 3].
- - annotations['rotation_y']: The yaw angle of the 3D bounding boxes in depth coordinate system. Shape: [K, ].
- - annotations['gt_boxes_upright_depth']: The 3D bounding boxes in depth coordinate system, each bounding box is `(x, y, z, x_size, y_size, z_size, yaw)`, shape: [K, 7].
- - annotations['bbox']: The 2D bounding boxes, each bounding box is `(x, y, x_size, y_size)`, shape: [K, 4].
- - annotations['index']: The index of all ground truths, range [0, K).
- - annotations['class']: The train class id of the bounding boxes, value range: [0, 10), shape: [K, ].
+  - info\['point_cloud'\]: `{'num_features': 6, 'lidar_idx': sample_idx}`, where `sample_idx` is the index of the scene.
+ - info\['pts_path'\]: The path of `points/0xxxxx.bin`.
+ - info\['image'\]: The image path and metainfo:
+ - image\['image_idx'\]: The index of the image.
+ - image\['image_shape'\]: The shape of the image tensor.
+ - image\['image_path'\]: The path of the image.
+ - info\['annos'\]: The annotations of each scene.
+ - annotations\['gt_num'\]: The number of ground truths.
+ - annotations\['name'\]: The semantic name of all ground truths, e.g. `chair`.
+ - annotations\['location'\]: The gravity center of the 3D bounding boxes in depth coordinate system. Shape: \[K, 3\], K is the number of ground truths.
+ - annotations\['dimensions'\]: The dimensions of the 3D bounding boxes in depth coordinate system, i.e. `(x_size, y_size, z_size)`, shape: \[K, 3\].
+ - annotations\['rotation_y'\]: The yaw angle of the 3D bounding boxes in depth coordinate system. Shape: \[K, \].
+ - annotations\['gt_boxes_upright_depth'\]: The 3D bounding boxes in depth coordinate system, each bounding box is `(x, y, z, x_size, y_size, z_size, yaw)`, shape: \[K, 7\].
+ - annotations\['bbox'\]: The 2D bounding boxes, each bounding box is `(x, y, x_size, y_size)`, shape: \[K, 4\].
+ - annotations\['index'\]: The index of all ground truths, range \[0, K).
+ - annotations\['class'\]: The train class id of the bounding boxes, value range: \[0, 10), shape: \[K, \].
- `sunrgbd_infos_val.pkl`: The val data infos, which shares the same format as `sunrgbd_infos_train.pkl`.
-
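+
+The info files above are pickled lists like those of the other indoor datasets; below is a minimal sketch for inspecting one sample (it assumes the sample has annotations and that points are stored as `float32` with 6 values per point):
+
+```python
+# Sketch: read one SUN RGB-D training sample from the generated info file.
+import pickle
+
+import numpy as np
+
+with open('sunrgbd_infos_train.pkl', 'rb') as f:
+    infos = pickle.load(f)
+
+info = infos[0]
+print(info['pts_path'], info['image']['image_path'])
+
+annos = info['annos']
+boxes_3d = annos['gt_boxes_upright_depth']  # (K, 7): x, y, z, x_size, y_size, z_size, yaw
+boxes_2d = annos['bbox']  # (K, 4): x, y, x_size, y_size
+points = np.fromfile(info['pts_path'], dtype=np.float32).reshape(-1, 6)  # assumed dtype/layout
+print(boxes_3d.shape, boxes_2d.shape, points.shape)
+```
+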
## Train pipeline
A typical train pipeline of SUN RGB-D for point cloud only 3D detection is as follows.
@@ -289,8 +288,9 @@ train_pipeline = [
```
Data augmentation for point clouds:
+
- `RandomFlip3D`: randomly flip the input point cloud horizontally or vertically.
-- `GlobalRotScaleTrans`: rotate the input point cloud, usually in the range of [-30, 30] (degrees) for SUN RGB-D; then scale the input point cloud, usually in the range of [0.85, 1.15] for SUN RGB-D; finally translate the input point cloud, usually by 0 for SUN RGB-D (which means no translation).
+- `GlobalRotScaleTrans`: rotate the input point cloud, usually in the range of \[-30, 30\] (degrees) for SUN RGB-D; then scale the input point cloud, usually in the range of \[0.85, 1.15\] for SUN RGB-D; finally translate the input point cloud, usually by 0 for SUN RGB-D (which means no translation).
- `PointSample`: downsample the input point cloud.
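+
+For intuition, `PointSample` amounts to something like the sketch below (simplified; the number of sampled points is set per config, and `20000` here is just an illustrative value):
+
+```python
+# Simplified sketch of PointSample: keep a fixed number of randomly chosen points.
+import numpy as np
+
+
+def point_sample(points, num_points=20000):
+    # Sample with replacement only if the cloud has fewer points than requested.
+    replace = points.shape[0] < num_points
+    idx = np.random.choice(points.shape[0], num_points, replace=replace)
+    return points[idx]
+
+
+pts = np.random.rand(50000, 6).astype(np.float32)
+print(point_sample(pts).shape)  # (20000, 6)
+```
+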
A typical train pipeline of SUN RGB-D for multi-modality (point cloud and image) 3D detection is as follows.
@@ -332,6 +332,7 @@ train_pipeline = [
```
Data augmentation/normalization for images:
+
- `Resize`: resize the input image, `keep_ratio=True` means the ratio of the image is kept unchanged.
- `Normalize`: normalize the RGB channels of the input image.
- `RandomFlip`: randomly flip the input image.
diff --git a/docs/en/datasets/waymo_det.md b/docs/en/datasets/waymo_det.md
index d86d003b2f..a1772c9b99 100644
--- a/docs/en/datasets/waymo_det.md
+++ b/docs/en/datasets/waymo_det.md
@@ -103,36 +103,36 @@ Considering there are many similar frames in the original dataset, we can basica
For evaluation on Waymo, please follow the [instruction](https://github.com/waymo-research/waymo-open-dataset/blob/master/docs/quick_start.md/) to build the binary file `compute_detection_metrics_main` for metrics computation and put it into `mmdet3d/core/evaluation/waymo_utils/`. Basically, you can follow the commands below to install `bazel` and build the file.
- ```shell
- # download the code and enter the base directory
- git clone https://github.com/waymo-research/waymo-open-dataset.git waymo-od
- cd waymo-od
- git checkout remotes/origin/master
-
- # use the Bazel build system
- sudo apt-get install --assume-yes pkg-config zip g++ zlib1g-dev unzip python3 python3-pip
- BAZEL_VERSION=3.1.0
- wget https://github.com/bazelbuild/bazel/releases/download/${BAZEL_VERSION}/bazel-${BAZEL_VERSION}-installer-linux-x86_64.sh
- sudo bash bazel-${BAZEL_VERSION}-installer-linux-x86_64.sh
- sudo apt install build-essential
-
- # configure .bazelrc
- ./configure.sh
- # delete previous bazel outputs and reset internal caches
- bazel clean
-
- bazel build waymo_open_dataset/metrics/tools/compute_detection_metrics_main
- cp bazel-bin/waymo_open_dataset/metrics/tools/compute_detection_metrics_main ../mmdetection3d/mmdet3d/core/evaluation/waymo_utils/
- ```
+```shell
+# download the code and enter the base directory
+git clone https://github.com/waymo-research/waymo-open-dataset.git waymo-od
+cd waymo-od
+git checkout remotes/origin/master
+
+# use the Bazel build system
+sudo apt-get install --assume-yes pkg-config zip g++ zlib1g-dev unzip python3 python3-pip
+BAZEL_VERSION=3.1.0
+wget https://github.com/bazelbuild/bazel/releases/download/${BAZEL_VERSION}/bazel-${BAZEL_VERSION}-installer-linux-x86_64.sh
+sudo bash bazel-${BAZEL_VERSION}-installer-linux-x86_64.sh
+sudo apt install build-essential
+
+# configure .bazelrc
+./configure.sh
+# delete previous bazel outputs and reset internal caches
+bazel clean
+
+bazel build waymo_open_dataset/metrics/tools/compute_detection_metrics_main
+cp bazel-bin/waymo_open_dataset/metrics/tools/compute_detection_metrics_main ../mmdetection3d/mmdet3d/core/evaluation/waymo_utils/
+```
Then you can evaluate your models on Waymo. An example of evaluating PointPillars on Waymo with 8 GPUs using Waymo metrics is as follows.
- ```shell
- ./tools/slurm_test.sh ${PARTITION} ${JOB_NAME} configs/pointpillars/hv_pointpillars_secfpn_sbn-2x16_2x_waymo-3d-car.py \
- checkpoints/hv_pointpillars_secfpn_sbn-2x16_2x_waymo-3d-car_latest.pth --out results/waymo-car/results_eval.pkl \
- --eval waymo --eval-options 'pklfile_prefix=results/waymo-car/kitti_results' \
- 'submission_prefix=results/waymo-car/kitti_results'
- ```
+```shell
+./tools/slurm_test.sh ${PARTITION} ${JOB_NAME} configs/pointpillars/hv_pointpillars_secfpn_sbn-2x16_2x_waymo-3d-car.py \
+ checkpoints/hv_pointpillars_secfpn_sbn-2x16_2x_waymo-3d-car_latest.pth --out results/waymo-car/results_eval.pkl \
+ --eval waymo --eval-options 'pklfile_prefix=results/waymo-car/kitti_results' \
+ 'submission_prefix=results/waymo-car/kitti_results'
+```
`pklfile_prefix` should be given in `--eval-options` if the bin file needs to be generated. For metrics, `waymo` is the recommended official evaluation prototype. Currently, evaluating with the `kitti` option is adapted from KITTI, and the results for each difficulty are not exactly the same as KITTI's definition; instead, most objects are currently marked with difficulty 0, which will be fixed in the future. The reasons for its instability include the heavy computation required for evaluation, the lack of occlusion and truncation in the converted data, different definitions of difficulty, and different methods of computing Average Precision.
@@ -148,28 +148,28 @@ Then you can evaluate your models on Waymo. An example to evaluate PointPillars
An example of testing PointPillars on Waymo with 8 GPUs, generating the bin files and making a submission to the leaderboard is as follows.
- ```shell
- ./tools/slurm_test.sh ${PARTITION} ${JOB_NAME} configs/pointpillars/hv_pointpillars_secfpn_sbn-2x16_2x_waymo-3d-car.py \
- checkpoints/hv_pointpillars_secfpn_sbn-2x16_2x_waymo-3d-car_latest.pth --out results/waymo-car/results_eval.pkl \
- --format-only --eval-options 'pklfile_prefix=results/waymo-car/kitti_results' \
- 'submission_prefix=results/waymo-car/kitti_results'
- ```
+```shell
+./tools/slurm_test.sh ${PARTITION} ${JOB_NAME} configs/pointpillars/hv_pointpillars_secfpn_sbn-2x16_2x_waymo-3d-car.py \
+ checkpoints/hv_pointpillars_secfpn_sbn-2x16_2x_waymo-3d-car_latest.pth --out results/waymo-car/results_eval.pkl \
+ --format-only --eval-options 'pklfile_prefix=results/waymo-car/kitti_results' \
+ 'submission_prefix=results/waymo-car/kitti_results'
+```
After generating the bin file, you can simply build the binary file `create_submission` and use it to create a submission file by following the [instruction](https://github.com/waymo-research/waymo-open-dataset/blob/master/docs/quick_start.md/). Basically, here are some example commands.
- ```shell
- cd ../waymo-od/
- bazel build waymo_open_dataset/metrics/tools/create_submission
- cp bazel-bin/waymo_open_dataset/metrics/tools/create_submission ../mmdetection3d/mmdet3d/core/evaluation/waymo_utils/
- vim waymo_open_dataset/metrics/tools/submission.txtpb # set the metadata information
- cp waymo_open_dataset/metrics/tools/submission.txtpb ../mmdetection3d/mmdet3d/core/evaluation/waymo_utils/
+```shell
+cd ../waymo-od/
+bazel build waymo_open_dataset/metrics/tools/create_submission
+cp bazel-bin/waymo_open_dataset/metrics/tools/create_submission ../mmdetection3d/mmdet3d/core/evaluation/waymo_utils/
+vim waymo_open_dataset/metrics/tools/submission.txtpb # set the metadata information
+cp waymo_open_dataset/metrics/tools/submission.txtpb ../mmdetection3d/mmdet3d/core/evaluation/waymo_utils/
- cd ../mmdetection3d
- # suppose the result bin is in `results/waymo-car/submission`
- mmdet3d/core/evaluation/waymo_utils/create_submission --input_filenames='results/waymo-car/kitti_results_test.bin' --output_filename='results/waymo-car/submission/model' --submission_filename='mmdet3d/core/evaluation/waymo_utils/submission.txtpb'
+cd ../mmdetection3d
+# suppose the result bin is in `results/waymo-car/submission`
+mmdet3d/core/evaluation/waymo_utils/create_submission --input_filenames='results/waymo-car/kitti_results_test.bin' --output_filename='results/waymo-car/submission/model' --submission_filename='mmdet3d/core/evaluation/waymo_utils/submission.txtpb'
- tar cvf results/waymo-car/submission/my_model.tar results/waymo-car/submission/my_model/
- gzip results/waymo-car/submission/my_model.tar
- ```
+tar cvf results/waymo-car/submission/my_model.tar results/waymo-car/submission/my_model/
+gzip results/waymo-car/submission/my_model.tar
+```
For evaluation on the validation set with the eval server, you can generate a submission in the same way. Make sure you change the fields in `submission.txtpb` before running the command above.
diff --git a/docs/en/faq.md b/docs/en/faq.md
index 0cb93b57b4..d7a3245ca5 100644
--- a/docs/en/faq.md
+++ b/docs/en/faq.md
@@ -6,7 +6,7 @@ We list some potential troubles encountered by users and developers, along with
- If you faced the error shown below when importing open3d:
- ``OSError: /lib/x86_64-linux-gnu/libm.so.6: version 'GLIBC_2.27' not found``
+ `OSError: /lib/x86_64-linux-gnu/libm.so.6: version 'GLIBC_2.27' not found`
please downgrade open3d to 0.9.0.0, because the latest open3d requires `GLIBC_2.27`, which is available in Ubuntu 18.04 but not in Ubuntu 16.04.
@@ -21,15 +21,15 @@ We list some potential troubles encountered by users and developers, along with
- If you face the error shown below when importing pycocotools:
- ``ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject``
+ `ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject`
- please downgrade pycocotools to 2.0.1 because of the incompatibility between the newest pycocotools and numpy < 1.20.0. Or you can compile and install the latest pycocotools from source as below:
+ please downgrade pycocotools to 2.0.1 because of the incompatibility between the newest pycocotools and numpy \< 1.20.0. Or you can compile and install the latest pycocotools from source as below:
- ``pip install -e "git+https://github.com/cocodataset/cocoapi#egg=pycocotools&subdirectory=PythonAPI"``
+ `pip install -e "git+https://github.com/cocodataset/cocoapi#egg=pycocotools&subdirectory=PythonAPI"`
or
- ``pip install -e "git+https://github.com/ppwwyyxx/cocoapi#egg=pycocotools&subdirectory=PythonAPI"``
+ `pip install -e "git+https://github.com/ppwwyyxx/cocoapi#egg=pycocotools&subdirectory=PythonAPI"`
## How to annotate point cloud?
diff --git a/docs/en/getting_started.md b/docs/en/getting_started.md
index a6ae3439ea..6cc237545a 100644
--- a/docs/en/getting_started.md
+++ b/docs/en/getting_started.md
@@ -7,33 +7,32 @@
- GCC 5+
- [MMCV](https://mmcv.readthedocs.io/en/latest/#installation)
-
The required versions of MMCV, MMDetection and MMSegmentation for different versions of MMDetection3D are as below. Please install the correct version of MMCV, MMDetection and MMSegmentation to avoid installation issues.
-| MMDetection3D version | MMDetection version | MMSegmentation version | MMCV version |
-| :-------------------: | :---------------------: | :--------------------: | :------------------------: |
-| master | mmdet>=2.19.0, <=3.0.0 | mmseg>=0.20.0, <=1.0.0 | mmcv-full>=1.4.8, <=1.7.0 |
-| v1.0.0rc2 | mmdet>=2.19.0, <=3.0.0 | mmseg>=0.20.0, <=1.0.0 | mmcv-full>=1.4.8, <=1.7.0 |
-| v1.0.0rc1 | mmdet>=2.19.0, <=3.0.0 | mmseg>=0.20.0, <=1.0.0 | mmcv-full>=1.4.8, <=1.5.0 |
-| v1.0.0rc0 | mmdet>=2.19.0, <=3.0.0 | mmseg>=0.20.0, <=1.0.0 | mmcv-full>=1.3.17, <=1.5.0 |
-| 0.18.1 | mmdet>=2.19.0, <=3.0.0 | mmseg>=0.20.0, <=1.0.0 | mmcv-full>=1.3.17, <=1.5.0 |
-| 0.18.0 | mmdet>=2.19.0, <=3.0.0 | mmseg>=0.20.0, <=1.0.0 | mmcv-full>=1.3.17, <=1.5.0 |
-| 0.17.3 | mmdet>=2.14.0, <=3.0.0 | mmseg>=0.14.1, <=1.0.0 | mmcv-full>=1.3.8, <=1.4.0 |
-| 0.17.2 | mmdet>=2.14.0, <=3.0.0 | mmseg>=0.14.1, <=1.0.0 | mmcv-full>=1.3.8, <=1.4.0 |
-| 0.17.1 | mmdet>=2.14.0, <=3.0.0 | mmseg>=0.14.1, <=1.0.0 | mmcv-full>=1.3.8, <=1.4.0 |
-| 0.17.0 | mmdet>=2.14.0, <=3.0.0 | mmseg>=0.14.1, <=1.0.0 | mmcv-full>=1.3.8, <=1.4.0 |
-| 0.16.0 | mmdet>=2.14.0, <=3.0.0 | mmseg>=0.14.1, <=1.0.0 | mmcv-full>=1.3.8, <=1.4.0 |
-| 0.15.0 | mmdet>=2.14.0, <=3.0.0 | mmseg>=0.14.1, <=1.0.0 | mmcv-full>=1.3.8, <=1.4.0 |
-| 0.14.0 | mmdet>=2.10.0, <=2.11.0 | mmseg==0.14.0 | mmcv-full>=1.3.1, <=1.4.0 |
-| 0.13.0 | mmdet>=2.10.0, <=2.11.0 | Not required | mmcv-full>=1.2.4, <=1.4.0 |
-| 0.12.0 | mmdet>=2.5.0, <=2.11.0 | Not required | mmcv-full>=1.2.4, <=1.4.0 |
-| 0.11.0 | mmdet>=2.5.0, <=2.11.0 | Not required | mmcv-full>=1.2.4, <=1.3.0 |
-| 0.10.0 | mmdet>=2.5.0, <=2.11.0 | Not required | mmcv-full>=1.2.4, <=1.3.0 |
-| 0.9.0 | mmdet>=2.5.0, <=2.11.0 | Not required | mmcv-full>=1.2.4, <=1.3.0 |
-| 0.8.0 | mmdet>=2.5.0, <=2.11.0 | Not required | mmcv-full>=1.1.5, <=1.3.0 |
-| 0.7.0 | mmdet>=2.5.0, <=2.11.0 | Not required | mmcv-full>=1.1.5, <=1.3.0 |
-| 0.6.0 | mmdet>=2.4.0, <=2.11.0 | Not required | mmcv-full>=1.1.3, <=1.2.0 |
-| 0.5.0 | 2.3.0 | Not required | mmcv-full==1.0.5 |
+| MMDetection3D version | MMDetection version | MMSegmentation version | MMCV version |
+| :-------------------: | :----------------------: | :---------------------: | :-------------------------: |
+| master | mmdet>=2.19.0, \<=3.0.0 | mmseg>=0.20.0, \<=1.0.0 | mmcv-full>=1.4.8, \<=1.7.0 |
+| v1.0.0rc2 | mmdet>=2.19.0, \<=3.0.0 | mmseg>=0.20.0, \<=1.0.0 | mmcv-full>=1.4.8, \<=1.7.0 |
+| v1.0.0rc1 | mmdet>=2.19.0, \<=3.0.0 | mmseg>=0.20.0, \<=1.0.0 | mmcv-full>=1.4.8, \<=1.5.0 |
+| v1.0.0rc0 | mmdet>=2.19.0, \<=3.0.0 | mmseg>=0.20.0, \<=1.0.0 | mmcv-full>=1.3.17, \<=1.5.0 |
+| 0.18.1 | mmdet>=2.19.0, \<=3.0.0 | mmseg>=0.20.0, \<=1.0.0 | mmcv-full>=1.3.17, \<=1.5.0 |
+| 0.18.0 | mmdet>=2.19.0, \<=3.0.0 | mmseg>=0.20.0, \<=1.0.0 | mmcv-full>=1.3.17, \<=1.5.0 |
+| 0.17.3 | mmdet>=2.14.0, \<=3.0.0 | mmseg>=0.14.1, \<=1.0.0 | mmcv-full>=1.3.8, \<=1.4.0 |
+| 0.17.2 | mmdet>=2.14.0, \<=3.0.0 | mmseg>=0.14.1, \<=1.0.0 | mmcv-full>=1.3.8, \<=1.4.0 |
+| 0.17.1 | mmdet>=2.14.0, \<=3.0.0 | mmseg>=0.14.1, \<=1.0.0 | mmcv-full>=1.3.8, \<=1.4.0 |
+| 0.17.0 | mmdet>=2.14.0, \<=3.0.0 | mmseg>=0.14.1, \<=1.0.0 | mmcv-full>=1.3.8, \<=1.4.0 |
+| 0.16.0 | mmdet>=2.14.0, \<=3.0.0 | mmseg>=0.14.1, \<=1.0.0 | mmcv-full>=1.3.8, \<=1.4.0 |
+| 0.15.0 | mmdet>=2.14.0, \<=3.0.0 | mmseg>=0.14.1, \<=1.0.0 | mmcv-full>=1.3.8, \<=1.4.0 |
+| 0.14.0 | mmdet>=2.10.0, \<=2.11.0 | mmseg==0.14.0 | mmcv-full>=1.3.1, \<=1.4.0 |
+| 0.13.0 | mmdet>=2.10.0, \<=2.11.0 | Not required | mmcv-full>=1.2.4, \<=1.4.0 |
+| 0.12.0 | mmdet>=2.5.0, \<=2.11.0 | Not required | mmcv-full>=1.2.4, \<=1.4.0 |
+| 0.11.0 | mmdet>=2.5.0, \<=2.11.0 | Not required | mmcv-full>=1.2.4, \<=1.3.0 |
+| 0.10.0 | mmdet>=2.5.0, \<=2.11.0 | Not required | mmcv-full>=1.2.4, \<=1.3.0 |
+| 0.9.0 | mmdet>=2.5.0, \<=2.11.0 | Not required | mmcv-full>=1.2.4, \<=1.3.0 |
+| 0.8.0 | mmdet>=2.5.0, \<=2.11.0 | Not required | mmcv-full>=1.1.5, \<=1.3.0 |
+| 0.7.0 | mmdet>=2.5.0, \<=2.11.0 | Not required | mmcv-full>=1.1.5, \<=1.3.0 |
+| 0.6.0 | mmdet>=2.4.0, \<=2.11.0 | Not required | mmcv-full>=1.1.3, \<=1.2.0 |
+| 0.5.0 | 2.3.0 | Not required | mmcv-full==1.0.5 |
# Installation
@@ -176,20 +175,20 @@ pip install -v -e . # or "python setup.py develop"
Note:
1. The git commit id will be written to the version number with step d, e.g. 0.6.0+2e7045c. The version will also be saved in trained models.
-It is recommended that you run step d each time you pull some updates from github. If C++/CUDA codes are modified, then this step is compulsory.
+   It is recommended that you run step d each time you pull some updates from GitHub. If C++/CUDA code is modified, then this step is compulsory.
- > Important: Be sure to remove the `./build` folder if you reinstall mmdet with a different CUDA/PyTorch version.
+ > Important: Be sure to remove the `./build` folder if you reinstall mmdet with a different CUDA/PyTorch version.
- ```shell
- pip uninstall mmdet3d
- rm -rf ./build
- find . -name "*.so" | xargs rm
- ```
+ ```shell
+ pip uninstall mmdet3d
+ rm -rf ./build
+ find . -name "*.so" | xargs rm
+ ```
2. Following the above instructions, MMDetection3D is installed in `dev` mode; any local modifications made to the code will take effect without the need to reinstall it (unless you submit some commits and want to update the version number).
3. If you would like to use `opencv-python-headless` instead of `opencv-python`,
-you can install it before installing MMCV.
+ you can install it before installing MMCV.
4. Some dependencies are optional. Simply running `pip install -v -e .` will only install the minimum runtime requirements. To use optional dependencies like `albumentations` and `imagecorruptions` either install them manually with `pip install -r requirements/optional.txt` or specify desired extras when calling `pip` (e.g. `pip install -v -e .[optional]`). Valid keys for the extras field are: `all`, `tests`, `build`, and `optional`.
@@ -208,10 +207,10 @@ you can install it before installing MMCV.
We also support Minkowski Engine as a sparse convolution backend. If necessary, please follow the original [installation guide](https://github.com/NVIDIA/MinkowskiEngine#installation) or use `pip`:
- ```shell
- conda install openblas-devel -c anaconda
- pip install -U git+https://github.com/NVIDIA/MinkowskiEngine -v --no-deps --install-option="--blas_include_dirs=/opt/conda/include" --install-option="--blas=openblas"
- ```
+ ```shell
+ conda install openblas-devel -c anaconda
+ pip install -U git+https://github.com/NVIDIA/MinkowskiEngine -v --no-deps --install-option="--blas_include_dirs=/opt/conda/include" --install-option="--blas=openblas"
+ ```
5. The code cannot currently be built for a CPU-only environment (where CUDA isn't available).
@@ -283,7 +282,7 @@ python demo/pcd_demo.py demo/data/kitti/kitti_000008.bin configs/second/hv_secon
```
If you want to input a `ply` file, you can use the following function and convert it to `bin` format. Then you can use the converted `bin` file to generate demo.
-Note that you need to install `pandas` and `plyfile` before using this script. This function can also be used for data preprocessing for training ```ply data```.
+Note that you need to install `pandas` and `plyfile` before using this script. This function can also be used for data preprocessing when training with `ply` data.
```python
import numpy as np
diff --git a/docs/en/model_zoo.md b/docs/en/model_zoo.md
index d5f625f820..5bbd081376 100644
--- a/docs/en/model_zoo.md
+++ b/docs/en/model_zoo.md
@@ -33,9 +33,11 @@ Please refer to [Dynamic Voxelization](https://github.com/open-mmlab/mmdetection
Please refer to [MVXNet](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/mvxnet) for details.
### RegNetX
+
Please refer to [RegNet](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/regnet) for details. We provide pointpillars baselines with RegNetX backbones on nuScenes and Lyft datasets currently.
### nuImages
+
We also support baseline models on [nuImages dataset](https://www.nuscenes.org/nuimages). Please refer to [nuImages](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/nuimages) for details. We report Mask R-CNN, Cascade Mask R-CNN and HTC results currently.
### H3DNet
diff --git a/docs/en/tutorials/backends_support.md b/docs/en/tutorials/backends_support.md
index f075a277a2..5304ccd6f4 100644
--- a/docs/en/tutorials/backends_support.md
+++ b/docs/en/tutorials/backends_support.md
@@ -94,9 +94,8 @@ data = dict(
test=dict(pipeline=test_pipeline, classes=class_names, file_client_args=file_client_args))
```
-
-
## Load pretrained model from Ceph
+
```python
model = dict(
pts_backbone=dict(
@@ -109,6 +108,7 @@ model = dict(
```
## Load checkpoint from Ceph
+
```python
# replace the path with your checkpoint path on Ceph
load_from = 's3://openmmlab/checkpoints/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_6x8_160e_kitti-3d-car/hv_pointpillars_secfpn_6x8_160e_kitti-3d-car_20200620_230614-77663cd6.pth.pth'
@@ -132,6 +132,7 @@ evaluation = dict(interval=1, save_best='bbox', out_dir='s3://openmmlab/mmdetect
```
## Save the training log into Ceph
+
The training log will be backed up to the specified Ceph path after training.
```python
@@ -141,6 +142,7 @@ log_config = dict(
dict(type='TextLoggerHook', out_dir='s3://openmmlab/mmdetection3d'),
])
```
+
You can also delete the local training log after backing up to the specified Ceph path by setting `keep_local = False`.
```python
diff --git a/docs/en/tutorials/config.md b/docs/en/tutorials/config.md
index c6367049ce..9b4261f796 100644
--- a/docs/en/tutorials/config.md
+++ b/docs/en/tutorials/config.md
@@ -34,14 +34,14 @@ We follow the below style to name config files. Contributors are advised to foll
- `{backbone}`: backbone type like `regnet-400mf`, `regnet-1.6gf`.
- `[neck]`: neck type like `fpn`, `secfpn`.
- `[norm_setting]`: `bn` (Batch Normalization) is used unless specified, other norm layer type could be `gn` (Group Normalization), `sbn` (Synchronized Batch Normalization).
-`gn-head`/`gn-neck` indicates GN is applied in head/neck only, while `gn-all` means GN is applied in the entire model, e.g. backbone, neck, head.
+ `gn-head`/`gn-neck` indicates GN is applied in head/neck only, while `gn-all` means GN is applied in the entire model, e.g. backbone, neck, head.
- `[misc]`: miscellaneous setting/plugins of model, e.g. `strong-aug` means using stronger augmentation strategies for training.
- `[batch_per_gpu x gpu]`: samples per GPU and GPUs, `4x8` is used by default.
- `{schedule}`: training schedule, options are `1x`, `2x`, `20e`, etc.
-`1x` and `2x` means 12 epochs and 24 epochs respectively.
-`20e` is adopted in cascade models, which denotes 20 epochs.
-For `1x`/`2x`, initial learning rate decays by a factor of 10 at the 8/16th and 11/22th epochs.
-For `20e`, initial learning rate decays by a factor of 10 at the 16th and 19th epochs.
+ `1x` and `2x` means 12 epochs and 24 epochs respectively.
+ `20e` is adopted in cascade models, which denotes 20 epochs.
+ For `1x`/`2x`, initial learning rate decays by a factor of 10 at the 8/16th and 11/22th epochs.
+ For `20e`, initial learning rate decays by a factor of 10 at the 16th and 19th epochs.
- `{dataset}`: dataset like `nus-3d`, `kitti-3d`, `lyft-3d`, `scannet-3d`, `sunrgbd-3d`. We also indicate the number of classes we are using if there exist multiple settings, e.g., `kitti-3d-3class` and `kitti-3d-car` mean training on the KITTI dataset with 3 classes and a single class, respectively.
## Deprecated train_cfg/test_cfg
diff --git a/docs/en/tutorials/coord_sys_tutorial.md b/docs/en/tutorials/coord_sys_tutorial.md
index 7b784b4db0..4ddb2b865b 100644
--- a/docs/en/tutorials/coord_sys_tutorial.md
+++ b/docs/en/tutorials/coord_sys_tutorial.md
@@ -7,43 +7,43 @@ MMDetection3D uses three different coordinate systems. The existence of differen
Despite the variety of datasets and equipment, by summarizing the line of works on 3D object detection we can roughly categorize coordinate systems into three:
- Camera coordinate system -- the coordinate system of most cameras, in which the positive direction of the y-axis points to the ground, the positive direction of the x-axis points to the right, and the positive direction of the z-axis points to the front.
- ```
- up z front
- | ^
- | /
- | /
- | /
- |/
- left ------ 0 ------> x right
- |
- |
- |
- |
- v
- y down
- ```
+ ```
+ up z front
+ | ^
+ | /
+ | /
+ | /
+ |/
+ left ------ 0 ------> x right
+ |
+ |
+ |
+ |
+ v
+ y down
+ ```
- LiDAR coordinate system -- the coordinate system of many LiDARs, in which the negative direction of the z-axis points to the ground, the positive direction of the x-axis points to the front, and the positive direction of the y-axis points to the left.
- ```
- z up x front
- ^ ^
- | /
- | /
- | /
- |/
- y left <------ 0 ------ right
- ```
+ ```
+ z up x front
+ ^ ^
+ | /
+ | /
+ | /
+ |/
+ y left <------ 0 ------ right
+ ```
- Depth coordinate system -- the coordinate system used by VoteNet, H3DNet, etc., in which the negative direction of the z-axis points to the ground, the positive direction of the x-axis points to the right, and the positive direction of the y-axis points to the front.
- ```
- z up y front
- ^ ^
- | /
- | /
- | /
- |/
- left ------ 0 ------> x right
- ```
-
-The definition of coordinate systems in this tutorial is actually **more than just defining the three axes**. For a box in the form of ``$$`(x, y, z, dx, dy, dz, r)`$$``, our coordinate systems also define how to interpret the box dimensions ``$$`(dx, dy, dz)`$$`` and the yaw angle ``$$`r`$$``.
+ ```
+ z up y front
+ ^ ^
+ | /
+ | /
+ | /
+ |/
+ left ------ 0 ------> x right
+ ```
+
+The definition of coordinate systems in this tutorial is actually **more than just defining the three axes**. For a box in the form of `` $$`(x, y, z, dx, dy, dz, r)`$$ ``, our coordinate systems also define how to interpret the box dimensions `` $$`(dx, dy, dz)`$$ `` and the yaw angle `` $$`r`$$ ``.
The illustration of the three coordinate systems is shown below:
@@ -55,13 +55,13 @@ We will stick to the three coordinate systems defined in this tutorial in the fu
## Definition of the yaw angle
-Please refer to [wikipedia](https://en.wikipedia.org/wiki/Euler_angles#Tait%E2%80%93Bryan_angles) for the standard definition of the yaw angle. In object detection, we choose an axis as the gravity axis, and a reference direction on the plane ``$$`\Pi`$$`` perpendicular to the gravity axis, then the reference direction has a yaw angle of 0, and other directions on ``$$`\Pi`$$`` have non-zero yaw angles depending on its angle with the reference direction.
+Please refer to [wikipedia](https://en.wikipedia.org/wiki/Euler_angles#Tait%E2%80%93Bryan_angles) for the standard definition of the yaw angle. In object detection, we choose an axis as the gravity axis and a reference direction on the plane `` $$`\Pi`$$ `` perpendicular to the gravity axis. The reference direction then has a yaw angle of 0, and other directions on `` $$`\Pi`$$ `` have non-zero yaw angles depending on their angle with the reference direction.
Currently, for all supported datasets, annotations do not include pitch angle and roll angle, which means we need only consider the yaw angle when predicting boxes and calculating overlap between boxes.
In MMDetection3D, all three coordinate systems are right-handed coordinate systems, which means the ascending direction of the yaw angle is counter-clockwise if viewed from the negative direction of the gravity axis (the axis is pointing at one's eyes).
-The figure below shows that, in this right-handed coordinate system, if we set the positive direction of the x-axis as a reference direction, then the positive direction of the y-axis has a yaw angle of ``$$`\frac{\pi}{2}`$$``.
+The figure below shows that, in this right-handed coordinate system, if we set the positive direction of the x-axis as a reference direction, then the positive direction of the y-axis has a yaw angle of `` $$`\frac{\pi}{2}`$$ ``.
```
z up y front (yaw=0.5*pi)
@@ -92,9 +92,9 @@ __|____|____|____|______\ x right
## Definition of the box dimensions
-The definition of the box dimensions cannot be disentangled with the definition of the yaw angle. In the previous section, we said that the direction of a box is defined to be parallel with the x-axis if its yaw angle is 0. Then naturally, the dimension of a box which corresponds to the x-axis should be ``$$`dx`$$``. However, this is not always the case in some datasets (we will address that later).
+The definition of the box dimensions cannot be disentangled from the definition of the yaw angle. In the previous section, we said that the direction of a box is defined to be parallel with the x-axis if its yaw angle is 0. Then naturally, the dimension of a box which corresponds to the x-axis should be `` $$`dx`$$ ``. However, this is not always the case in some datasets (we will address that later).
-The following figures show the meaning of the correspondence between the x-axis and ``$$`dx`$$``, and between the y-axis and ``$$`dy`$$``.
+The following figures show the meaning of the correspondence between the x-axis and `` $$`dx`$$ ``, and between the y-axis and `` $$`dy`$$ ``.
```
y front
@@ -111,7 +111,7 @@ __|____|____|____|______\ x right
| dy
```
-Note that the box direction is always parallel with the edge ``$$`dx`$$``.
+Note that the box direction is always parallel with the edge `` $$`dx`$$ ``.
```
y front
@@ -138,14 +138,12 @@ In SECOND, the LiDAR coordinate system for a box is defined as follows (a bird's

-
-
-For each box, the dimensions are ``$$`(w, l, h)`$$``, and the reference direction for the yaw angle is the positive direction of the y axis. For more details, refer to the [repo](https://github.com/traveller59/second.pytorch#concepts).
+For each box, the dimensions are `` $$`(w, l, h)`$$ ``, and the reference direction for the yaw angle is the positive direction of the y axis. For more details, refer to the [repo](https://github.com/traveller59/second.pytorch#concepts).
Our LiDAR coordinate system has two changes:
- The yaw angle is defined to be right-handed instead of left-handed for consistency;
-- The box dimensions are ``$$`(l, w, h)`$$`` instead of ``$$`(w, l, h)`$$``, since ``$$`w`$$`` corresponds to ``$$`dy`$$`` and ``$$`l`$$`` corresponds to ``$$`dx`$$`` in KITTI.
+- The box dimensions are `` $$`(l, w, h)`$$ `` instead of `` $$`(w, l, h)`$$ ``, since `` $$`w`$$ `` corresponds to `` $$`dy`$$ `` and `` $$`l`$$ `` corresponds to `` $$`dx`$$ `` in KITTI.
### Waymo
@@ -153,7 +151,7 @@ We use the KITTI-format data of Waymo dataset. Therefore, KITTI and Waymo also s
### NuScenes
-NuScenes provides a toolkit for evaluation, in which each box is wrapped into a `Box` instance. The coordinate system of `Box` is different from our LiDAR coordinate system in that the first two elements of the box dimension correspond to ``$$`(dy, dx)`$$``, or ``$$`(w, l)`$$``, respectively, instead of the reverse. For more details, please refer to the NuScenes [tutorial](https://github.com/open-mmlab/mmdetection3d/blob/master/docs/en/datasets/nuscenes_det.md#notes).
+NuScenes provides a toolkit for evaluation, in which each box is wrapped into a `Box` instance. The coordinate system of `Box` is different from our LiDAR coordinate system in that the first two elements of the box dimension correspond to `` $$`(dy, dx)`$$ ``, or `` $$`(w, l)`$$ ``, respectively, instead of the reverse. For more details, please refer to the NuScenes [tutorial](https://github.com/open-mmlab/mmdetection3d/blob/master/docs/en/datasets/nuscenes_det.md#notes).
Readers may refer to the [NuScenes development kit](https://github.com/nutonomy/nuscenes-devkit/tree/master/python-sdk/nuscenes/eval/detection) for the definition of a [NuScenes box](https://github.com/nutonomy/nuscenes-devkit/blob/2c6a752319f23910d5f55cc995abc547a9e54142/python-sdk/nuscenes/utils/data_classes.py#L457) and implementation of [NuScenes evaluation](https://github.com/nutonomy/nuscenes-devkit/blob/master/python-sdk/nuscenes/eval/detection/evaluate.py).
@@ -185,25 +183,25 @@ Take the conversion between our Camera coordinate system and LiDAR coordinate sy
First, for points and box centers, the coordinates before and after the conversion satisfy the following relationship:
-- ``$$`x_{LiDAR}=z_{camera}`$$``
-- ``$$`y_{LiDAR}=-x_{camera}`$$``
-- ``$$`z_{LiDAR}=-y_{camera}`$$``
+- `` $$`x_{LiDAR}=z_{camera}`$$ ``
+- `` $$`y_{LiDAR}=-x_{camera}`$$ ``
+- `` $$`z_{LiDAR}=-y_{camera}`$$ ``
Then, the box dimensions before and after the conversion satisfy the following relationship:
-- ``$$`dx_{LiDAR}=dx_{camera}`$$``
-- ``$$`dy_{LiDAR}=dz_{camera}`$$``
-- ``$$`dz_{LiDAR}=dy_{camera}`$$``
+- `` $$`dx_{LiDAR}=dx_{camera}`$$ ``
+- `` $$`dy_{LiDAR}=dz_{camera}`$$ ``
+- `` $$`dz_{LiDAR}=dy_{camera}`$$ ``
Finally, the yaw angle should also be converted:
-- ``$$`r_{LiDAR}=-\frac{\pi}{2}-r_{camera}`$$``
+- `` $$`r_{LiDAR}=-\frac{\pi}{2}-r_{camera}`$$ ``
See the code [here](https://github.com/open-mmlab/mmdetection3d/blob/master/mmdet3d/core/bbox/structures/box_3d_mode.py) for more details.
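+
+A small NumPy sketch of these relations (a hand-rolled check of the formulas, not the library implementation linked above):
+
+```python
+# Convert a single (x, y, z, dx, dy, dz, r) box from camera to LiDAR coordinates
+# using the relations listed above.
+import numpy as np
+
+
+def cam_box_to_lidar(box):
+    x, y, z, dx, dy, dz, r = box
+    center = [z, -x, -y]  # x_LiDAR = z_camera, y_LiDAR = -x_camera, z_LiDAR = -y_camera
+    dims = [dx, dz, dy]  # dx_LiDAR = dx_camera, dy_LiDAR = dz_camera, dz_LiDAR = dy_camera
+    yaw = -np.pi / 2 - r  # r_LiDAR = -pi/2 - r_camera
+    return np.array(center + dims + [yaw])
+
+
+print(cam_box_to_lidar([1.0, 2.0, 3.0, 0.5, 1.8, 4.2, 0.3]))
+```
+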
### Bird's Eye View
-The BEV of a camera coordinate system box is ``$$`(x, z, dx, dz, -r)`$$`` if the 3D box is ``$$`(x, y, z, dx, dy, dz, r)`$$``. The inversion of the sign of the yaw angle is because the positive direction of the gravity axis of the Camera coordinate system points to the ground.
+The BEV of a camera coordinate system box is `` $$`(x, z, dx, dz, -r)`$$ `` if the 3D box is `` $$`(x, y, z, dx, dy, dz, r)`$$ ``. The inversion of the sign of the yaw angle is because the positive direction of the gravity axis of the Camera coordinate system points to the ground.
See the code [here](https://github.com/open-mmlab/mmdetection3d/blob/master/mmdet3d/core/bbox/structures/cam_box3d.py) for more details.
@@ -225,18 +223,18 @@ For each box related op, we have marked the type of boxes to which we can apply
No. For example, in KITTI, we need a calibration matrix when converting from Camera coordinate system to LiDAR coordinate system.
-#### Q3: How does a phase difference of ``$$`2\pi`$$`` in the yaw angle of a box affect evaluation?
+#### Q3: How does a phase difference of `` $$`2\pi`$$ `` in the yaw angle of a box affect evaluation?
-For IoU calculation, a phase difference of ``$$`2\pi`$$`` in the yaw angle will result in the same box, thus not affecting evaluation.
+For IoU calculation, a phase difference of `` $$`2\pi`$$ `` in the yaw angle will result in the same box, thus not affecting evaluation.
-For angle prediction evaluation such as the NDS metric in NuScenes and the AOS metric in KITTI, the angle of predicted boxes will be first standardized, so the phase difference of ``$$`2\pi`$$`` will not change the result.
+For angle prediction evaluation such as the NDS metric in NuScenes and the AOS metric in KITTI, the angle of predicted boxes will be first standardized, so the phase difference of `` $$`2\pi`$$ `` will not change the result.
-#### Q4: How does a phase difference of ``$$`\pi`$$`` in the yaw angle of a box affect evaluation?
+#### Q4: How does a phase difference of `` $$`\pi`$$ `` in the yaw angle of a box affect evaluation?
-For IoU calculation, a phase difference of ``$$`\pi`$$`` in the yaw angle will result in the same box, thus not affecting evaluation.
+For IoU calculation, a phase difference of `` $$`\pi`$$ `` in the yaw angle will result in the same box, thus not affecting evaluation.
However, for angle prediction evaluation, this will result in the exact opposite direction.
-Just think about a car. The yaw angle is the angle between the direction of the car front and the positive direction of the x-axis. If we add ``$$`\pi`$$`` to this angle, the car front will become the car rear.
+Just think about a car. The yaw angle is the angle between the direction of the car front and the positive direction of the x-axis. If we add `` $$`\pi`$$ `` to this angle, the car front will become the car rear.
-For categories such as barrier, the front and the rear have no difference, therefore a phase difference of ``$$`\pi`$$`` will not affect the angle prediction score.
+For categories such as barrier, the front and the rear have no difference; therefore, a phase difference of `` $$`\pi`$$ `` will not affect the angle prediction score.
diff --git a/docs/en/tutorials/customize_dataset.md b/docs/en/tutorials/customize_dataset.md
index c1f5d4f599..772cd0abea 100644
--- a/docs/en/tutorials/customize_dataset.md
+++ b/docs/en/tutorials/customize_dataset.md
@@ -222,62 +222,62 @@ There are three ways to concatenate the dataset.
1. If the datasets you want to concatenate are of the same type but have different annotation files, you can concatenate the dataset configs like the following.
- ```python
- dataset_A_train = dict(
- type='Dataset_A',
- ann_file = ['anno_file_1', 'anno_file_2'],
- pipeline=train_pipeline
- )
- ```
-
- If the concatenated dataset is used for test or evaluation, this manner supports to evaluate each dataset separately. To test the concatenated datasets as a whole, you can set `separate_eval=False` as below.
-
- ```python
- dataset_A_train = dict(
- type='Dataset_A',
- ann_file = ['anno_file_1', 'anno_file_2'],
- separate_eval=False,
- pipeline=train_pipeline
- )
- ```
+ ```python
+ dataset_A_train = dict(
+ type='Dataset_A',
+ ann_file = ['anno_file_1', 'anno_file_2'],
+ pipeline=train_pipeline
+ )
+ ```
+
+   If the concatenated dataset is used for test or evaluation, this manner supports evaluating each dataset separately. To test the concatenated datasets as a whole, you can set `separate_eval=False` as below.
+
+ ```python
+ dataset_A_train = dict(
+ type='Dataset_A',
+ ann_file = ['anno_file_1', 'anno_file_2'],
+ separate_eval=False,
+ pipeline=train_pipeline
+ )
+ ```
2. In case the datasets you want to concatenate are of different types, you can concatenate the dataset configs like the following.
- ```python
- dataset_A_train = dict()
- dataset_B_train = dict()
-
- data = dict(
- imgs_per_gpu=2,
- workers_per_gpu=2,
- train = [
- dataset_A_train,
- dataset_B_train
- ],
- val = dataset_A_val,
- test = dataset_A_test
- )
- ```
+ ```python
+ dataset_A_train = dict()
+ dataset_B_train = dict()
+
+ data = dict(
+ imgs_per_gpu=2,
+ workers_per_gpu=2,
+ train = [
+ dataset_A_train,
+ dataset_B_train
+ ],
+ val = dataset_A_val,
+ test = dataset_A_test
+ )
+ ```
- If the concatenated dataset is used for test or evaluation, this manner also supports to evaluate each dataset separately.
+   If the concatenated dataset is used for test or evaluation, this manner also supports evaluating each dataset separately.
3. We also support defining `ConcatDataset` explicitly as follows.
- ```python
- dataset_A_val = dict()
- dataset_B_val = dict()
-
- data = dict(
- imgs_per_gpu=2,
- workers_per_gpu=2,
- train=dataset_A_train,
- val=dict(
- type='ConcatDataset',
- datasets=[dataset_A_val, dataset_B_val],
- separate_eval=False))
- ```
-
- This manner allows users to evaluate all the datasets as a single one by setting `separate_eval=False`.
+ ```python
+ dataset_A_val = dict()
+ dataset_B_val = dict()
+
+ data = dict(
+ imgs_per_gpu=2,
+ workers_per_gpu=2,
+ train=dataset_A_train,
+ val=dict(
+ type='ConcatDataset',
+ datasets=[dataset_A_val, dataset_B_val],
+ separate_eval=False))
+ ```
+
+ This manner allows users to evaluate all the datasets as a single one by setting `separate_eval=False`.
**Note:**
diff --git a/docs/en/tutorials/customize_runtime.md b/docs/en/tutorials/customize_runtime.md
index 7cdc4bbc00..8b5596e61c 100644
--- a/docs/en/tutorials/customize_runtime.md
+++ b/docs/en/tutorials/customize_runtime.md
@@ -41,8 +41,8 @@ To find the above module defined above, this module should be imported into the
- Add `mmdet3d/core/optimizer/__init__.py` to import it.
- The newly defined module should be imported in `mmdet3d/core/optimizer/__init__.py` so that the registry will
- find the new module and add it:
+ The newly defined module should be imported in `mmdet3d/core/optimizer/__init__.py` so that the registry will
+ find the new module and add it:
```python
from .my_optimizer import MyOptimizer
@@ -116,35 +116,35 @@ Tricks not implemented by the optimizer should be implemented through optimizer
- __Use gradient clip to stabilize training__:
- Some models need gradient clip to clip the gradients to stabilize the training process. An example is as below:
+  Some models need gradient clipping to stabilize the training process. An example is as below:
- ```python
- optimizer_config = dict(
- _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
- ```
+ ```python
+ optimizer_config = dict(
+ _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
+ ```
- If your config inherits the base config which already sets the `optimizer_config`, you might need `_delete_=True` to override the unnecessary settings in the base config. See the [config documentation](https://mmdetection.readthedocs.io/en/latest/tutorials/config.html) for more details.
+ If your config inherits the base config which already sets the `optimizer_config`, you might need `_delete_=True` to override the unnecessary settings in the base config. See the [config documentation](https://mmdetection.readthedocs.io/en/latest/tutorials/config.html) for more details.
- __Use momentum schedule to accelerate model convergence__:
- We support momentum scheduler to modify model's momentum according to learning rate, which could make the model converge in a faster way.
- Momentum scheduler is usually used with LR scheduler, for example, the following config is used in 3D detection to accelerate convergence.
- For more details, please refer to the implementation of [CyclicLrUpdater](https://github.com/open-mmlab/mmcv/blob/v1.3.7/mmcv/runner/hooks/lr_updater.py#L358) and [CyclicMomentumUpdater](https://github.com/open-mmlab/mmcv/blob/v1.3.7/mmcv/runner/hooks/momentum_updater.py#L225).
-
- ```python
- lr_config = dict(
- policy='cyclic',
- target_ratio=(10, 1e-4),
- cyclic_times=1,
- step_ratio_up=0.4,
- )
- momentum_config = dict(
- policy='cyclic',
- target_ratio=(0.85 / 0.95, 1),
- cyclic_times=1,
- step_ratio_up=0.4,
- )
- ```
+  We support the momentum scheduler, which modifies the model's momentum according to the learning rate and can make the model converge faster.
+  The momentum scheduler is usually used together with the LR scheduler; for example, the following config is used in 3D detection to accelerate convergence.
+ For more details, please refer to the implementation of [CyclicLrUpdater](https://github.com/open-mmlab/mmcv/blob/v1.3.7/mmcv/runner/hooks/lr_updater.py#L358) and [CyclicMomentumUpdater](https://github.com/open-mmlab/mmcv/blob/v1.3.7/mmcv/runner/hooks/momentum_updater.py#L225).
+
+ ```python
+ lr_config = dict(
+ policy='cyclic',
+ target_ratio=(10, 1e-4),
+ cyclic_times=1,
+ step_ratio_up=0.4,
+ )
+ momentum_config = dict(
+ policy='cyclic',
+ target_ratio=(0.85 / 0.95, 1),
+ cyclic_times=1,
+ step_ratio_up=0.4,
+ )
+ ```
## Customize training schedules
@@ -153,20 +153,20 @@ We support many other learning rate schedule [here](https://github.com/open-mmla
- Poly schedule:
- ```python
- lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
- ```
+ ```python
+ lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
+ ```
- CosineAnnealing schedule:
- ```python
- lr_config = dict(
- policy='CosineAnnealing',
- warmup='linear',
- warmup_iters=1000,
- warmup_ratio=1.0 / 10,
- min_lr_ratio=1e-5)
- ```
+ ```python
+ lr_config = dict(
+ policy='CosineAnnealing',
+ warmup='linear',
+ warmup_iters=1000,
+ warmup_ratio=1.0 / 10,
+ min_lr_ratio=1e-5)
+ ```
## Customize workflow
@@ -240,8 +240,8 @@ Then we need to make `MyHook` imported. Assuming the hook is in `mmdet3d/core/ut
- Modify `mmdet3d/core/utils/__init__.py` to import it.
- The newly defined module should be imported in `mmdet3d/core/utils/__init__.py` so that the registry will
- find the new module and add it:
+ The newly defined module should be imported in `mmdet3d/core/utils/__init__.py` so that the registry will
+ find the new module and add it:
```python
from .my_hook import MyHook
diff --git a/docs/en/tutorials/data_pipeline.md b/docs/en/tutorials/data_pipeline.md
index 87e99ba523..60dc18728f 100644
--- a/docs/en/tutorials/data_pipeline.md
+++ b/docs/en/tutorials/data_pipeline.md
@@ -86,100 +86,113 @@ For each operation, we list the related dict fields that are added/updated/remov
### Data loading
`LoadPointsFromFile`
+
- add: points
`LoadPointsFromMultiSweeps`
+
- update: points
`LoadAnnotations3D`
+
- add: gt_bboxes_3d, gt_labels_3d, gt_bboxes, gt_labels, pts_instance_mask, pts_semantic_mask, bbox3d_fields, pts_mask_fields, pts_seg_fields
### Pre-processing
`GlobalRotScaleTrans`
+
- add: pcd_trans, pcd_rotation, pcd_scale_factor
-- update: points, *bbox3d_fields
+- update: points, \*bbox3d_fields
`RandomFlip3D`
+
- add: flip, pcd_horizontal_flip, pcd_vertical_flip
-- update: points, *bbox3d_fields
+- update: points, \*bbox3d_fields
`PointsRangeFilter`
+
- update: points
`ObjectRangeFilter`
+
- update: gt_bboxes_3d, gt_labels_3d
`ObjectNameFilter`
+
- update: gt_bboxes_3d, gt_labels_3d
`PointShuffle`
+
- update: points
`PointsRangeFilter`
+
- update: points
### Formatting
`DefaultFormatBundle3D`
+
- update: points, gt_bboxes_3d, gt_labels_3d, gt_bboxes, gt_labels
`Collect3D`
+
- add: img_meta (the keys of img_meta are specified by `meta_keys`)
- remove: all other keys except for those specified by `keys`
### Test time augmentation
`MultiScaleFlipAug`
+
- update: scale, pcd_scale_factor, flip, flip_direction, pcd_horizontal_flip, pcd_vertical_flip with list of augmented data with these specific parameters
## Extend and use custom pipelines
1. Write a new pipeline in any file, e.g., `my_pipeline.py`. It takes a dict as input and returns a dict.
- ```python
- from mmdet.datasets import PIPELINES
+ ```python
+ from mmdet.datasets import PIPELINES
- @PIPELINES.register_module()
- class MyTransform:
+ @PIPELINES.register_module()
+ class MyTransform:
- def __call__(self, results):
- results['dummy'] = True
- return results
- ```
+ def __call__(self, results):
+ results['dummy'] = True
+ return results
+ ```
2. Import the new class.
- ```python
- from .my_pipeline import MyTransform
- ```
+ ```python
+ from .my_pipeline import MyTransform
+ ```
3. Use it in config files.
- ```python
- train_pipeline = [
- dict(
- type='LoadPointsFromFile',
- load_dim=5,
- use_dim=5,
- file_client_args=file_client_args),
- dict(
- type='LoadPointsFromMultiSweeps',
- sweeps_num=10,
- file_client_args=file_client_args),
- dict(type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True),
- dict(
- type='GlobalRotScaleTrans',
- rot_range=[-0.3925, 0.3925],
- scale_ratio_range=[0.95, 1.05],
- translation_std=[0, 0, 0]),
- dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5),
- dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range),
- dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
- dict(type='ObjectNameFilter', classes=class_names),
- dict(type='MyTransform'),
- dict(type='PointShuffle'),
- dict(type='DefaultFormatBundle3D', class_names=class_names),
- dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
- ]
- ```
+ ```python
+ train_pipeline = [
+ dict(
+ type='LoadPointsFromFile',
+ load_dim=5,
+ use_dim=5,
+ file_client_args=file_client_args),
+ dict(
+ type='LoadPointsFromMultiSweeps',
+ sweeps_num=10,
+ file_client_args=file_client_args),
+ dict(type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True),
+ dict(
+ type='GlobalRotScaleTrans',
+ rot_range=[-0.3925, 0.3925],
+ scale_ratio_range=[0.95, 1.05],
+ translation_std=[0, 0, 0]),
+ dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5),
+ dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range),
+ dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
+ dict(type='ObjectNameFilter', classes=class_names),
+ dict(type='MyTransform'),
+ dict(type='PointShuffle'),
+ dict(type='DefaultFormatBundle3D', class_names=class_names),
+ dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
+ ]
+ ```
diff --git a/docs/en/tutorials/model_deployment.md b/docs/en/tutorials/model_deployment.md
index b1ffe53310..ff8c06abe2 100644
--- a/docs/en/tutorials/model_deployment.md
+++ b/docs/en/tutorials/model_deployment.md
@@ -37,17 +37,17 @@ python ./tools/deploy.py \
### Description of all arguments
-* `deploy_cfg` : The path of deploy config file in MMDeploy codebase.
-* `model_cfg` : The path of model config file in OpenMMLab codebase.
-* `checkpoint` : The path of model checkpoint file.
-* `img` : The path of point cloud file or image file that used to convert model.
-* `--test-img` : The path of image file that used to test model. If not specified, it will be set to `None`.
-* `--work-dir` : The path of work directory that used to save logs and models.
-* `--calib-dataset-cfg` : Only valid in int8 mode. Config used for calibration. If not specified, it will be set to `None` and use "val" dataset in model config for calibration.
-* `--device` : The device used for conversion. If not specified, it will be set to `cpu`.
-* `--log-level` : To set log level which in `'CRITICAL', 'FATAL', 'ERROR', 'WARN', 'WARNING', 'INFO', 'DEBUG', 'NOTSET'`. If not specified, it will be set to `INFO`.
-* `--show` : Whether to show detection outputs.
-* `--dump-info` : Whether to output information for SDK.
+- `deploy_cfg` : The path of the deploy config file in the MMDeploy codebase.
+- `model_cfg` : The path of the model config file in the OpenMMLab codebase.
+- `checkpoint` : The path of the model checkpoint file.
+- `img` : The path of the point cloud file or image file used to convert the model.
+- `--test-img` : The path of the image file used to test the model. If not specified, it will be set to `None`.
+- `--work-dir` : The path of the work directory used to save logs and models.
+- `--calib-dataset-cfg` : Only valid in int8 mode. Config used for calibration. If not specified, it will be set to `None` and the "val" dataset in the model config will be used for calibration.
+- `--device` : The device used for conversion. If not specified, it will be set to `cpu`.
+- `--log-level` : The log level, chosen from `'CRITICAL', 'FATAL', 'ERROR', 'WARN', 'WARNING', 'INFO', 'DEBUG', 'NOTSET'`. If not specified, it will be set to `INFO`.
+- `--show` : Whether to show detection outputs.
+- `--dump-info` : Whether to output information for the SDK.
### Example
@@ -110,12 +110,12 @@ python tools/test.py \
## Supported models
-| Model | TorchScript | OnnxRuntime | TensorRT | NCNN | PPLNN | OpenVINO | Model config |
-| -------------------- | :---------: | :---------: | :------: | :---: | :---: | :------: | -------------------------------------------------------------------------------------- |
-| PointPillars | ? | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/pointpillars) |
-| CenterPoint (pillar) | ? | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/centerpoint) |
+| Model | TorchScript | OnnxRuntime | TensorRT | NCNN | PPLNN | OpenVINO | Model config |
+| -------------------- | :---------: | :---------: | :------: | :--: | :---: | :------: | -------------------------------------------------------------------------------------- |
+| PointPillars | ? | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/pointpillars) |
+| CenterPoint (pillar) | ? | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/centerpoint) |
## Note
-* MMDeploy version >= 0.4.0.
-* Currently, CenterPoint has only supported the pillar version.
+- MMDeploy version >= 0.4.0.
+- Currently, only the pillar version of CenterPoint is supported.
diff --git a/docs/en/tutorials/pure_point_cloud_dataset.md b/docs/en/tutorials/pure_point_cloud_dataset.md
index 495ae43399..fec05501ee 100644
--- a/docs/en/tutorials/pure_point_cloud_dataset.md
+++ b/docs/en/tutorials/pure_point_cloud_dataset.md
@@ -261,62 +261,62 @@ There are three ways to concatenate the dataset.
1. If the datasets you want to concatenate are of the same type but have different annotation files, you can concatenate the dataset configs like the following.
- ```python
- dataset_A_train = dict(
- type='Dataset_A',
- ann_file = ['anno_file_1', 'anno_file_2'],
- pipeline=train_pipeline
- )
- ```
-
- If the concatenated dataset is used for test or evaluation, this manner supports to evaluate each dataset separately. To test the concatenated datasets as a whole, you can set `separate_eval=False` as below.
-
- ```python
- dataset_A_train = dict(
- type='Dataset_A',
- ann_file = ['anno_file_1', 'anno_file_2'],
- separate_eval=False,
- pipeline=train_pipeline
- )
- ```
+ ```python
+ dataset_A_train = dict(
+ type='Dataset_A',
+ ann_file = ['anno_file_1', 'anno_file_2'],
+ pipeline=train_pipeline
+ )
+ ```
+
+   If the concatenated dataset is used for testing or evaluation, this manner supports evaluating each dataset separately. To test the concatenated datasets as a whole, you can set `separate_eval=False` as below.
+
+ ```python
+ dataset_A_train = dict(
+ type='Dataset_A',
+ ann_file = ['anno_file_1', 'anno_file_2'],
+ separate_eval=False,
+ pipeline=train_pipeline
+ )
+ ```
2. In case the datasets you want to concatenate are of different types, you can concatenate the dataset configs like the following.
- ```python
- dataset_A_train = dict()
- dataset_B_train = dict()
-
- data = dict(
- imgs_per_gpu=2,
- workers_per_gpu=2,
- train = [
- dataset_A_train,
- dataset_B_train
- ],
- val = dataset_A_val,
- test = dataset_A_test
- )
- ```
+ ```python
+ dataset_A_train = dict()
+ dataset_B_train = dict()
- If the concatenated dataset is used for test or evaluation, this manner also supports to evaluate each dataset separately.
+ data = dict(
+ imgs_per_gpu=2,
+ workers_per_gpu=2,
+ train = [
+ dataset_A_train,
+ dataset_B_train
+ ],
+ val = dataset_A_val,
+ test = dataset_A_test
+ )
+ ```
+
+   If the concatenated dataset is used for testing or evaluation, this manner also supports evaluating each dataset separately.
3. We also support defining `ConcatDataset` explicitly as follows.
- ```python
- dataset_A_val = dict()
- dataset_B_val = dict()
+ ```python
+ dataset_A_val = dict()
+ dataset_B_val = dict()
- data = dict(
- imgs_per_gpu=2,
- workers_per_gpu=2,
- train=dataset_A_train,
- val=dict(
- type='ConcatDataset',
- datasets=[dataset_A_val, dataset_B_val],
- separate_eval=False))
- ```
+ data = dict(
+ imgs_per_gpu=2,
+ workers_per_gpu=2,
+ train=dataset_A_train,
+ val=dict(
+ type='ConcatDataset',
+ datasets=[dataset_A_val, dataset_B_val],
+ separate_eval=False))
+ ```
- This manner allows users to evaluate all the datasets as a single one by setting `separate_eval=False`.
+ This manner allows users to evaluate all the datasets as a single one by setting `separate_eval=False`.
**Note:**
@@ -435,7 +435,6 @@ Here you can refer to the setting of the existing datasets. theoretically, `voxe
if the `point_cloud_range` and `voxel_size` are set to be `[0, -40, -3, 70.4, 40, 1]` and `[0.05, 0.05, 0.1]` respectively, then the shape of the intermediate feature map should be `[(1-(-3))/0.1+1, (40-(-40))/0.05, (70.4-0)/0.05]=[41, 1600, 1408]`. More details can be found in this [issue](https://github.com/open-mmlab/mmdetection3d/issues/382).
-
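+
+For illustration, this arithmetic can be written as a short snippet (a minimal sketch; the variable names are only illustrative):
+
+```python
+point_cloud_range = [0, -40, -3, 70.4, 40, 1]  # [x_min, y_min, z_min, x_max, y_max, z_max]
+voxel_size = [0.05, 0.05, 0.1]  # voxel size along [x, y, z]
+
+# depth (z), height (y) and width (x) of the intermediate feature map
+d = round((point_cloud_range[5] - point_cloud_range[2]) / voxel_size[2]) + 1  # 41
+h = round((point_cloud_range[4] - point_cloud_range[1]) / voxel_size[1])      # 1600
+w = round((point_cloud_range[3] - point_cloud_range[0]) / voxel_size[0])      # 1408
+print([d, h, w])  # [41, 1600, 1408]
+```
+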
### Adjust Anchor Range and Size in Config
```python
@@ -450,6 +449,7 @@ anchor_generator=dict(
rotations=[0, 1.57],
reshape_out=False),
```
+
Regarding the setting of `anchor_range`, it is generally adjusted according to the dataset. Note that the `z` value needs to be adjusted according to the position of the point cloud; please refer to this [issue](https://github.com/open-mmlab/mmdetection3d/issues/986).

Regarding the setting of `anchor_size`, it is usually necessary to compute the average length, width and height of objects over the entire training dataset and use them as `anchor_size` to obtain the best results, as in the sketch below.
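+
+Assuming `gt_dims` is a hypothetical `(N, 3)` array holding the length, width and height of all annotated boxes collected from the training split, such a statistic could be computed as:
+
+```python
+import numpy as np
+
+# gt_dims: (N, 3) array of [length, width, height] gathered from the training annotations
+gt_dims = np.array([[3.9, 1.6, 1.56], [4.2, 1.7, 1.5], [3.7, 1.5, 1.6]])  # toy values
+anchor_size = gt_dims.mean(axis=0).round(2).tolist()
+print(anchor_size)  # [3.93, 1.6, 1.55]
+```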
diff --git a/docs/en/useful_tools.md b/docs/en/useful_tools.md
index a1a5d85dd5..8f373ae37a 100644
--- a/docs/en/useful_tools.md
+++ b/docs/en/useful_tools.md
@@ -14,26 +14,26 @@ python tools/analysis_tools/analyze_logs.py plot_curve [--keys ${KEYS}] [--title
Examples:
-- Plot the classification loss of some run.
+- Plot the classification loss of some run.
- ```shell
- python tools/analysis_tools/analyze_logs.py plot_curve log.json --keys loss_cls --legend loss_cls
- ```
+ ```shell
+ python tools/analysis_tools/analyze_logs.py plot_curve log.json --keys loss_cls --legend loss_cls
+ ```
-- Plot the classification and regression loss of some run, and save the figure to a pdf.
+- Plot the classification and regression loss of some run, and save the figure to a pdf.
- ```shell
- python tools/analysis_tools/analyze_logs.py plot_curve log.json --keys loss_cls loss_bbox --out losses.pdf
- ```
+ ```shell
+ python tools/analysis_tools/analyze_logs.py plot_curve log.json --keys loss_cls loss_bbox --out losses.pdf
+ ```
-- Compare the bbox mAP of two runs in the same figure.
+- Compare the bbox mAP of two runs in the same figure.
- ```shell
- # evaluate PartA2 and second on KITTI according to Car_3D_moderate_strict
- python tools/analysis_tools/analyze_logs.py plot_curve tools/logs/PartA2.log.json tools/logs/second.log.json --keys KITTI/Car_3D_moderate_strict --legend PartA2 second --mode eval --interval 1
- # evaluate PointPillars for car and 3 classes on KITTI according to Car_3D_moderate_strict
- python tools/analysis_tools/analyze_logs.py plot_curve tools/logs/pp-3class.log.json tools/logs/pp.log.json --keys KITTI/Car_3D_moderate_strict --legend pp-3class pp --mode eval --interval 2
- ```
+ ```shell
+ # evaluate PartA2 and second on KITTI according to Car_3D_moderate_strict
+ python tools/analysis_tools/analyze_logs.py plot_curve tools/logs/PartA2.log.json tools/logs/second.log.json --keys KITTI/Car_3D_moderate_strict --legend PartA2 second --mode eval --interval 1
+ # evaluate PointPillars for car and 3 classes on KITTI according to Car_3D_moderate_strict
+ python tools/analysis_tools/analyze_logs.py plot_curve tools/logs/pp-3class.log.json tools/logs/pp.log.json --keys KITTI/Car_3D_moderate_strict --legend pp-3class pp --mode eval --interval 2
+ ```
You can also compute the average training speed.
@@ -51,7 +51,7 @@ time std over epochs is 0.0028
average iter time: 1.1959 s/iter
```
-
+
# Visualization
@@ -126,7 +126,7 @@ python tools/misc/browse_dataset.py configs/_base_/datasets/nus-mono3d.py --task

-
+
# Model Serving
@@ -184,7 +184,7 @@ Example:
python tools/deployment/test_torchserver.py demo/data/kitti/kitti_000008.bin configs/second/hv_second_secfpn_6x8_80e_kitti-3d-car.py checkpoints/hv_second_secfpn_6x8_80e_kitti-3d-car_20200620_230238-393f000c.pth second
```
-
+
# Model Complexity
@@ -213,7 +213,7 @@ comparisons, but double check it before you adopt it in technical reports or pap
2. Some operators, such as GN and custom operators, are not counted in FLOPs. Refer to [`mmcv.cnn.get_model_complexity_info()`](https://github.com/open-mmlab/mmcv/blob/master/mmcv/cnn/utils/flops_counter.py) for details; a short usage sketch follows this list.
3. We currently only support FLOPs calculation of single-stage models with single-modality input (point cloud or image). We will support two-stage and multi-modality models in the future.
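+
+The underlying helper can also be called directly on any module. A minimal, self-contained sketch (the toy `nn.Sequential` below only stands in for a real detector):
+
+```python
+import torch.nn as nn
+from mmcv.cnn import get_model_complexity_info
+
+# toy model used purely for illustration
+model = nn.Sequential(
+    nn.Conv2d(3, 16, 3, padding=1),
+    nn.ReLU(),
+    nn.Conv2d(16, 32, 3, padding=1))
+flops, params = get_model_complexity_info(model, (3, 224, 224), print_per_layer_stat=False)
+print(flops, params)
+```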
-
+
# Model Conversion
@@ -258,7 +258,7 @@ python tools/model_converters/publish_model.py work_dirs/faster_rcnn/latest.pth
The final output filename will be `faster_rcnn_r50_fpn_1x_20190801-{hash id}.pth`.
-
+
# Dataset Conversion
@@ -271,15 +271,15 @@ python -u tools/data_converter/nuimage_converter.py --data-root ${DATA_ROOT} --v
--out-dir ${OUT_DIR} --nproc ${NUM_WORKERS} --extra-tag ${TAG}
```
-- `--data-root`: the root of the dataset, defaults to `./data/nuimages`.
-- `--version`: the version of the dataset, defaults to `v1.0-mini`. To get the full dataset, please use `--version v1.0-train v1.0-val v1.0-mini`
-- `--out-dir`: the output directory of annotations and semantic masks, defaults to `./data/nuimages/annotations/`.
-- `--nproc`: number of workers for data preparation, defaults to `4`. Larger number could reduce the preparation time as images are processed in parallel.
-- `--extra-tag`: extra tag of the annotations, defaults to `nuimages`. This can be used to separate different annotations processed in different time for study.
+- `--data-root`: the root of the dataset, defaults to `./data/nuimages`.
+- `--version`: the version of the dataset, defaults to `v1.0-mini`. To get the full dataset, please use `--version v1.0-train v1.0-val v1.0-mini`.
+- `--out-dir`: the output directory of annotations and semantic masks, defaults to `./data/nuimages/annotations/`.
+- `--nproc`: number of workers for data preparation, defaults to `4`. A larger number can reduce the preparation time as images are processed in parallel.
+- `--extra-tag`: extra tag of the annotations, defaults to `nuimages`. This can be used to separate annotations processed at different times for later study.
More details can be found in the [doc](https://mmdetection3d.readthedocs.io/en/latest/data_preparation.html) for dataset preparation and the [README](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/nuimages/README.md/) for the nuImages dataset.
-
+
# Miscellaneous
diff --git a/docs/zh_cn/1_exist_data_model.md b/docs/zh_cn/1_exist_data_model.md
index aad38540d4..97583539cb 100644
--- a/docs/zh_cn/1_exist_data_model.md
+++ b/docs/zh_cn/1_exist_data_model.md
@@ -32,6 +32,7 @@ python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [-
目前我们只支持 SMOKE 的 CPU 推理测试。
可选参数:
+
- `RESULT_FILE`:输出结果(pickle 格式)的文件名,如果未指定,结果不会被保存。
- `EVAL_METRICS`:在结果上评测的项,不同的数据集有不同的合法值。具体来说,我们默认对不同的数据集都使用各自的官方度量方法进行评测,所以对 nuScenes、Lyft、ScanNet 和 SUNRGBD 这些数据集来说在检测任务上可以简单设置为 `mAP`;对 KITTI 数据集来说,如果我们只想评测 2D 检测效果,可以将度量方法设置为 `img_bbox`;对于 Waymo 数据集,我们提供了 KITTI 风格(不稳定)和 Waymo 官方风格这两种评测方法,分别对应 `kitti` 和 `waymo`,我们推荐使用默认的官方度量方法,它的性能稳定而且可以与其它算法公平比较;同样地,对 S3DIS、ScanNet 这些数据集来说,在分割任务上的度量方法可以设置为 `mIoU`。
- `--show`:如果被指定,检测结果会在静默模式下被保存,用于调试和可视化,但只在单块GPU测试的情况下生效,和 `--show-dir` 搭配使用。
@@ -180,6 +181,7 @@ export CUDA_VISIBLE_DEVICES=-1
- `--options 'Key=value'`:覆盖使用的配置中的一些设定。
`resume-from` 和 `load-from` 的不同点:
+
- `resume-from` 加载模型权重和优化器状态,同时周期数也从特定的模型权重文件中继承,通常用于恢复偶然中断的训练过程。
- `load-from` 仅加载模型权重,训练周期从0开始,通常用于微调。
diff --git a/docs/zh_cn/2_new_data_model.md b/docs/zh_cn/2_new_data_model.md
index 1e3dfaed51..ef62e57313 100644
--- a/docs/zh_cn/2_new_data_model.md
+++ b/docs/zh_cn/2_new_data_model.md
@@ -72,7 +72,6 @@ KITTI 官方提供的目标检测开发[工具包](https://s3.eu-central-1.amazo
更多关于 Waymo 数据集预处理的中间结果的细节,请参照对应的[说明文档](https://mmdetection3d.readthedocs.io/zh_CN/latest/datasets/waymo_det.html)。
-
## 准备配置文件
第二步是准备配置文件来帮助数据集的读取和使用,另外,为了在 3D 检测中获得不错的性能,调整超参数通常是必要的。
diff --git a/docs/zh_cn/benchmarks.md b/docs/zh_cn/benchmarks.md
index ae45a824a4..e9f826ac17 100644
--- a/docs/zh_cn/benchmarks.md
+++ b/docs/zh_cn/benchmarks.md
@@ -1,285 +1,285 @@
-# 基准测试
-
-这里我们对 MMDetection3D 和其他开源 3D 目标检测代码库中模型的训练速度和测试速度进行了基准测试。
-
-## 配置
-
-* 硬件:8 NVIDIA Tesla V100 (32G) GPUs, Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
-* 软件:Python 3.7, CUDA 10.1, cuDNN 7.6.5, PyTorch 1.3, numba 0.48.0.
-* 模型:由于不同代码库所实现的模型种类有所不同,在基准测试中我们选择了 SECOND、PointPillars、Part-A2 和 VoteNet 几种模型,分别与其他代码库中的相应模型实现进行了对比。
-* 度量方法:我们使用整个训练过程中的平均吞吐量作为度量方法,并跳过每个 epoch 的前 50 次迭代以消除训练预热的影响。
-
-## 主要结果
-
-对于模型的训练速度(样本/秒),我们将 MMDetection3D 与其他实现了相同模型的代码库进行了对比。结果如下所示,表格内的数字越大,代表模型的训练速度越快。代码库中不支持的模型使用 `×` 进行标识。
-
-| 模型 | MMDetection3D | OpenPCDet | votenet | Det3D |
-| :-----------------: | :-----------: | :-------: | :-----: | :---: |
-| VoteNet | 358 | × | 77 | × |
-| PointPillars-car | 141 | × | × | 140 |
-| PointPillars-3class | 107 | 44 | × | × |
-| SECOND | 40 | 30 | × | × |
-| Part-A2 | 17 | 14 | × | × |
-
-## 测试细节
-
-### 为了计算速度所做的修改
-
-* __MMDetection3D__:我们尝试使用与其他代码库中尽可能相同的配置,具体配置细节见 [基准测试配置](https://github.com/open-mmlab/MMDetection3D/blob/master/configs/benchmark)。
-
-* __Det3D__:为了与 Det3D 进行比较,我们使用了 commit [519251e](https://github.com/poodarchu/Det3D/tree/519251e72a5c1fdd58972eabeac67808676b9bb7) 所对应的代码版本。
-
-* __OpenPCDet__:为了与 OpenPCDet 进行比较,我们使用了 commit [b32fbddb](https://github.com/open-mmlab/OpenPCDet/tree/b32fbddbe06183507bad433ed99b407cbc2175c2) 所对应的代码版本。
-
- 为了计算训练速度,我们在 `./tools/train_utils/train_utils.py` 文件中添加了用于记录运行时间的代码。我们对每个 epoch 的训练速度进行计算,并报告所有 epoch 的平均速度。
-
-
- (为了使用相同方法进行测试所做的具体修改 - 点击展开)
-
-
- ```diff
- diff --git a/tools/train_utils/train_utils.py b/tools/train_utils/train_utils.py
- index 91f21dd..021359d 100644
- --- a/tools/train_utils/train_utils.py
- +++ b/tools/train_utils/train_utils.py
- @@ -2,6 +2,7 @@ import torch
- import os
- import glob
- import tqdm
- +import datetime
- from torch.nn.utils import clip_grad_norm_
-
-
- @@ -13,7 +14,10 @@ def train_one_epoch(model, optimizer, train_loader, model_func, lr_scheduler, ac
- if rank == 0:
- pbar = tqdm.tqdm(total=total_it_each_epoch, leave=leave_pbar, desc='train', dynamic_ncols=True)
-
- + start_time = None
- for cur_it in range(total_it_each_epoch):
- + if cur_it > 49 and start_time is None:
- + start_time = datetime.datetime.now()
- try:
- batch = next(dataloader_iter)
- except StopIteration:
- @@ -55,9 +59,11 @@ def train_one_epoch(model, optimizer, train_loader, model_func, lr_scheduler, ac
- tb_log.add_scalar('learning_rate', cur_lr, accumulated_iter)
- for key, val in tb_dict.items():
- tb_log.add_scalar('train_' + key, val, accumulated_iter)
- + endtime = datetime.datetime.now()
- + speed = (endtime - start_time).seconds / (total_it_each_epoch - 50)
- if rank == 0:
- pbar.close()
- - return accumulated_iter
- + return accumulated_iter, speed
-
-
- def train_model(model, optimizer, train_loader, model_func, lr_scheduler, optim_cfg,
- @@ -65,6 +71,7 @@ def train_model(model, optimizer, train_loader, model_func, lr_scheduler, optim_
- lr_warmup_scheduler=None, ckpt_save_interval=1, max_ckpt_save_num=50,
- merge_all_iters_to_one_epoch=False):
- accumulated_iter = start_iter
- + speeds = []
- with tqdm.trange(start_epoch, total_epochs, desc='epochs', dynamic_ncols=True, leave=(rank == 0)) as tbar:
- total_it_each_epoch = len(train_loader)
- if merge_all_iters_to_one_epoch:
- @@ -82,7 +89,7 @@ def train_model(model, optimizer, train_loader, model_func, lr_scheduler, optim_
- cur_scheduler = lr_warmup_scheduler
- else:
- cur_scheduler = lr_scheduler
- - accumulated_iter = train_one_epoch(
- + accumulated_iter, speed = train_one_epoch(
- model, optimizer, train_loader, model_func,
- lr_scheduler=cur_scheduler,
- accumulated_iter=accumulated_iter, optim_cfg=optim_cfg,
- @@ -91,7 +98,7 @@ def train_model(model, optimizer, train_loader, model_func, lr_scheduler, optim_
- total_it_each_epoch=total_it_each_epoch,
- dataloader_iter=dataloader_iter
- )
- -
- + speeds.append(speed)
- # save trained model
- trained_epoch = cur_epoch + 1
- if trained_epoch % ckpt_save_interval == 0 and rank == 0:
- @@ -107,6 +114,8 @@ def train_model(model, optimizer, train_loader, model_func, lr_scheduler, optim_
- save_checkpoint(
- checkpoint_state(model, optimizer, trained_epoch, accumulated_iter), filename=ckpt_name,
- )
- + print(speed)
- + print(f'*******{sum(speeds) / len(speeds)}******')
-
-
- def model_state_to_cpu(model_state):
- ```
-
-
-
-### VoteNet
-
-* __MMDetection3D__:在 v0.1.0 版本下, 执行如下命令:
-
- ```bash
- ./tools/dist_train.sh configs/votenet/votenet_16x8_sunrgbd-3d-10class.py 8 --no-validate
- ```
-
-* __votenet__:在 commit [2f6d6d3](https://github.com/facebookresearch/votenet/tree/2f6d6d36ff98d96901182e935afe48ccee82d566) 版本下,执行如下命令:
-
- ```bash
- python train.py --dataset sunrgbd --batch_size 16
- ```
-
-
- 然后执行如下命令,对测试速度进行评估:
-
- ```bash
- python eval.py --dataset sunrgbd --checkpoint_path log_sunrgbd/checkpoint.tar --batch_size 1 --dump_dir eval_sunrgbd --cluster_sampling seed_fps --use_3d_nms --use_cls_nms --per_class_proposal
- ```
-
- 注意,为了计算推理速度,我们对 `eval.py` 进行了修改。
-
-
-
- (为了对相同模型进行测试所做的具体修改 - 点击展开)
-
-
- ```diff
- diff --git a/eval.py b/eval.py
- index c0b2886..04921e9 100644
- --- a/eval.py
- +++ b/eval.py
- @@ -10,6 +10,7 @@ import os
- import sys
- import numpy as np
- from datetime import datetime
- +import time
- import argparse
- import importlib
- import torch
- @@ -28,7 +29,7 @@ parser.add_argument('--checkpoint_path', default=None, help='Model checkpoint pa
- parser.add_argument('--dump_dir', default=None, help='Dump dir to save sample outputs [default: None]')
- parser.add_argument('--num_point', type=int, default=20000, help='Point Number [default: 20000]')
- parser.add_argument('--num_target', type=int, default=256, help='Point Number [default: 256]')
- -parser.add_argument('--batch_size', type=int, default=8, help='Batch Size during training [default: 8]')
- +parser.add_argument('--batch_size', type=int, default=1, help='Batch Size during training [default: 8]')
- parser.add_argument('--vote_factor', type=int, default=1, help='Number of votes generated from each seed [default: 1]')
- parser.add_argument('--cluster_sampling', default='vote_fps', help='Sampling strategy for vote clusters: vote_fps, seed_fps, random [default: vote_fps]')
- parser.add_argument('--ap_iou_thresholds', default='0.25,0.5', help='A list of AP IoU thresholds [default: 0.25,0.5]')
- @@ -132,6 +133,7 @@ CONFIG_DICT = {'remove_empty_box': (not FLAGS.faster_eval), 'use_3d_nms': FLAGS.
- # ------------------------------------------------------------------------- GLOBAL CONFIG END
-
- def evaluate_one_epoch():
- + time_list = list()
- stat_dict = {}
- ap_calculator_list = [APCalculator(iou_thresh, DATASET_CONFIG.class2type) \
- for iou_thresh in AP_IOU_THRESHOLDS]
- @@ -144,6 +146,8 @@ def evaluate_one_epoch():
-
- # Forward pass
- inputs = {'point_clouds': batch_data_label['point_clouds']}
- + torch.cuda.synchronize()
- + start_time = time.perf_counter()
- with torch.no_grad():
- end_points = net(inputs)
-
- @@ -161,6 +165,12 @@ def evaluate_one_epoch():
-
- batch_pred_map_cls = parse_predictions(end_points, CONFIG_DICT)
- batch_gt_map_cls = parse_groundtruths(end_points, CONFIG_DICT)
- + torch.cuda.synchronize()
- + elapsed = time.perf_counter() - start_time
- + time_list.append(elapsed)
- +
- + if len(time_list==200):
- + print("average inference time: %4f"%(sum(time_list[5:])/len(time_list[5:])))
- for ap_calculator in ap_calculator_list:
- ap_calculator.step(batch_pred_map_cls, batch_gt_map_cls)
-
- ```
-
-### PointPillars-car
-
-* __MMDetection3D__:在 v0.1.0 版本下, 执行如下命令:
-
- ```bash
- ./tools/dist_train.sh configs/benchmark/hv_pointpillars_secfpn_3x8_100e_det3d_kitti-3d-car.py 8 --no-validate
- ```
-
-* __Det3D__:在 commit [519251e](https://github.com/poodarchu/Det3D/tree/519251e72a5c1fdd58972eabeac67808676b9bb7) 版本下,使用 `kitti_point_pillars_mghead_syncbn.py` 并执行如下命令:
-
- ```bash
- ./tools/scripts/train.sh --launcher=slurm --gpus=8
- ```
-
- 注意,为了训练 PointPillars,我们对 `train.sh` 进行了修改。
-
-
-
- (为了对相同模型进行测试所做的具体修改 - 点击展开)
-
-
- ```diff
- diff --git a/tools/scripts/train.sh b/tools/scripts/train.sh
- index 3a93f95..461e0ea 100755
- --- a/tools/scripts/train.sh
- +++ b/tools/scripts/train.sh
- @@ -16,9 +16,9 @@ then
- fi
-
- # Voxelnet
- -python -m torch.distributed.launch --nproc_per_node=8 ./tools/train.py examples/second/configs/ kitti_car_vfev3_spmiddlefhd_rpn1_mghead_syncbn.py --work_dir=$SECOND_WORK_DIR
- +# python -m torch.distributed.launch --nproc_per_node=8 ./tools/train.py examples/second/configs/ kitti_car_vfev3_spmiddlefhd_rpn1_mghead_syncbn.py --work_dir=$SECOND_WORK_DIR
- # python -m torch.distributed.launch --nproc_per_node=8 ./tools/train.py examples/cbgs/configs/ nusc_all_vfev3_spmiddleresnetfhd_rpn2_mghead_syncbn.py --work_dir=$NUSC_CBGS_WORK_DIR
- # python -m torch.distributed.launch --nproc_per_node=8 ./tools/train.py examples/second/configs/ lyft_all_vfev3_spmiddleresnetfhd_rpn2_mghead_syncbn.py --work_dir=$LYFT_CBGS_WORK_DIR
-
- # PointPillars
- -# python -m torch.distributed.launch --nproc_per_node=8 ./tools/train.py ./examples/point_pillars/configs/ original_pp_mghead_syncbn_kitti.py --work_dir=$PP_WORK_DIR
- +python -m torch.distributed.launch --nproc_per_node=8 ./tools/train.py ./examples/point_pillars/configs/ kitti_point_pillars_mghead_syncbn.py
- ```
-
-
-
-### PointPillars-3class
-
-* __MMDetection3D__:在 v0.1.0 版本下, 执行如下命令:
-
- ```bash
- ./tools/dist_train.sh configs/benchmark/hv_pointpillars_secfpn_4x8_80e_pcdet_kitti-3d-3class.py 8 --no-validate
- ```
-
-* __OpenPCDet__:在 commit [b32fbddb](https://github.com/open-mmlab/OpenPCDet/tree/b32fbddbe06183507bad433ed99b407cbc2175c2) 版本下,执行如下命令:
-
- ```bash
- cd tools
- sh scripts/slurm_train.sh ${PARTITION} ${JOB_NAME} 8 --cfg_file ./cfgs/kitti_models/pointpillar.yaml --batch_size 32 --workers 32 --epochs 80
- ```
-
-### SECOND
-
-基准测试中的 SECOND 指在 [second.Pytorch](https://github.com/traveller59/second.pytorch) 首次被实现的 [SECONDv1.5](https://github.com/traveller59/second.pytorch/blob/master/second/configs/all.fhd.config)。Det3D 实现的 SECOND 中,使用了自己实现的 Multi-Group Head,因此无法将它的速度与其他代码库进行对比。
-
-* __MMDetection3D__:在 v0.1.0 版本下, 执行如下命令:
-
- ```bash
- ./tools/dist_train.sh configs/benchmark/hv_second_secfpn_4x8_80e_pcdet_kitti-3d-3class.py 8 --no-validate
- ```
-
-* __OpenPCDet__:在 commit [b32fbddb](https://github.com/open-mmlab/OpenPCDet/tree/b32fbddbe06183507bad433ed99b407cbc2175c2) 版本下,执行如下命令:
-
- ```bash
- cd tools
- sh ./scripts/slurm_train.sh ${PARTITION} ${JOB_NAME} 8 --cfg_file ./cfgs/kitti_models/second.yaml --batch_size 32 --workers 32 --epochs 80
- ```
-
-### Part-A2
-
-* __MMDetection3D__:在 v0.1.0 版本下, 执行如下命令:
-
- ```bash
- ./tools/dist_train.sh configs/benchmark/hv_PartA2_secfpn_4x8_cyclic_80e_pcdet_kitti-3d-3class.py 8 --no-validate
- ```
-
-* __OpenPCDet__:在 commit [b32fbddb](https://github.com/open-mmlab/OpenPCDet/tree/b32fbddbe06183507bad433ed99b407cbc2175c2) 版本下,执行如下命令以进行模型训练:
-
- ```bash
- cd tools
- sh ./scripts/slurm_train.sh ${PARTITION} ${JOB_NAME} 8 --cfg_file ./cfgs/kitti_models/PartA2.yaml --batch_size 32 --workers 32 --epochs 80
- ```
\ No newline at end of file
+# 基准测试
+
+这里我们对 MMDetection3D 和其他开源 3D 目标检测代码库中模型的训练速度和测试速度进行了基准测试。
+
+## 配置
+
+- 硬件:8 NVIDIA Tesla V100 (32G) GPUs, Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
+- 软件:Python 3.7, CUDA 10.1, cuDNN 7.6.5, PyTorch 1.3, numba 0.48.0.
+- 模型:由于不同代码库所实现的模型种类有所不同,在基准测试中我们选择了 SECOND、PointPillars、Part-A2 和 VoteNet 几种模型,分别与其他代码库中的相应模型实现进行了对比。
+- 度量方法:我们使用整个训练过程中的平均吞吐量作为度量方法,并跳过每个 epoch 的前 50 次迭代以消除训练预热的影响。
+
+## 主要结果
+
+对于模型的训练速度(样本/秒),我们将 MMDetection3D 与其他实现了相同模型的代码库进行了对比。结果如下所示,表格内的数字越大,代表模型的训练速度越快。代码库中不支持的模型使用 `×` 进行标识。
+
+| 模型 | MMDetection3D | OpenPCDet | votenet | Det3D |
+| :-----------------: | :-----------: | :-------: | :-----: | :---: |
+| VoteNet | 358 | × | 77 | × |
+| PointPillars-car | 141 | × | × | 140 |
+| PointPillars-3class | 107 | 44 | × | × |
+| SECOND | 40 | 30 | × | × |
+| Part-A2 | 17 | 14 | × | × |
+
+## 测试细节
+
+### 为了计算速度所做的修改
+
+- __MMDetection3D__:我们尝试使用与其他代码库中尽可能相同的配置,具体配置细节见 [基准测试配置](https://github.com/open-mmlab/MMDetection3D/blob/master/configs/benchmark)。
+
+- __Det3D__:为了与 Det3D 进行比较,我们使用了 commit [519251e](https://github.com/poodarchu/Det3D/tree/519251e72a5c1fdd58972eabeac67808676b9bb7) 所对应的代码版本。
+
+- __OpenPCDet__:为了与 OpenPCDet 进行比较,我们使用了 commit [b32fbddb](https://github.com/open-mmlab/OpenPCDet/tree/b32fbddbe06183507bad433ed99b407cbc2175c2) 所对应的代码版本。
+
+ 为了计算训练速度,我们在 `./tools/train_utils/train_utils.py` 文件中添加了用于记录运行时间的代码。我们对每个 epoch 的训练速度进行计算,并报告所有 epoch 的平均速度。
+
+
+
+ (为了使用相同方法进行测试所做的具体修改 - 点击展开)
+
+
+ ```diff
+ diff --git a/tools/train_utils/train_utils.py b/tools/train_utils/train_utils.py
+ index 91f21dd..021359d 100644
+ --- a/tools/train_utils/train_utils.py
+ +++ b/tools/train_utils/train_utils.py
+ @@ -2,6 +2,7 @@ import torch
+ import os
+ import glob
+ import tqdm
+ +import datetime
+ from torch.nn.utils import clip_grad_norm_
+
+
+ @@ -13,7 +14,10 @@ def train_one_epoch(model, optimizer, train_loader, model_func, lr_scheduler, ac
+ if rank == 0:
+ pbar = tqdm.tqdm(total=total_it_each_epoch, leave=leave_pbar, desc='train', dynamic_ncols=True)
+
+ + start_time = None
+ for cur_it in range(total_it_each_epoch):
+ + if cur_it > 49 and start_time is None:
+ + start_time = datetime.datetime.now()
+ try:
+ batch = next(dataloader_iter)
+ except StopIteration:
+ @@ -55,9 +59,11 @@ def train_one_epoch(model, optimizer, train_loader, model_func, lr_scheduler, ac
+ tb_log.add_scalar('learning_rate', cur_lr, accumulated_iter)
+ for key, val in tb_dict.items():
+ tb_log.add_scalar('train_' + key, val, accumulated_iter)
+ + endtime = datetime.datetime.now()
+ + speed = (endtime - start_time).seconds / (total_it_each_epoch - 50)
+ if rank == 0:
+ pbar.close()
+ - return accumulated_iter
+ + return accumulated_iter, speed
+
+
+ def train_model(model, optimizer, train_loader, model_func, lr_scheduler, optim_cfg,
+ @@ -65,6 +71,7 @@ def train_model(model, optimizer, train_loader, model_func, lr_scheduler, optim_
+ lr_warmup_scheduler=None, ckpt_save_interval=1, max_ckpt_save_num=50,
+ merge_all_iters_to_one_epoch=False):
+ accumulated_iter = start_iter
+ + speeds = []
+ with tqdm.trange(start_epoch, total_epochs, desc='epochs', dynamic_ncols=True, leave=(rank == 0)) as tbar:
+ total_it_each_epoch = len(train_loader)
+ if merge_all_iters_to_one_epoch:
+ @@ -82,7 +89,7 @@ def train_model(model, optimizer, train_loader, model_func, lr_scheduler, optim_
+ cur_scheduler = lr_warmup_scheduler
+ else:
+ cur_scheduler = lr_scheduler
+ - accumulated_iter = train_one_epoch(
+ + accumulated_iter, speed = train_one_epoch(
+ model, optimizer, train_loader, model_func,
+ lr_scheduler=cur_scheduler,
+ accumulated_iter=accumulated_iter, optim_cfg=optim_cfg,
+ @@ -91,7 +98,7 @@ def train_model(model, optimizer, train_loader, model_func, lr_scheduler, optim_
+ total_it_each_epoch=total_it_each_epoch,
+ dataloader_iter=dataloader_iter
+ )
+ -
+ + speeds.append(speed)
+ # save trained model
+ trained_epoch = cur_epoch + 1
+ if trained_epoch % ckpt_save_interval == 0 and rank == 0:
+ @@ -107,6 +114,8 @@ def train_model(model, optimizer, train_loader, model_func, lr_scheduler, optim_
+ save_checkpoint(
+ checkpoint_state(model, optimizer, trained_epoch, accumulated_iter), filename=ckpt_name,
+ )
+ + print(speed)
+ + print(f'*******{sum(speeds) / len(speeds)}******')
+
+
+ def model_state_to_cpu(model_state):
+ ```
+
+
+
+### VoteNet
+
+- __MMDetection3D__:在 v0.1.0 版本下, 执行如下命令:
+
+ ```bash
+ ./tools/dist_train.sh configs/votenet/votenet_16x8_sunrgbd-3d-10class.py 8 --no-validate
+ ```
+
+- __votenet__:在 commit [2f6d6d3](https://github.com/facebookresearch/votenet/tree/2f6d6d36ff98d96901182e935afe48ccee82d566) 版本下,执行如下命令:
+
+ ```bash
+ python train.py --dataset sunrgbd --batch_size 16
+ ```
+
+ 然后执行如下命令,对测试速度进行评估:
+
+ ```bash
+ python eval.py --dataset sunrgbd --checkpoint_path log_sunrgbd/checkpoint.tar --batch_size 1 --dump_dir eval_sunrgbd --cluster_sampling seed_fps --use_3d_nms --use_cls_nms --per_class_proposal
+ ```
+
+ 注意,为了计算推理速度,我们对 `eval.py` 进行了修改。
+
+
+
+ (为了对相同模型进行测试所做的具体修改 - 点击展开)
+
+
+ ```diff
+ diff --git a/eval.py b/eval.py
+ index c0b2886..04921e9 100644
+ --- a/eval.py
+ +++ b/eval.py
+ @@ -10,6 +10,7 @@ import os
+ import sys
+ import numpy as np
+ from datetime import datetime
+ +import time
+ import argparse
+ import importlib
+ import torch
+ @@ -28,7 +29,7 @@ parser.add_argument('--checkpoint_path', default=None, help='Model checkpoint pa
+ parser.add_argument('--dump_dir', default=None, help='Dump dir to save sample outputs [default: None]')
+ parser.add_argument('--num_point', type=int, default=20000, help='Point Number [default: 20000]')
+ parser.add_argument('--num_target', type=int, default=256, help='Point Number [default: 256]')
+ -parser.add_argument('--batch_size', type=int, default=8, help='Batch Size during training [default: 8]')
+ +parser.add_argument('--batch_size', type=int, default=1, help='Batch Size during training [default: 8]')
+ parser.add_argument('--vote_factor', type=int, default=1, help='Number of votes generated from each seed [default: 1]')
+ parser.add_argument('--cluster_sampling', default='vote_fps', help='Sampling strategy for vote clusters: vote_fps, seed_fps, random [default: vote_fps]')
+ parser.add_argument('--ap_iou_thresholds', default='0.25,0.5', help='A list of AP IoU thresholds [default: 0.25,0.5]')
+ @@ -132,6 +133,7 @@ CONFIG_DICT = {'remove_empty_box': (not FLAGS.faster_eval), 'use_3d_nms': FLAGS.
+ # ------------------------------------------------------------------------- GLOBAL CONFIG END
+
+ def evaluate_one_epoch():
+ + time_list = list()
+ stat_dict = {}
+ ap_calculator_list = [APCalculator(iou_thresh, DATASET_CONFIG.class2type) \
+ for iou_thresh in AP_IOU_THRESHOLDS]
+ @@ -144,6 +146,8 @@ def evaluate_one_epoch():
+
+ # Forward pass
+ inputs = {'point_clouds': batch_data_label['point_clouds']}
+ + torch.cuda.synchronize()
+ + start_time = time.perf_counter()
+ with torch.no_grad():
+ end_points = net(inputs)
+
+ @@ -161,6 +165,12 @@ def evaluate_one_epoch():
+
+ batch_pred_map_cls = parse_predictions(end_points, CONFIG_DICT)
+ batch_gt_map_cls = parse_groundtruths(end_points, CONFIG_DICT)
+ + torch.cuda.synchronize()
+ + elapsed = time.perf_counter() - start_time
+ + time_list.append(elapsed)
+ +
+ + if len(time_list==200):
+ + print("average inference time: %4f"%(sum(time_list[5:])/len(time_list[5:])))
+ for ap_calculator in ap_calculator_list:
+ ap_calculator.step(batch_pred_map_cls, batch_gt_map_cls)
+
+ ```
+
+### PointPillars-car
+
+- __MMDetection3D__:在 v0.1.0 版本下, 执行如下命令:
+
+ ```bash
+ ./tools/dist_train.sh configs/benchmark/hv_pointpillars_secfpn_3x8_100e_det3d_kitti-3d-car.py 8 --no-validate
+ ```
+
+- __Det3D__:在 commit [519251e](https://github.com/poodarchu/Det3D/tree/519251e72a5c1fdd58972eabeac67808676b9bb7) 版本下,使用 `kitti_point_pillars_mghead_syncbn.py` 并执行如下命令:
+
+ ```bash
+ ./tools/scripts/train.sh --launcher=slurm --gpus=8
+ ```
+
+ 注意,为了训练 PointPillars,我们对 `train.sh` 进行了修改。
+
+
+
+ (为了对相同模型进行测试所做的具体修改 - 点击展开)
+
+
+ ```diff
+ diff --git a/tools/scripts/train.sh b/tools/scripts/train.sh
+ index 3a93f95..461e0ea 100755
+ --- a/tools/scripts/train.sh
+ +++ b/tools/scripts/train.sh
+ @@ -16,9 +16,9 @@ then
+ fi
+
+ # Voxelnet
+ -python -m torch.distributed.launch --nproc_per_node=8 ./tools/train.py examples/second/configs/ kitti_car_vfev3_spmiddlefhd_rpn1_mghead_syncbn.py --work_dir=$SECOND_WORK_DIR
+ +# python -m torch.distributed.launch --nproc_per_node=8 ./tools/train.py examples/second/configs/ kitti_car_vfev3_spmiddlefhd_rpn1_mghead_syncbn.py --work_dir=$SECOND_WORK_DIR
+ # python -m torch.distributed.launch --nproc_per_node=8 ./tools/train.py examples/cbgs/configs/ nusc_all_vfev3_spmiddleresnetfhd_rpn2_mghead_syncbn.py --work_dir=$NUSC_CBGS_WORK_DIR
+ # python -m torch.distributed.launch --nproc_per_node=8 ./tools/train.py examples/second/configs/ lyft_all_vfev3_spmiddleresnetfhd_rpn2_mghead_syncbn.py --work_dir=$LYFT_CBGS_WORK_DIR
+
+ # PointPillars
+ -# python -m torch.distributed.launch --nproc_per_node=8 ./tools/train.py ./examples/point_pillars/configs/ original_pp_mghead_syncbn_kitti.py --work_dir=$PP_WORK_DIR
+ +python -m torch.distributed.launch --nproc_per_node=8 ./tools/train.py ./examples/point_pillars/configs/ kitti_point_pillars_mghead_syncbn.py
+ ```
+
+
+
+### PointPillars-3class
+
+- __MMDetection3D__:在 v0.1.0 版本下, 执行如下命令:
+
+ ```bash
+ ./tools/dist_train.sh configs/benchmark/hv_pointpillars_secfpn_4x8_80e_pcdet_kitti-3d-3class.py 8 --no-validate
+ ```
+
+- __OpenPCDet__:在 commit [b32fbddb](https://github.com/open-mmlab/OpenPCDet/tree/b32fbddbe06183507bad433ed99b407cbc2175c2) 版本下,执行如下命令:
+
+ ```bash
+ cd tools
+ sh scripts/slurm_train.sh ${PARTITION} ${JOB_NAME} 8 --cfg_file ./cfgs/kitti_models/pointpillar.yaml --batch_size 32 --workers 32 --epochs 80
+ ```
+
+### SECOND
+
+基准测试中的 SECOND 指在 [second.Pytorch](https://github.com/traveller59/second.pytorch) 首次被实现的 [SECONDv1.5](https://github.com/traveller59/second.pytorch/blob/master/second/configs/all.fhd.config)。Det3D 实现的 SECOND 中,使用了自己实现的 Multi-Group Head,因此无法将它的速度与其他代码库进行对比。
+
+- __MMDetection3D__:在 v0.1.0 版本下, 执行如下命令:
+
+ ```bash
+ ./tools/dist_train.sh configs/benchmark/hv_second_secfpn_4x8_80e_pcdet_kitti-3d-3class.py 8 --no-validate
+ ```
+
+- __OpenPCDet__:在 commit [b32fbddb](https://github.com/open-mmlab/OpenPCDet/tree/b32fbddbe06183507bad433ed99b407cbc2175c2) 版本下,执行如下命令:
+
+ ```bash
+ cd tools
+ sh ./scripts/slurm_train.sh ${PARTITION} ${JOB_NAME} 8 --cfg_file ./cfgs/kitti_models/second.yaml --batch_size 32 --workers 32 --epochs 80
+ ```
+
+### Part-A2
+
+- __MMDetection3D__:在 v0.1.0 版本下, 执行如下命令:
+
+ ```bash
+ ./tools/dist_train.sh configs/benchmark/hv_PartA2_secfpn_4x8_cyclic_80e_pcdet_kitti-3d-3class.py 8 --no-validate
+ ```
+
+- __OpenPCDet__:在 commit [b32fbddb](https://github.com/open-mmlab/OpenPCDet/tree/b32fbddbe06183507bad433ed99b407cbc2175c2) 版本下,执行如下命令以进行模型训练:
+
+ ```bash
+ cd tools
+ sh ./scripts/slurm_train.sh ${PARTITION} ${JOB_NAME} 8 --cfg_file ./cfgs/kitti_models/PartA2.yaml --batch_size 32 --workers 32 --epochs 80
+ ```
diff --git a/docs/zh_cn/datasets/kitti_det.md b/docs/zh_cn/datasets/kitti_det.md
index 6ab07696aa..01a2421fe4 100644
--- a/docs/zh_cn/datasets/kitti_det.md
+++ b/docs/zh_cn/datasets/kitti_det.md
@@ -88,27 +88,27 @@ kitti
- `kitti_gt_database/xxxxx.bin`: 训练数据集中包含在 3D 标注框中的点云数据
- `kitti_infos_train.pkl`:训练数据集的信息,其中每一帧的信息包含下面的内容:
- - info['point_cloud']: {'num_features': 4, 'velodyne_path': velodyne_path}.
- - info['annos']: {
- - 位置:其中 x,y,z 为相机参考坐标系下的目标的底部中心(单位为米),是一个尺寸为 Nx3 的数组
- - 维度: 目标的高、宽、长(单位为米),是一个尺寸为 Nx3 的数组
- - 旋转角:相机坐标系下目标绕着 Y 轴的旋转角 ry,其取值范围为 [-pi..pi] ,是一个尺寸为 N 的数组
- - 名称:标准框所包含的目标的名称,是一个尺寸为 N 的数组
- - 困难度:kitti 官方所定义的困难度,包括 简单,适中,困难
- - 组别标识符:用于多部件的目标
- }
- - (optional) info['calib']: {
- - P0:校对后的 camera0 投影矩阵,是一个 3x4 数组
- - P1:校对后的 camera1 投影矩阵,是一个 3x4 数组
- - P2:校对后的 camera2 投影矩阵,是一个 3x4 数组
- - P3:校对后的 camera3 投影矩阵,是一个 3x4 数组
- - R0_rect:校准旋转矩阵,是一个 4x4 数组
- - Tr_velo_to_cam:从 Velodyne 坐标到相机坐标的变换矩阵,是一个 4x4 数组
- - Tr_imu_to_velo:从 IMU 坐标到 Velodyne 坐标的变换矩阵,是一个 4x4 数组
- }
- - (optional) info['image']:{'image_idx': idx, 'image_path': image_path, 'image_shape', image_shape}.
-
-**注意**:其中的 info['annos'] 中的数据均位于相机参考坐标系中,更多的细节请参考[此处](http://www.cvlibs.net/publications/Geiger2013IJRR.pdf)。
+ - info\['point_cloud'\]: {'num_features': 4, 'velodyne_path': velodyne_path}.
+ - info\['annos'\]: {
+ - 位置:其中 x,y,z 为相机参考坐标系下的目标的底部中心(单位为米),是一个尺寸为 Nx3 的数组
+ - 维度: 目标的高、宽、长(单位为米),是一个尺寸为 Nx3 的数组
+ - 旋转角:相机坐标系下目标绕着 Y 轴的旋转角 ry,其取值范围为 \[-pi..pi\] ,是一个尺寸为 N 的数组
+    - 名称:标注框所包含的目标的名称,是一个尺寸为 N 的数组
+ - 困难度:kitti 官方所定义的困难度,包括 简单,适中,困难
+ - 组别标识符:用于多部件的目标
+ }
+ - (optional) info\['calib'\]: {
+ - P0:校对后的 camera0 投影矩阵,是一个 3x4 数组
+ - P1:校对后的 camera1 投影矩阵,是一个 3x4 数组
+ - P2:校对后的 camera2 投影矩阵,是一个 3x4 数组
+ - P3:校对后的 camera3 投影矩阵,是一个 3x4 数组
+ - R0_rect:校准旋转矩阵,是一个 4x4 数组
+ - Tr_velo_to_cam:从 Velodyne 坐标到相机坐标的变换矩阵,是一个 4x4 数组
+ - Tr_imu_to_velo:从 IMU 坐标到 Velodyne 坐标的变换矩阵,是一个 4x4 数组
+ }
+ - (optional) info\['image'\]:{'image_idx': idx, 'image_path': image_path, 'image_shape', image_shape}.
+
+**注意**:其中的 info\['annos'\] 中的数据均位于相机参考坐标系中,更多的细节请参考[此处](http://www.cvlibs.net/publications/Geiger2013IJRR.pdf)。
获取 kitti_infos_xxx.pkl 和 kitti_infos_xxx_mono3d.coco.json 的核心函数分别为 [get_kitti_image_info](https://github.com/open-mmlab/mmdetection3d/blob/7873c8f62b99314f35079f369d1dab8d63f8a3ce/tools/data_converter/kitti_data_utils.py#L140) 和 [get_2d_boxes](https://github.com/open-mmlab/mmdetection3d/blob/7873c8f62b99314f35079f369d1dab8d63f8a3ce/tools/data_converter/kitti_converter.py#L378).
@@ -150,9 +150,9 @@ train_pipeline = [
```
- 数据增强:
- - `ObjectNoise`:对场景中的每个真实标注框目标添加噪音。
- - `RandomFlip3D`:对输入点云数据进行随机地水平翻转或者垂直翻转。
- - `GlobalRotScaleTrans`:对输入点云数据进行旋转。
+ - `ObjectNoise`:对场景中的每个真实标注框目标添加噪音。
+ - `RandomFlip3D`:对输入点云数据进行随机地水平翻转或者垂直翻转。
+ - `GlobalRotScaleTrans`:对输入点云数据进行旋转。
## 评估
@@ -191,4 +191,4 @@ mkdir -p results/kitti-3class
./tools/dist_test.sh configs/pointpillars/hv_pointpillars_secfpn_6x8_160e_kitti-3d-3class.py work_dirs/hv_pointpillars_secfpn_6x8_160e_kitti-3d-3class/latest.pth 8 --out results/kitti-3class/results_eval.pkl --format-only --eval-options 'pklfile_prefix=results/kitti-3class/kitti_results' 'submission_prefix=results/kitti-3class/kitti_results'
```
-在生成 `results/kitti-3class/kitti_results/xxxxx.txt` 后,您可以提交这些文件到 KITTI 官方网站进行基准测试,请参考 [KITTI 官方网站]((http://www.cvlibs.net/datasets/kitti/index.php))获取更多细节。
+在生成 `results/kitti-3class/kitti_results/xxxxx.txt` 后,您可以提交这些文件到 KITTI 官方网站进行基准测试,请参考 [KITTI 官方网站](http://www.cvlibs.net/datasets/kitti/index.php)获取更多细节。
diff --git a/docs/zh_cn/datasets/lyft_det.md b/docs/zh_cn/datasets/lyft_det.md
index f2693798dc..f02e792825 100644
--- a/docs/zh_cn/datasets/lyft_det.md
+++ b/docs/zh_cn/datasets/lyft_det.md
@@ -89,21 +89,21 @@ mmdetection3d
- `lyft_database/xxxxx.bin` 文件不存在:由于真实标注框的采样对实验的影响可以忽略不计,在 Lyft 数据集中不会提取该目录和相关的 `.bin` 文件。
- `lyft_infos_train.pkl`:包含训练数据集信息,每一帧包含两个关键字:`metadata` 和 `infos`。
-`metadata` 包含数据集自身的基础信息,如 `{'version': 'v1.01-train'}`,然而 `infos` 包含和 nuScenes 数据集相似的数据集详细信息,但是并不包含一下几点:
- - info['sweeps']:扫描信息.
- - info['sweeps'][i]['type']:扫描信息的数据类型,如 `'lidar'`。
- Lyft 数据集中的一些样例具有不同的 LiDAR 设置,然而为了数据分布的一致性,这里将一直采用顶部的 LiDAR 设备所采集的数据点信息。
- - info['gt_names']:在 Lyft 数据集中有 9 个类别,相比于 nuScenes 数据集,不同类别的标注不平衡问题更加突出。
- - info['gt_velocity'] 不存在:Lyft 数据集中不存在速度评估信息。
- - info['num_lidar_pts']:默认值设置为 -1。
- - info['num_radar_pts']:默认值设置为 0。
- - info['valid_flag'] 不存在:这个标志信息因无效的 `num_lidar_pts` 和 `num_radar_pts` 的存在而存在。
+  `metadata` 包含数据集自身的基础信息,如 `{'version': 'v1.01-train'}`,然而 `infos` 包含和 nuScenes 数据集相似的数据集详细信息,但是并不包含以下几点:
+ - info\['sweeps'\]:扫描信息.
+ - info\['sweeps'\]\[i\]\['type'\]:扫描信息的数据类型,如 `'lidar'`。
+ Lyft 数据集中的一些样例具有不同的 LiDAR 设置,然而为了数据分布的一致性,这里将一直采用顶部的 LiDAR 设备所采集的数据点信息。
+ - info\['gt_names'\]:在 Lyft 数据集中有 9 个类别,相比于 nuScenes 数据集,不同类别的标注不平衡问题更加突出。
+ - info\['gt_velocity'\] 不存在:Lyft 数据集中不存在速度评估信息。
+ - info\['num_lidar_pts'\]:默认值设置为 -1。
+ - info\['num_radar_pts'\]:默认值设置为 0。
+ - info\['valid_flag'\] 不存在:这个标志信息因无效的 `num_lidar_pts` 和 `num_radar_pts` 的存在而存在。
- `nuscenes_infos_train_mono3d.coco.json`:包含 coco 类型的训练数据集相关的信息。这个文件仅包含 2D 相关的信息,不包含 3D 目标检测所需要的信息,如相机内参。
- - info['images']:包含所有图像信息的列表。
- - 仅包含 `'file_name'`, `'id'`, `'width'`, `'height'`。
- - info['annotations']:包含所有标注信息的列表。
- - 仅包含 `'file_name'`,`'image_id'`,`'area'`,`'category_name'`,`'category_id'`,`'bbox'`,`'is_crowd'`,`'segmentation'`,`'id'`,其中 `'is_crowd'` 和 `'segmentation'` 默认设置为 `0` 和 `[]`。
- Lyft 数据集中不包含属性标注信息。
+ - info\['images'\]:包含所有图像信息的列表。
+ - 仅包含 `'file_name'`, `'id'`, `'width'`, `'height'`。
+ - info\['annotations'\]:包含所有标注信息的列表。
+ - 仅包含 `'file_name'`,`'image_id'`,`'area'`,`'category_name'`,`'category_id'`,`'bbox'`,`'is_crowd'`,`'segmentation'`,`'id'`,其中 `'is_crowd'` 和 `'segmentation'` 默认设置为 `0` 和 `[]`。
+ Lyft 数据集中不包含属性标注信息。
这里仅介绍存储在训练数据文件的数据记录信息,在测试数据集也采用上述的数据记录方式。
diff --git a/docs/zh_cn/datasets/nuscenes_det.md b/docs/zh_cn/datasets/nuscenes_det.md
index 18b426bd3d..6bc054ea48 100644
--- a/docs/zh_cn/datasets/nuscenes_det.md
+++ b/docs/zh_cn/datasets/nuscenes_det.md
@@ -64,60 +64,60 @@ mmdetection3d
- `nuscenes_database/xxxxx.bin`:训练数据集的每个 3D 包围框中包含的点云数据。
- `nuscenes_infos_train.pkl`:训练数据集信息,每帧信息有两个键值: `metadata` 和 `infos`。 `metadata` 包含数据集本身的基本信息,例如 `{'version': 'v1.0-trainval'}`,而 `infos` 包含详细信息如下:
- - info['lidar_path']:激光雷达点云数据的文件路径。
- - info['token']:样本数据标记。
- - info['sweeps']:扫描信息(nuScenes 中的 `sweeps` 是指没有标注的中间帧,而 `samples` 是指那些带有标注的关键帧)。
- - info['sweeps'][i]['data_path']:第 i 次扫描的数据路径。
- - info['sweeps'][i]['type']:扫描数据类型,例如“激光雷达”。
- - info['sweeps'][i]['sample_data_token']:扫描样本数据标记。
- - info['sweeps'][i]['sensor2ego_translation']:从当前传感器(用于收集扫描数据)到自车(包含感知周围环境传感器的车辆,车辆坐标系固连在自车上)的转换(1x3 列表)。
- - info['sweeps'][i]['sensor2ego_rotation']:从当前传感器(用于收集扫描数据)到自车的旋转(四元数格式的 1x4 列表)。
- - info['sweeps'][i]['ego2global_translation']:从自车到全局坐标的转换(1x3 列表)。
- - info['sweeps'][i]['ego2global_rotation']:从自车到全局坐标的旋转(四元数格式的 1x4 列表)。
- - info['sweeps'][i]['timestamp']:扫描数据的时间戳。
- - info['sweeps'][i]['sensor2lidar_translation']:从当前传感器(用于收集扫描数据)到激光雷达的转换(1x3 列表)。
- - info['sweeps'][i]['sensor2lidar_rotation']:从当前传感器(用于收集扫描数据)到激光雷达的旋转(四元数格式的 1x4 列表)。
- - info['cams']:相机校准信息。它包含与每个摄像头对应的六个键值: `'CAM_FRONT'`, `'CAM_FRONT_RIGHT'`, `'CAM_FRONT_LEFT'`, `'CAM_BACK'`, `'CAM_BACK_LEFT'`, `'CAM_BACK_RIGHT'`。
+ - info\['lidar_path'\]:激光雷达点云数据的文件路径。
+ - info\['token'\]:样本数据标记。
+ - info\['sweeps'\]:扫描信息(nuScenes 中的 `sweeps` 是指没有标注的中间帧,而 `samples` 是指那些带有标注的关键帧)。
+ - info\['sweeps'\]\[i\]\['data_path'\]:第 i 次扫描的数据路径。
+ - info\['sweeps'\]\[i\]\['type'\]:扫描数据类型,例如“激光雷达”。
+ - info\['sweeps'\]\[i\]\['sample_data_token'\]:扫描样本数据标记。
+ - info\['sweeps'\]\[i\]\['sensor2ego_translation'\]:从当前传感器(用于收集扫描数据)到自车(包含感知周围环境传感器的车辆,车辆坐标系固连在自车上)的转换(1x3 列表)。
+ - info\['sweeps'\]\[i\]\['sensor2ego_rotation'\]:从当前传感器(用于收集扫描数据)到自车的旋转(四元数格式的 1x4 列表)。
+ - info\['sweeps'\]\[i\]\['ego2global_translation'\]:从自车到全局坐标的转换(1x3 列表)。
+ - info\['sweeps'\]\[i\]\['ego2global_rotation'\]:从自车到全局坐标的旋转(四元数格式的 1x4 列表)。
+ - info\['sweeps'\]\[i\]\['timestamp'\]:扫描数据的时间戳。
+ - info\['sweeps'\]\[i\]\['sensor2lidar_translation'\]:从当前传感器(用于收集扫描数据)到激光雷达的转换(1x3 列表)。
+ - info\['sweeps'\]\[i\]\['sensor2lidar_rotation'\]:从当前传感器(用于收集扫描数据)到激光雷达的旋转(四元数格式的 1x4 列表)。
+ - info\['cams'\]:相机校准信息。它包含与每个摄像头对应的六个键值: `'CAM_FRONT'`, `'CAM_FRONT_RIGHT'`, `'CAM_FRONT_LEFT'`, `'CAM_BACK'`, `'CAM_BACK_LEFT'`, `'CAM_BACK_RIGHT'`。
每个字典包含每个扫描数据按照上述方式的详细信息(每个信息的关键字与上述相同)。除此之外,每个相机还包含了一个键值 `'cam_intrinsic'` 用来保存 3D 点投影到图像平面上需要的内参信息。
- - info['lidar2ego_translation']:从激光雷达到自车的转换(1x3 列表)。
- - info['lidar2ego_rotation']:从激光雷达到自车的旋转(四元数格式的 1x4 列表)。
- - info['ego2global_translation']:从自车到全局坐标的转换(1x3 列表)。
- - info['ego2global_rotation']:从自我车辆到全局坐标的旋转(四元数格式的 1x4 列表)。
- - info['timestamp']:样本数据的时间戳。
- - info['gt_boxes']:7 个自由度的 3D 包围框,一个 Nx7 数组。
- - info['gt_names']:3D 包围框的类别,一个 1xN 数组。
- - info['gt_velocity']:3D 包围框的速度(由于不准确,没有垂直测量),一个 Nx2 数组。
- - info['num_lidar_pts']:每个 3D 包围框中包含的激光雷达点数。
- - info['num_radar_pts']:每个 3D 包围框中包含的雷达点数。
- - info['valid_flag']:每个包围框是否有效。一般情况下,我们只将包含至少一个激光雷达或雷达点的 3D 框作为有效框。
+ - info\['lidar2ego_translation'\]:从激光雷达到自车的转换(1x3 列表)。
+ - info\['lidar2ego_rotation'\]:从激光雷达到自车的旋转(四元数格式的 1x4 列表)。
+ - info\['ego2global_translation'\]:从自车到全局坐标的转换(1x3 列表)。
+ - info\['ego2global_rotation'\]:从自我车辆到全局坐标的旋转(四元数格式的 1x4 列表)。
+ - info\['timestamp'\]:样本数据的时间戳。
+ - info\['gt_boxes'\]:7 个自由度的 3D 包围框,一个 Nx7 数组。
+ - info\['gt_names'\]:3D 包围框的类别,一个 1xN 数组。
+ - info\['gt_velocity'\]:3D 包围框的速度(由于不准确,没有垂直测量),一个 Nx2 数组。
+ - info\['num_lidar_pts'\]:每个 3D 包围框中包含的激光雷达点数。
+ - info\['num_radar_pts'\]:每个 3D 包围框中包含的雷达点数。
+ - info\['valid_flag'\]:每个包围框是否有效。一般情况下,我们只将包含至少一个激光雷达或雷达点的 3D 框作为有效框。
- `nuscenes_infos_train_mono3d.coco.json`:训练数据集 coco 风格的信息。该文件将基于图像的数据组织为三类(键值):`'categories'`, `'images'`, `'annotations'`。
- - info['categories']:包含所有类别名称的列表。每个元素都遵循字典格式并由两个键值组成:`'id'` 和 `'name'`。
- - info['images']:包含所有图像信息的列表。
- - info['images'][i]['file_name']:第 i 张图像的文件名。
- - info['images'][i]['id']:第 i 张图像的样本数据标记。
- - info['images'][i]['token']:与该帧对应的样本标记。
- - info['images'][i]['cam2ego_rotation']:从相机到自车的旋转(四元数格式的 1x4 列表)。
- - info['images'][i]['cam2ego_translation']:从相机到自车的转换(1x3 列表)。
- - info['images'][i]['ego2global_rotation'']:从自车到全局坐标的旋转(四元数格式的 1x4 列表)。
- - info['images'][i]['ego2global_translation']:从自车到全局坐标的转换(1x3 列表)。
- - info['images'][i]['cam_intrinsic']: 相机内参矩阵(3x3 列表)。
- - info['images'][i]['width']:图片宽度, nuScenes 中默认为 1600。
- - info['images'][i]['height']:图像高度, nuScenes 中默认为 900。
- - info['annotations']: 包含所有标注信息的列表。
- - info['annotations'][i]['file_name']:对应图像的文件名。
- - info['annotations'][i]['image_id']:对应图像的图像 ID (标记)。
- - info['annotations'][i]['area']:2D 包围框的面积。
- - info['annotations'][i]['category_name']:类别名称。
- - info['annotations'][i]['category_id']:类别 id。
- - info['annotations'][i]['bbox']:2D 包围框标注(3D 投影框的外部矩形),1x4 列表跟随 [x1, y1, x2-x1, y2-y1]。x1/y1 是沿图像水平/垂直方向的最小坐标。
- - info['annotations'][i]['iscrowd']:该区域是否拥挤。默认为 0。
- - info['annotations'][i]['bbox_cam3d']:3D 包围框(重力)中心位置(3)、大小(3)、(全局)偏航角(1)、1x7 列表。
- - info['annotations'][i]['velo_cam3d']:3D 包围框的速度(由于不准确,没有垂直测量),一个 Nx2 数组。
- - info['annotations'][i]['center2d']:包含 2.5D 信息的投影 3D 中心:图像上的投影中心位置(2)和深度(1),1x3 列表。
- - info['annotations'][i]['attribute_name']:属性名称。
- - info['annotations'][i]['attribute_id']:属性 ID。
- 我们为属性分类维护了一个属性集合和映射。更多的细节请参考[这里](https://github.com/open-mmlab/mmdetection3d/blob/master/mmdet3d/datasets/nuscenes_mono_dataset.py#L53)。
- - info['annotations'][i]['id']:标注 ID。默认为 `i`。
+ - info\['categories'\]:包含所有类别名称的列表。每个元素都遵循字典格式并由两个键值组成:`'id'` 和 `'name'`。
+ - info\['images'\]:包含所有图像信息的列表。
+ - info\['images'\]\[i\]\['file_name'\]:第 i 张图像的文件名。
+ - info\['images'\]\[i\]\['id'\]:第 i 张图像的样本数据标记。
+ - info\['images'\]\[i\]\['token'\]:与该帧对应的样本标记。
+ - info\['images'\]\[i\]\['cam2ego_rotation'\]:从相机到自车的旋转(四元数格式的 1x4 列表)。
+ - info\['images'\]\[i\]\['cam2ego_translation'\]:从相机到自车的转换(1x3 列表)。
+    - info\['images'\]\[i\]\['ego2global_rotation'\]:从自车到全局坐标的旋转(四元数格式的 1x4 列表)。
+ - info\['images'\]\[i\]\['ego2global_translation'\]:从自车到全局坐标的转换(1x3 列表)。
+ - info\['images'\]\[i\]\['cam_intrinsic'\]: 相机内参矩阵(3x3 列表)。
+ - info\['images'\]\[i\]\['width'\]:图片宽度, nuScenes 中默认为 1600。
+ - info\['images'\]\[i\]\['height'\]:图像高度, nuScenes 中默认为 900。
+ - info\['annotations'\]: 包含所有标注信息的列表。
+ - info\['annotations'\]\[i\]\['file_name'\]:对应图像的文件名。
+ - info\['annotations'\]\[i\]\['image_id'\]:对应图像的图像 ID (标记)。
+ - info\['annotations'\]\[i\]\['area'\]:2D 包围框的面积。
+ - info\['annotations'\]\[i\]\['category_name'\]:类别名称。
+ - info\['annotations'\]\[i\]\['category_id'\]:类别 id。
+ - info\['annotations'\]\[i\]\['bbox'\]:2D 包围框标注(3D 投影框的外部矩形),1x4 列表跟随 \[x1, y1, x2-x1, y2-y1\]。x1/y1 是沿图像水平/垂直方向的最小坐标。
+ - info\['annotations'\]\[i\]\['iscrowd'\]:该区域是否拥挤。默认为 0。
+ - info\['annotations'\]\[i\]\['bbox_cam3d'\]:3D 包围框(重力)中心位置(3)、大小(3)、(全局)偏航角(1)、1x7 列表。
+ - info\['annotations'\]\[i\]\['velo_cam3d'\]:3D 包围框的速度(由于不准确,没有垂直测量),一个 Nx2 数组。
+ - info\['annotations'\]\[i\]\['center2d'\]:包含 2.5D 信息的投影 3D 中心:图像上的投影中心位置(2)和深度(1),1x3 列表。
+ - info\['annotations'\]\[i\]\['attribute_name'\]:属性名称。
+ - info\['annotations'\]\[i\]\['attribute_id'\]:属性 ID。
+ 我们为属性分类维护了一个属性集合和映射。更多的细节请参考[这里](https://github.com/open-mmlab/mmdetection3d/blob/master/mmdet3d/datasets/nuscenes_mono_dataset.py#L53)。
+ - info\['annotations'\]\[i\]\['id'\]:标注 ID。默认为 `i`。
这里我们只解释训练信息文件中记录的数据。这同样适用于验证和测试集。
获取 `nuscenes_infos_xxx.pkl` 和 `nuscenes_infos_xxx_mono3d.coco.json` 的核心函数分别为 [\_fill_trainval_infos](https://github.com/open-mmlab/mmdetection3d/blob/master/tools/data_converter/nuscenes_converter.py#L143) 和 [get_2d_boxes](https://github.com/open-mmlab/mmdetection3d/blob/master/tools/data_converter/nuscenes_converter.py#L397)。更多细节请参考 [nuscenes_converter.py](https://github.com/open-mmlab/mmdetection3d/blob/master/tools/data_converter/nuscenes_converter.py)。
@@ -191,10 +191,11 @@ train_pipeline = [
```
它遵循 2D 检测的一般流水线,但在一些细节上有所不同:
+
- 它使用单目流水线加载图像,其中包括额外的必需信息,如相机内参矩阵。
- 它需要加载 3D 标注。
- 一些数据增强技术需要调整,例如`RandomFlip3D`。
-目前我们不支持更多的增强方法,因为如何迁移和应用其他技术仍在探索中。
+ 目前我们不支持更多的增强方法,因为如何迁移和应用其他技术仍在探索中。
## 评估
diff --git a/docs/zh_cn/datasets/s3dis_sem_seg.md b/docs/zh_cn/datasets/s3dis_sem_seg.md
index f90206ca6a..86adb02d96 100644
--- a/docs/zh_cn/datasets/s3dis_sem_seg.md
+++ b/docs/zh_cn/datasets/s3dis_sem_seg.md
@@ -39,10 +39,12 @@ mmdetection3d
例如,在 `Area_1/office_1` 目录下的文件如下所示:
- `office_1.txt`:一个 txt 文件存储着原始点云数据每个点的坐标和颜色信息。
+
- `Annotations/`:这个文件夹里包含有此房间中实例物体的信息 (以 txt 文件的形式存储)。每个 txt 文件表示一个实例,例如:
- - `chair_1.txt`:存储有该房间中一把椅子的点云数据。
- 如果我们将 `Annotations/` 下的所有 txt 文件合并起来,得到的点云就和 `office_1.txt` 中的点云是一致的。
+ - `chair_1.txt`:存储有该房间中一把椅子的点云数据。
+
+ 如果我们将 `Annotations/` 下的所有 txt 文件合并起来,得到的点云就和 `office_1.txt` 中的点云是一致的。
你可以通过 `python collect_indoor3d_data.py` 指令进行 S3DIS 数据的提取。
主要步骤包括:
@@ -143,16 +145,16 @@ s3dis
```
- `points/xxxxx.bin`:提取的点云数据。
-- `instance_mask/xxxxx.bin`:每个点云的实例标签,取值范围为 [0, ${实例个数}],其中 0 代表未标注的点。
-- `semantic_mask/xxxxx.bin`:每个点云的语义标签,取值范围为 [0, 12]。
+- `instance_mask/xxxxx.bin`:每个点云的实例标签,取值范围为 \[0, ${实例个数}\],其中 0 代表未标注的点。
+- `semantic_mask/xxxxx.bin`:每个点云的语义标签,取值范围为 \[0, 12\]。
- `s3dis_infos_Area_1.pkl`:区域 1 的数据信息,每个房间的详细信息如下:
- - info['point_cloud']: {'num_features': 6, 'lidar_idx': sample_idx}.
- - info['pts_path']: `points/xxxxx.bin` 点云的路径。
- - info['pts_instance_mask_path']: `instance_mask/xxxxx.bin` 实例标签的路径。
- - info['pts_semantic_mask_path']: `semantic_mask/xxxxx.bin` 语义标签的路径。
+ - info\['point_cloud'\]: {'num_features': 6, 'lidar_idx': sample_idx}.
+ - info\['pts_path'\]: `points/xxxxx.bin` 点云的路径。
+ - info\['pts_instance_mask_path'\]: `instance_mask/xxxxx.bin` 实例标签的路径。
+ - info\['pts_semantic_mask_path'\]: `semantic_mask/xxxxx.bin` 语义标签的路径。
- `seg_info`:为支持语义分割任务所生成的信息文件。
- - `Area_1_label_weight.npy`:每一语义类别的权重系数。因为 S3DIS 中属于不同类的点的数量相差很大,一个常见的操作是在计算损失时对不同类别进行加权 (label re-weighting) 以得到更好的分割性能。
- - `Area_1_resampled_scene_idxs.npy`:每一个场景 (房间) 的重采样标签。在训练过程中,我们依据每个场景的点的数量,会对其进行不同次数的重采样,以保证训练数据均衡。
+ - `Area_1_label_weight.npy`:每一语义类别的权重系数。因为 S3DIS 中属于不同类的点的数量相差很大,一个常见的操作是在计算损失时对不同类别进行加权 (label re-weighting) 以得到更好的分割性能。
+ - `Area_1_resampled_scene_idxs.npy`:每一个场景 (房间) 的重采样标签。在训练过程中,我们依据每个场景的点的数量,会对其进行不同次数的重采样,以保证训练数据均衡。
## 训练流程
@@ -205,13 +207,13 @@ train_pipeline = [
]
```
-- `PointSegClassMapping`:在训练过程中,只有被使用的类别的序号会被映射到类似 [0, 13) 范围内的类别标签。其余的类别序号会被转换为 `ignore_index` 所制定的忽略标签,在本例中是 `13`。
+- `PointSegClassMapping`:在训练过程中,只有被使用的类别的序号会被映射到类似 \[0, 13) 范围内的类别标签。其余的类别序号会被转换为 `ignore_index` 所指定的忽略标签,在本例中是 `13`。
- `IndoorPatchPointSample`:从输入点云中裁剪一个含有固定数量点的小块 (patch)。`block_size` 指定了裁剪块的边长,在 S3DIS 上这个数值一般设置为 `1.0`。
- `NormalizePointsColor`:将输入点的颜色信息归一化,通过将 RGB 值除以 `255` 来实现。
- 数据增广:
- - `GlobalRotScaleTrans`:对输入点云进行随机旋转和放缩变换。
- - `RandomJitterPoints`:通过对每一个点施加不同的噪声向量以实现对点云的随机扰动。
- - `RandomDropPointsColor`:以 `drop_ratio` 的概率随机将点云的颜色值全部置零。
+ - `GlobalRotScaleTrans`:对输入点云进行随机旋转和放缩变换。
+ - `RandomJitterPoints`:通过对每一个点施加不同的噪声向量以实现对点云的随机扰动。
+ - `RandomDropPointsColor`:以 `drop_ratio` 的概率随机将点云的颜色值全部置零。
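+
+其中 `RandomJitterPoints` 和 `RandomDropPointsColor` 的效果可以用下面的小例子直观理解(仅作示意,并非 MMDetection3D 中的真实实现,噪声标准差等参数为假设值):
+
+```python
+import numpy as np
+
+points = np.random.rand(1000, 6)  # 假设每个点为 (x, y, z, r, g, b)
+
+# RandomJitterPoints:对每个点的坐标施加独立的高斯噪声并截断
+jitter = np.clip(np.random.normal(0, 0.01, size=(points.shape[0], 3)), -0.05, 0.05)
+points[:, :3] += jitter
+
+# RandomDropPointsColor:以 drop_ratio 的概率将所有点的颜色值置零
+drop_ratio = 0.2
+if np.random.rand() < drop_ratio:
+    points[:, 3:] = 0
+```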
## 度量指标
diff --git a/docs/zh_cn/datasets/scannet_det.md b/docs/zh_cn/datasets/scannet_det.md
index 0052bded88..cf5c5aed00 100644
--- a/docs/zh_cn/datasets/scannet_det.md
+++ b/docs/zh_cn/datasets/scannet_det.md
@@ -223,25 +223,25 @@ scannet
```
- `points/xxxxx.bin`:下采样后,未与坐标轴平行(即没有对齐)的点云。因为 ScanNet 3D 检测任务将与坐标轴平行的点云作为输入,而 ScanNet 3D 语义分割任务将对齐前的点云作为输入,我们选择存储对齐前的点云和它们的对齐矩阵。请注意:在 3D 检测的预处理流程 [`GlobalAlignment`](https://github.com/open-mmlab/mmdetection3d/blob/9f0b01caf6aefed861ef4c3eb197c09362d26b32/mmdet3d/datasets/pipelines/transforms_3d.py#L423) 后,点云就都是与坐标轴平行的了。
-- `instance_mask/xxxxx.bin`:每个点的实例标签,值的范围为:[0, NUM_INSTANCES],其中 0 表示没有标注。
-- `semantic_mask/xxxxx.bin`:每个点的语义标签,值的范围为:[1, 40], 也就是 `nyu40id` 的标准。请注意:在训练流程 `PointSegClassMapping` 中,`nyu40id` 的 ID 会被映射到训练 ID。
+- `instance_mask/xxxxx.bin`:每个点的实例标签,值的范围为:\[0, NUM_INSTANCES\],其中 0 表示没有标注。
+- `semantic_mask/xxxxx.bin`:每个点的语义标签,值的范围为:\[1, 40\], 也就是 `nyu40id` 的标准。请注意:在训练流程 `PointSegClassMapping` 中,`nyu40id` 的 ID 会被映射到训练 ID。
- `posed_images/scenexxxx_xx`:`.jpg` 图像的集合,还包含 `.txt` 格式的 4x4 相机姿态和单个 `.txt` 格式的相机内参矩阵文件。
- `scannet_infos_train.pkl`:训练集的数据信息,每个场景的具体信息如下:
- - info['point_cloud']:`{'num_features': 6, 'lidar_idx': sample_idx}`,其中 `sample_idx` 为该场景的索引。
- - info['pts_path']:`points/xxxxx.bin` 的路径。
- - info['pts_instance_mask_path']:`instance_mask/xxxxx.bin` 的路径。
- - info['pts_semantic_mask_path']:`semantic_mask/xxxxx.bin` 的路径。
- - info['annos']:每个场景的标注。
- - annotations['gt_num']:真实物体 (ground truth) 的数量。
- - annotations['name']:所有真实物体的语义类别名称,比如 `chair`(椅子)。
- - annotations['location']:depth 坐标系下与坐标轴平行的三维包围框的重力中心 (gravity center),形状为 [K, 3],其中 K 是真实物体的数量。
- - annotations['dimensions']:depth 坐标系下与坐标轴平行的三维包围框的大小,形状为 [K, 3]。
- - annotations['gt_boxes_upright_depth']:depth 坐标系下与坐标轴平行的三维包围框 `(x, y, z, x_size, y_size, z_size, yaw)`,形状为 [K, 6]。
- - annotations['unaligned_location']:depth 坐标系下与坐标轴不平行(对齐前)的三维包围框的重力中心。
- - annotations['unaligned_dimensions']:depth 坐标系下与坐标轴不平行的三维包围框的大小。
- - annotations['unaligned_gt_boxes_upright_depth']:depth 坐标系下与坐标轴不平行的三维包围框。
- - annotations['index']:所有真实物体的索引,范围为 [0, K)。
- - annotations['class']:所有真实物体类别的标号,范围为 [0, 18),形状为 [K, ]。
+ - info\['point_cloud'\]:`{'num_features': 6, 'lidar_idx': sample_idx}`,其中 `sample_idx` 为该场景的索引。
+ - info\['pts_path'\]:`points/xxxxx.bin` 的路径。
+ - info\['pts_instance_mask_path'\]:`instance_mask/xxxxx.bin` 的路径。
+ - info\['pts_semantic_mask_path'\]:`semantic_mask/xxxxx.bin` 的路径。
+ - info\['annos'\]:每个场景的标注。
+ - annotations\['gt_num'\]:真实物体 (ground truth) 的数量。
+ - annotations\['name'\]:所有真实物体的语义类别名称,比如 `chair`(椅子)。
+ - annotations\['location'\]:depth 坐标系下与坐标轴平行的三维包围框的重力中心 (gravity center),形状为 \[K, 3\],其中 K 是真实物体的数量。
+ - annotations\['dimensions'\]:depth 坐标系下与坐标轴平行的三维包围框的大小,形状为 \[K, 3\]。
+  - annotations\['gt_boxes_upright_depth'\]:depth 坐标系下与坐标轴平行的三维包围框 `(x, y, z, x_size, y_size, z_size)`(与坐标轴平行,因此不含转向角),形状为 \[K, 6\]。
+ - annotations\['unaligned_location'\]:depth 坐标系下与坐标轴不平行(对齐前)的三维包围框的重力中心。
+ - annotations\['unaligned_dimensions'\]:depth 坐标系下与坐标轴不平行的三维包围框的大小。
+ - annotations\['unaligned_gt_boxes_upright_depth'\]:depth 坐标系下与坐标轴不平行的三维包围框。
+ - annotations\['index'\]:所有真实物体的索引,范围为 \[0, K)。
+ - annotations\['class'\]:所有真实物体类别的标号,范围为 \[0, 18),形状为 \[K, \]。
- `scannet_infos_val.pkl`:验证集上的数据信息,与 `scannet_infos_train.pkl` 格式完全一致。
- `scannet_infos_test.pkl`:测试集上的数据信息,与 `scannet_infos_train.pkl` 格式几乎完全一致,除了缺少标注。
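+
+可以用如下最小示例读取该信息文件并查看上述字段(仅作说明,路径为假设的默认数据目录):
+
+```python
+import pickle
+
+with open('data/scannet/scannet_infos_train.pkl', 'rb') as f:
+    infos = pickle.load(f)
+
+info = infos[0]          # 信息文件是一个列表,每个元素对应一个场景
+annos = info['annos']
+print(info['pts_path'])                                  # points/xxxxx.bin
+print(annos['gt_num'], annos['name'])                    # 真实物体数量与类别名称
+print(annos['gt_boxes_upright_depth'].shape)             # (K, 6),与坐标轴平行的 3D 框
+print(annos['unaligned_gt_boxes_upright_depth'].shape)   # 对齐前的 3D 框
+```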
@@ -291,11 +291,11 @@ train_pipeline = [
```
- `GlobalAlignment`:输入的点云在施加对齐矩阵 (axis-align matrix) 后应被转换为与坐标轴平行的形式。
-- `PointSegClassMapping`:训练中,只有合法的类别 ID 才会被映射到类别标签,比如 [0, 18)。
+- `PointSegClassMapping`:训练中,只有合法的类别 ID 才会被映射到类别标签,比如 \[0, 18)。
- 数据增强:
- - `PointSample`:下采样输入点云。
- - `RandomFlip3D`:随机左右或前后翻转点云。
- - `GlobalRotScaleTrans`: 旋转输入点云,对于 ScanNet 角度通常落入 [-5, 5] (度)的范围;并放缩输入点云,对于 ScanNet 比例通常为 1.0(即不做缩放);最后平移输入点云,对于 ScanNet 通常位移量为 0(即不做位移)。
+ - `PointSample`:下采样输入点云。
+ - `RandomFlip3D`:随机左右或前后翻转点云。
+ - `GlobalRotScaleTrans`: 旋转输入点云,对于 ScanNet 角度通常落入 \[-5, 5\] (度)的范围;并放缩输入点云,对于 ScanNet 比例通常为 1.0(即不做缩放);最后平移输入点云,对于 ScanNet 通常位移量为 0(即不做位移)。
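+
+以 `GlobalRotScaleTrans` 为例,上述 ScanNet 设定大致对应如下配置(数值仅作示意:±5 度约为 ±0.087266 弧度,具体请以实际配置文件为准):
+
+```python
+global_rot_scale_trans = dict(
+    type='GlobalRotScaleTrans',
+    rot_range=[-0.087266, 0.087266],  # 约 ±5 度
+    scale_ratio_range=[1.0, 1.0],     # 不做缩放
+    translation_std=[0, 0, 0])        # 不做位移
+```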
## 评估指标
diff --git a/docs/zh_cn/datasets/scannet_sem_seg.md b/docs/zh_cn/datasets/scannet_sem_seg.md
index e5817372ff..b8c30fe076 100644
--- a/docs/zh_cn/datasets/scannet_sem_seg.md
+++ b/docs/zh_cn/datasets/scannet_sem_seg.md
@@ -10,7 +10,6 @@ ScanNet 3D 语义分割数据集的准备和 3D 检测任务的准备很相似
因为 ScanNet 测试集对 3D 语义分割任务提供在线评测的基准,我们也需要下载其测试集并置于 `scannet` 目录下。
数据预处理前的文件目录结构应如下所示:
-
```
mmdetection3d
├── mmdet3d
@@ -70,8 +69,8 @@ scannet
```
- `seg_info`:为支持语义分割任务所生成的信息文件。
- - `train_label_weight.npy`:每一语义类别的权重系数。因为 ScanNet 中属于不同类的点的数量相差很大,一个常见的操作是在计算损失时对不同类别进行加权 (label re-weighting) 以得到更好的分割性能。
- - `train_resampled_scene_idxs.npy`:每一个场景 (房间) 的重采样标签。在训练过程中,我们依据每个场景的点的数量,会对其进行不同次数的重采样,以保证训练数据均衡。
+ - `train_label_weight.npy`:每一语义类别的权重系数。因为 ScanNet 中属于不同类的点的数量相差很大,一个常见的操作是在计算损失时对不同类别进行加权 (label re-weighting) 以得到更好的分割性能。
+ - `train_resampled_scene_idxs.npy`:每一个场景 (房间) 的重采样标签。在训练过程中,我们依据每个场景的点的数量,会对其进行不同次数的重采样,以保证训练数据均衡。
## 训练流程
@@ -111,7 +110,7 @@ train_pipeline = [
]
```
-- `PointSegClassMapping`:在训练过程中,只有被使用的类别的序号会被映射到类似 [0, 20) 范围内的类别标签。其余的类别序号会被转换为 `ignore_index` 所制定的忽略标签,在本例中是 `20`。
+- `PointSegClassMapping`:在训练过程中,只有被使用的类别的序号会被映射到类似 \[0, 20) 范围内的类别标签。其余的类别序号会被转换为 `ignore_index` 所指定的忽略标签,在本例中是 `20`。
- `IndoorPatchPointSample`:从输入点云中裁剪一个含有固定数量点的小块 (patch)。`block_size` 指定了裁剪块的边长,在 ScanNet 上这个数值一般设置为 `1.5`。
- `NormalizePointsColor`:将输入点的颜色信息归一化,通过将 RGB 值除以 `255` 来实现。
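+
+`PointSegClassMapping` 的映射效果可以用下面的小例子直观理解(仅作示意,并非真实实现;所列的 20 个 `nyu40id` 类别序号是常用设定,属于此处的假设,请以实际配置为准):
+
+```python
+import numpy as np
+
+# 假设使用的 20 个 nyu40id 类别序号
+valid_cat_ids = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
+                          11, 12, 14, 16, 24, 28, 33, 34, 36, 39])
+ignore_index = 20
+
+# 构造查找表:被使用的序号 -> [0, 20) 的训练标签,其余 -> ignore_index
+lut = np.full(41, ignore_index)
+lut[valid_cat_ids] = np.arange(len(valid_cat_ids))
+
+raw_labels = np.array([1, 39, 13, 40])  # 13 和 40 不在被使用的类别中
+print(lut[raw_labels])                  # -> [ 0 19 20 20]
+```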
diff --git a/docs/zh_cn/datasets/sunrgbd_det.md b/docs/zh_cn/datasets/sunrgbd_det.md
index c02023be1d..6fa6e35ad8 100644
--- a/docs/zh_cn/datasets/sunrgbd_det.md
+++ b/docs/zh_cn/datasets/sunrgbd_det.md
@@ -239,25 +239,24 @@ sunrgbd
- `points/0xxxxx.bin`:降采样后的点云数据。
- `sunrgbd_infos_train.pkl`:训练集数据信息(标注与元信息),每个场景所含数据信息具体如下:
- - info['point_cloud']:`{'num_features': 6, 'lidar_idx': sample_idx}`,其中 `sample_idx` 为该场景的索引。
- - info['pts_path']:`points/0xxxxx.bin` 的路径。
- - info['image']:图像路径与元信息:
- - image['image_idx']:图像索引。
- - image['image_shape']:图像张量的形状(即其尺寸)。
- - image['image_path']:图像路径。
- - info['annos']:每个场景的标注:
- - annotations['gt_num']:真实物体 (ground truth) 的数量。
- - annotations['name']:所有真实物体的语义类别名称,比如 `chair`(椅子)。
- - annotations['location']:depth 坐标系下三维包围框的重力中心 (gravity center),形状为 [K, 3],其中 K 是真实物体的数量。
- - annotations['dimensions']:depth 坐标系下三维包围框的大小,形状为 [K, 3]。
- - annotations['rotation_y']:depth 坐标系下三维包围框的旋转角,形状为 [K, ]。
- - annotations['gt_boxes_upright_depth']:depth 坐标系下三维包围框 `(x, y, z, x_size, y_size, z_size, yaw)`,形状为 [K, 7]。
- - annotations['bbox']:二维包围框 `(x, y, x_size, y_size)`,形状为 [K, 4]。
- - annotations['index']:所有真实物体的索引,范围为 [0, K)。
- - annotations['class']:所有真实物体类别的标号,范围为 [0, 10),形状为 [K, ]。
+ - info\['point_cloud'\]:`{'num_features': 6, 'lidar_idx': sample_idx}`,其中 `sample_idx` 为该场景的索引。
+ - info\['pts_path'\]:`points/0xxxxx.bin` 的路径。
+ - info\['image'\]:图像路径与元信息:
+ - image\['image_idx'\]:图像索引。
+ - image\['image_shape'\]:图像张量的形状(即其尺寸)。
+ - image\['image_path'\]:图像路径。
+ - info\['annos'\]:每个场景的标注:
+ - annotations\['gt_num'\]:真实物体 (ground truth) 的数量。
+ - annotations\['name'\]:所有真实物体的语义类别名称,比如 `chair`(椅子)。
+ - annotations\['location'\]:depth 坐标系下三维包围框的重力中心 (gravity center),形状为 \[K, 3\],其中 K 是真实物体的数量。
+ - annotations\['dimensions'\]:depth 坐标系下三维包围框的大小,形状为 \[K, 3\]。
+ - annotations\['rotation_y'\]:depth 坐标系下三维包围框的旋转角,形状为 \[K, \]。
+ - annotations\['gt_boxes_upright_depth'\]:depth 坐标系下三维包围框 `(x, y, z, x_size, y_size, z_size, yaw)`,形状为 \[K, 7\]。
+ - annotations\['bbox'\]:二维包围框 `(x, y, x_size, y_size)`,形状为 \[K, 4\]。
+ - annotations\['index'\]:所有真实物体的索引,范围为 \[0, K)。
+ - annotations\['class'\]:所有真实物体类别的标号,范围为 \[0, 10),形状为 \[K, \]。
- `sunrgbd_infos_val.pkl`:验证集上的数据信息,与 `sunrgbd_infos_train.pkl` 格式完全一致。
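+
+下面的最小示例读取该信息文件,并检查 7 维 3D 框与 `location`、`dimensions`、`rotation_y` 之间的对应关系(仅作说明,路径为假设的默认数据目录;这里假设 7 维框由三者拼接而成,请以数据转换脚本的实际实现为准):
+
+```python
+import pickle
+import numpy as np
+
+with open('data/sunrgbd/sunrgbd_infos_train.pkl', 'rb') as f:
+    infos = pickle.load(f)
+
+annos = infos[0]['annos']
+boxes = annos['gt_boxes_upright_depth']   # (K, 7): x, y, z, x_size, y_size, z_size, yaw
+composed = np.concatenate(
+    [annos['location'], annos['dimensions'], annos['rotation_y'][:, None]], axis=1)
+print(boxes.shape, np.allclose(boxes, composed))
+```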
-
## 训练流程
SUN RGB-D 上纯点云 3D 物体检测的典型流程如下:
@@ -288,8 +287,9 @@ train_pipeline = [
```
点云上的数据增强
+
- `RandomFlip3D`:随机左右或前后翻转输入点云。
-- `GlobalRotScaleTrans`:旋转输入点云,对于 SUN RGB-D 角度通常落入 [-30, 30] (度)的范围;并放缩输入点云,对于 SUN RGB-D 比例通常落入 [0.85, 1.15] 的范围;最后平移输入点云,对于 SUN RGB-D 通常位移量为 0(即不做位移)。
+- `GlobalRotScaleTrans`:旋转输入点云,对于 SUN RGB-D 角度通常落入 \[-30, 30\] (度)的范围;并放缩输入点云,对于 SUN RGB-D 比例通常落入 \[0.85, 1.15\] 的范围;最后平移输入点云,对于 SUN RGB-D 通常位移量为 0(即不做位移)。
- `PointSample`:降采样输入点云。
SUN RGB-D 上多模态(点云和图像)3D 物体检测的典型流程如下:
@@ -331,6 +331,7 @@ train_pipeline = [
```
图像上的数据增强/归一化
+
- `Resize`: 改变输入图像的大小, `keep_ratio=True` 意味着图像的比例不改变。
- `Normalize`: 归一化图像的 RGB 通道。
- `RandomFlip`: 随机地翻折图像。
diff --git a/docs/zh_cn/datasets/waymo_det.md b/docs/zh_cn/datasets/waymo_det.md
index acadf9e1d7..2c0ff7d0ad 100644
--- a/docs/zh_cn/datasets/waymo_det.md
+++ b/docs/zh_cn/datasets/waymo_det.md
@@ -103,36 +103,36 @@ mmdetection3d
为了在 Waymo 数据集上进行检测性能评估,请按照[此处指示](https://github.com/waymo-research/waymo-open-dataset/blob/master/docs/quick_start.md/)构建用于计算评估指标的二进制文件 `compute_detection_metrics_main`,并将它置于 `mmdet3d/core/evaluation/waymo_utils/` 下。您基本上可以按照下方命令安装 `bazel`,然后构建二进制文件:
- ```shell
- # download the code and enter the base directory
- git clone https://github.com/waymo-research/waymo-open-dataset.git waymo-od
- cd waymo-od
- git checkout remotes/origin/master
-
- # use the Bazel build system
- sudo apt-get install --assume-yes pkg-config zip g++ zlib1g-dev unzip python3 python3-pip
- BAZEL_VERSION=3.1.0
- wget https://github.com/bazelbuild/bazel/releases/download/${BAZEL_VERSION}/bazel-${BAZEL_VERSION}-installer-linux-x86_64.sh
- sudo bash bazel-${BAZEL_VERSION}-installer-linux-x86_64.sh
- sudo apt install build-essential
-
- # configure .bazelrc
- ./configure.sh
- # delete previous bazel outputs and reset internal caches
- bazel clean
-
- bazel build waymo_open_dataset/metrics/tools/compute_detection_metrics_main
- cp bazel-bin/waymo_open_dataset/metrics/tools/compute_detection_metrics_main ../mmdetection3d/mmdet3d/core/evaluation/waymo_utils/
- ```
+```shell
+# download the code and enter the base directory
+git clone https://github.com/waymo-research/waymo-open-dataset.git waymo-od
+cd waymo-od
+git checkout remotes/origin/master
+
+# use the Bazel build system
+sudo apt-get install --assume-yes pkg-config zip g++ zlib1g-dev unzip python3 python3-pip
+BAZEL_VERSION=3.1.0
+wget https://github.com/bazelbuild/bazel/releases/download/${BAZEL_VERSION}/bazel-${BAZEL_VERSION}-installer-linux-x86_64.sh
+sudo bash bazel-${BAZEL_VERSION}-installer-linux-x86_64.sh
+sudo apt install build-essential
+
+# configure .bazelrc
+./configure.sh
+# delete previous bazel outputs and reset internal caches
+bazel clean
+
+bazel build waymo_open_dataset/metrics/tools/compute_detection_metrics_main
+cp bazel-bin/waymo_open_dataset/metrics/tools/compute_detection_metrics_main ../mmdetection3d/mmdet3d/core/evaluation/waymo_utils/
+```
接下来,您就可以在 Waymo 上评估您的模型了。如下示例是使用 8 个图形处理器 (GPU) 在 Waymo 上用 Waymo 评价指标评估 PointPillars 模型的情景:
- ```shell
- ./tools/slurm_test.sh ${PARTITION} ${JOB_NAME} configs/pointpillars/hv_pointpillars_secfpn_sbn-2x16_2x_waymo-3d-car.py \
- checkpoints/hv_pointpillars_secfpn_sbn-2x16_2x_waymo-3d-car_latest.pth --out results/waymo-car/results_eval.pkl \
- --eval waymo --eval-options 'pklfile_prefix=results/waymo-car/kitti_results' \
- 'submission_prefix=results/waymo-car/kitti_results'
- ```
+```shell
+./tools/slurm_test.sh ${PARTITION} ${JOB_NAME} configs/pointpillars/hv_pointpillars_secfpn_sbn-2x16_2x_waymo-3d-car.py \
+ checkpoints/hv_pointpillars_secfpn_sbn-2x16_2x_waymo-3d-car_latest.pth --out results/waymo-car/results_eval.pkl \
+ --eval waymo --eval-options 'pklfile_prefix=results/waymo-car/kitti_results' \
+ 'submission_prefix=results/waymo-car/kitti_results'
+```
如果需要生成 bin 文件,应在 `--eval-options` 中给出 `pklfile_prefix`。对于评价指标, `waymo` 是我们推荐的官方评估原型。目前,`kitti` 这一评估选项是从 KITTI 迁移而来的,且每个难度下的评估结果和 KITTI 数据集中定义得到的不尽相同——目前大多数物体被标记为难度 0(日后会修复)。`kitti` 评估选项的不稳定来源于很大的计算量,转换的数据中遮挡 (occlusion) 和截断 (truncation) 的缺失,难度的不同定义方式,以及不同的平均精度 (Average Precision) 计算方式。
@@ -148,28 +148,28 @@ mmdetection3d
如下是一个使用 8 个图形处理器在 Waymo 上测试 PointPillars,生成 bin 文件并提交结果到官方榜单的例子:
- ```shell
- ./tools/slurm_test.sh ${PARTITION} ${JOB_NAME} configs/pointpillars/hv_pointpillars_secfpn_sbn-2x16_2x_waymo-3d-car.py \
- checkpoints/hv_pointpillars_secfpn_sbn-2x16_2x_waymo-3d-car_latest.pth --out results/waymo-car/results_eval.pkl \
- --format-only --eval-options 'pklfile_prefix=results/waymo-car/kitti_results' \
- 'submission_prefix=results/waymo-car/kitti_results'
- ```
+```shell
+./tools/slurm_test.sh ${PARTITION} ${JOB_NAME} configs/pointpillars/hv_pointpillars_secfpn_sbn-2x16_2x_waymo-3d-car.py \
+ checkpoints/hv_pointpillars_secfpn_sbn-2x16_2x_waymo-3d-car_latest.pth --out results/waymo-car/results_eval.pkl \
+ --format-only --eval-options 'pklfile_prefix=results/waymo-car/kitti_results' \
+ 'submission_prefix=results/waymo-car/kitti_results'
+```
在生成 bin 文件后,您可以简单地构建二进制文件 `create_submission`,并按照[指示](https://github.com/waymo-research/waymo-open-dataset/blob/master/docs/quick_start.md/) 创建一个提交文件。下面是一些示例:
- ```shell
- cd ../waymo-od/
- bazel build waymo_open_dataset/metrics/tools/create_submission
- cp bazel-bin/waymo_open_dataset/metrics/tools/create_submission ../mmdetection3d/mmdet3d/core/evaluation/waymo_utils/
- vim waymo_open_dataset/metrics/tools/submission.txtpb # set the metadata information
- cp waymo_open_dataset/metrics/tools/submission.txtpb ../mmdetection3d/mmdet3d/core/evaluation/waymo_utils/
+```shell
+cd ../waymo-od/
+bazel build waymo_open_dataset/metrics/tools/create_submission
+cp bazel-bin/waymo_open_dataset/metrics/tools/create_submission ../mmdetection3d/mmdet3d/core/evaluation/waymo_utils/
+vim waymo_open_dataset/metrics/tools/submission.txtpb # set the metadata information
+cp waymo_open_dataset/metrics/tools/submission.txtpb ../mmdetection3d/mmdet3d/core/evaluation/waymo_utils/
- cd ../mmdetection3d
- # suppose the result bin is in `results/waymo-car/submission`
- mmdet3d/core/evaluation/waymo_utils/create_submission --input_filenames='results/waymo-car/kitti_results_test.bin' --output_filename='results/waymo-car/submission/model' --submission_filename='mmdet3d/core/evaluation/waymo_utils/submission.txtpb'
+cd ../mmdetection3d
+# suppose the result bin is in `results/waymo-car/submission`
+mmdet3d/core/evaluation/waymo_utils/create_submission --input_filenames='results/waymo-car/kitti_results_test.bin' --output_filename='results/waymo-car/submission/model' --submission_filename='mmdet3d/core/evaluation/waymo_utils/submission.txtpb'
- tar cvf results/waymo-car/submission/my_model.tar results/waymo-car/submission/my_model/
- gzip results/waymo-car/submission/my_model.tar
- ```
+tar cvf results/waymo-car/submission/my_model.tar results/waymo-car/submission/my_model/
+gzip results/waymo-car/submission/my_model.tar
+```
如果想用官方评估服务器评估您在验证集上的结果,您可以使用同样的方法生成提交文件,只需确保您在运行如上指令前更改 `submission.txtpb` 中的字段值即可。
diff --git a/docs/zh_cn/faq.md b/docs/zh_cn/faq.md
index 19a116e15f..44a96d2418 100644
--- a/docs/zh_cn/faq.md
+++ b/docs/zh_cn/faq.md
@@ -6,7 +6,7 @@
- 如果您在 `import open3d` 时遇到下面的问题:
- ``OSError: /lib/x86_64-linux-gnu/libm.so.6: version 'GLIBC_2.27' not found``
+ `OSError: /lib/x86_64-linux-gnu/libm.so.6: version 'GLIBC_2.27' not found`
请将 open3d 的版本降级至 0.9.0.0,因为最新版 open3d 需要 'GLIBC_2.27' 文件的支持, Ubuntu 16.04 系统中缺失该文件,且该文件仅存在于 Ubuntu 18.04 及之后的系统中。
@@ -21,15 +21,15 @@
- 如果您在导入 pycocotools 相关包时遇到下面的问题:
- ``ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject``
+ `ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject`
- 请将 pycocotools 的版本降级至 2.0.1,这是由于最新版本的 pycocotools 与 numpy < 1.20.0 不兼容。或者通过下面的方式从源码进行编译来安装最新版本的 pycocotools :
+ 请将 pycocotools 的版本降级至 2.0.1,这是由于最新版本的 pycocotools 与 numpy \< 1.20.0 不兼容。或者通过下面的方式从源码进行编译来安装最新版本的 pycocotools :
- ``pip install -e "git+https://github.com/cocodataset/cocoapi#egg=pycocotools&subdirectory=PythonAPI"``
+ `pip install -e "git+https://github.com/cocodataset/cocoapi#egg=pycocotools&subdirectory=PythonAPI"`
或者
- ``pip install -e "git+https://github.com/ppwwyyxx/cocoapi#egg=pycocotools&subdirectory=PythonAPI"``
+ `pip install -e "git+https://github.com/ppwwyyxx/cocoapi#egg=pycocotools&subdirectory=PythonAPI"`
## 如何标注点云?
diff --git a/docs/zh_cn/getting_started.md b/docs/zh_cn/getting_started.md
index 75b106564b..f41a435429 100644
--- a/docs/zh_cn/getting_started.md
+++ b/docs/zh_cn/getting_started.md
@@ -7,30 +7,30 @@
- GCC 5+
- [MMCV](https://mmcv.readthedocs.io/en/latest/#installation)
-| MMDetection3D 版本 | MMDetection 版本 | MMSegmentation 版本 | MMCV 版本 |
-|:-------------------:|:-------------------:|:-------------------:|:-------------------:|
-| master | mmdet>=2.19.0, <=3.0.0| mmseg>=0.20.0, <=1.0.0 | mmcv-full>=1.4.8, <=1.7.0|
-| v1.0.0rc2 | mmdet>=2.19.0, <=3.0.0| mmseg>=0.20.0, <=1.0.0 | mmcv-full>=1.4.8, <=1.7.0|
-| v1.0.0rc1 | mmdet>=2.19.0, <=3.0.0| mmseg>=0.20.0, <=1.0.0 | mmcv-full>=1.4.8, <=1.5.0|
-| v1.0.0rc0 | mmdet>=2.19.0, <=3.0.0| mmseg>=0.20.0, <=1.0.0 | mmcv-full>=1.3.17, <=1.5.0|
-| 0.18.1 | mmdet>=2.19.0, <=3.0.0| mmseg>=0.20.0, <=1.0.0 | mmcv-full>=1.3.17, <=1.5.0|
-| 0.18.0 | mmdet>=2.19.0, <=3.0.0| mmseg>=0.20.0, <=1.0.0 | mmcv-full>=1.3.17, <=1.5.0|
-| 0.17.3 | mmdet>=2.14.0, <=3.0.0| mmseg>=0.14.1, <=1.0.0 | mmcv-full>=1.3.8, <=1.4.0|
-| 0.17.2 | mmdet>=2.14.0, <=3.0.0| mmseg>=0.14.1, <=1.0.0 | mmcv-full>=1.3.8, <=1.4.0|
-| 0.17.1 | mmdet>=2.14.0, <=3.0.0| mmseg>=0.14.1, <=1.0.0 | mmcv-full>=1.3.8, <=1.4.0|
-| 0.17.0 | mmdet>=2.14.0, <=3.0.0| mmseg>=0.14.1, <=1.0.0 | mmcv-full>=1.3.8, <=1.4.0|
-| 0.16.0 | mmdet>=2.14.0, <=3.0.0| mmseg>=0.14.1, <=1.0.0 | mmcv-full>=1.3.8, <=1.4.0|
-| 0.15.0 | mmdet>=2.14.0, <=3.0.0| mmseg>=0.14.1, <=1.0.0 | mmcv-full>=1.3.8, <=1.4.0|
-| 0.14.0 | mmdet>=2.10.0, <=2.11.0| mmseg==0.14.0 | mmcv-full>=1.3.1, <=1.4.0|
-| 0.13.0 | mmdet>=2.10.0, <=2.11.0| Not required | mmcv-full>=1.2.4, <=1.4.0|
-| 0.12.0 | mmdet>=2.5.0, <=2.11.0 | Not required | mmcv-full>=1.2.4, <=1.4.0|
-| 0.11.0 | mmdet>=2.5.0, <=2.11.0 | Not required | mmcv-full>=1.2.4, <=1.3.0|
-| 0.10.0 | mmdet>=2.5.0, <=2.11.0 | Not required | mmcv-full>=1.2.4, <=1.3.0|
-| 0.9.0 | mmdet>=2.5.0, <=2.11.0 | Not required | mmcv-full>=1.2.4, <=1.3.0|
-| 0.8.0 | mmdet>=2.5.0, <=2.11.0 | Not required | mmcv-full>=1.1.5, <=1.3.0|
-| 0.7.0 | mmdet>=2.5.0, <=2.11.0 | Not required | mmcv-full>=1.1.5, <=1.3.0|
-| 0.6.0 | mmdet>=2.4.0, <=2.11.0 | Not required | mmcv-full>=1.1.3, <=1.2.0|
-| 0.5.0 | 2.3.0 | Not required | mmcv-full==1.0.5|
+| MMDetection3D 版本 | MMDetection 版本 | MMSegmentation 版本 | MMCV 版本 |
+| :--------------: | :----------------------: | :---------------------: | :-------------------------: |
+| master | mmdet>=2.19.0, \<=3.0.0 | mmseg>=0.20.0, \<=1.0.0 | mmcv-full>=1.4.8, \<=1.7.0 |
+| v1.0.0rc2 | mmdet>=2.19.0, \<=3.0.0 | mmseg>=0.20.0, \<=1.0.0 | mmcv-full>=1.4.8, \<=1.7.0 |
+| v1.0.0rc1 | mmdet>=2.19.0, \<=3.0.0 | mmseg>=0.20.0, \<=1.0.0 | mmcv-full>=1.4.8, \<=1.5.0 |
+| v1.0.0rc0 | mmdet>=2.19.0, \<=3.0.0 | mmseg>=0.20.0, \<=1.0.0 | mmcv-full>=1.3.17, \<=1.5.0 |
+| 0.18.1 | mmdet>=2.19.0, \<=3.0.0 | mmseg>=0.20.0, \<=1.0.0 | mmcv-full>=1.3.17, \<=1.5.0 |
+| 0.18.0 | mmdet>=2.19.0, \<=3.0.0 | mmseg>=0.20.0, \<=1.0.0 | mmcv-full>=1.3.17, \<=1.5.0 |
+| 0.17.3 | mmdet>=2.14.0, \<=3.0.0 | mmseg>=0.14.1, \<=1.0.0 | mmcv-full>=1.3.8, \<=1.4.0 |
+| 0.17.2 | mmdet>=2.14.0, \<=3.0.0 | mmseg>=0.14.1, \<=1.0.0 | mmcv-full>=1.3.8, \<=1.4.0 |
+| 0.17.1 | mmdet>=2.14.0, \<=3.0.0 | mmseg>=0.14.1, \<=1.0.0 | mmcv-full>=1.3.8, \<=1.4.0 |
+| 0.17.0 | mmdet>=2.14.0, \<=3.0.0 | mmseg>=0.14.1, \<=1.0.0 | mmcv-full>=1.3.8, \<=1.4.0 |
+| 0.16.0 | mmdet>=2.14.0, \<=3.0.0 | mmseg>=0.14.1, \<=1.0.0 | mmcv-full>=1.3.8, \<=1.4.0 |
+| 0.15.0 | mmdet>=2.14.0, \<=3.0.0 | mmseg>=0.14.1, \<=1.0.0 | mmcv-full>=1.3.8, \<=1.4.0 |
+| 0.14.0 | mmdet>=2.10.0, \<=2.11.0 | mmseg==0.14.0 | mmcv-full>=1.3.1, \<=1.4.0 |
+| 0.13.0 | mmdet>=2.10.0, \<=2.11.0 | Not required | mmcv-full>=1.2.4, \<=1.4.0 |
+| 0.12.0 | mmdet>=2.5.0, \<=2.11.0 | Not required | mmcv-full>=1.2.4, \<=1.4.0 |
+| 0.11.0 | mmdet>=2.5.0, \<=2.11.0 | Not required | mmcv-full>=1.2.4, \<=1.3.0 |
+| 0.10.0 | mmdet>=2.5.0, \<=2.11.0 | Not required | mmcv-full>=1.2.4, \<=1.3.0 |
+| 0.9.0 | mmdet>=2.5.0, \<=2.11.0 | Not required | mmcv-full>=1.2.4, \<=1.3.0 |
+| 0.8.0 | mmdet>=2.5.0, \<=2.11.0 | Not required | mmcv-full>=1.1.5, \<=1.3.0 |
+| 0.7.0 | mmdet>=2.5.0, \<=2.11.0 | Not required | mmcv-full>=1.1.5, \<=1.3.0 |
+| 0.6.0 | mmdet>=2.4.0, \<=2.11.0 | Not required | mmcv-full>=1.1.3, \<=1.2.0 |
+| 0.5.0 | 2.3.0 | Not required | mmcv-full==1.0.5 |
# 安装
@@ -91,6 +91,7 @@ conda install pytorch=1.3.1 cudatoolkit=9.2 torchvision=0.4.2 -c pytorch
```shell
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html
```
+
需要把命令行中的 `{cu_version}` 和 `{torch_version}` 替换成对应的版本。例如:在 CUDA 11 和 PyTorch 1.7.0 的环境下,可以使用下面命令安装最新版本的 MMCV:
```shell
@@ -140,6 +141,7 @@ pip install -v -e . # or "python setup.py develop"
```shell
pip install mmsegmentation
```
+
同时,如果你想修改这部分的代码,也可以通过以下命令从源码编译 MMSegmentation:
```shell
@@ -156,8 +158,6 @@ git clone https://github.com/open-mmlab/mmdetection3d.git
cd mmdetection3d
```
-
-
**g. 安装依赖包和 MMDetection3D.**
```shell
@@ -168,19 +168,19 @@ pip install -v -e . # or "python setup.py develop"
1. Git 的 commit id 在步骤 d 将会被写入到版本号当中,例 0.6.0+2e7045c 。版本号将保存在训练的模型里。推荐在每一次执行步骤 d 时,从 github 上获取最新的更新。如果基于 C++/CUDA 的代码被修改了,请执行以下步骤;
- > 重要: 如果你重装了不同版本的 CUDA 或者 PyTorch 的 mmdet,请务必移除 `./build` 文件。
+   > 重要: 如果你在不同版本的 CUDA 或 PyTorch 环境下重新安装了 mmdet,请务必移除 `./build` 文件夹。
- ```shell
- pip uninstall mmdet3d
- rm -rf ./build
- find . -name "*.so" | xargs rm
- ```
+ ```shell
+ pip uninstall mmdet3d
+ rm -rf ./build
+ find . -name "*.so" | xargs rm
+ ```
2. 按照上述说明,MMDetection3D 安装在 `dev` 模式下,因此在本地对代码做的任何修改都会生效,无需重新安装;
3. 如果希望使用 `opencv-python-headless` 而不是 `opencv-python`, 可以在安装 MMCV 之前安装;
-4. 一些安装依赖是可以选择的。例如只需要安装最低运行要求的版本,则可以使用 `pip install -v -e .` 命令。如果希望使用可选择的像 `albumentations` 和 `imagecorruptions` 这种依赖项,可以使用 `pip install -r requirements/optional.txt ` 进行手动安装,或者在使用 `pip` 时指定所需的附加功能(例如 `pip install -v -e .[optional]`),支持附加功能的有效键值包括 `all`、`tests`、`build` 以及 `optional` 。
+4. 一些安装依赖是可以选择的。例如只需要安装最低运行要求的版本,则可以使用 `pip install -v -e .` 命令。如果希望使用像 `albumentations` 和 `imagecorruptions` 这样的可选依赖项,可以使用 `pip install -r requirements/optional.txt` 进行手动安装,或者在使用 `pip` 时指定所需的附加功能(例如 `pip install -v -e .[optional]`),支持附加功能的有效键值包括 `all`、`tests`、`build` 以及 `optional`。
5. 我们的代码目前不能在只有 CPU 的环境(CUDA 不可用)下编译运行。
@@ -253,7 +253,7 @@ python demo/pcd_demo.py demo/data/kitti/kitti_000008.bin configs/second/hv_secon
如果你想输入一个 `ply` 格式的文件,你可以使用如下函数将它转换为 `bin` 的文件格式。然后就可以使用转化成 `bin` 格式的文件去运行样例程序。
-请注意在使用此脚本前,你需要先安装 `pandas` 和 `plyfile`。 这个函数也可使用在数据预处理当中,为了能够直接训练 ```ply data```。
+请注意在使用此脚本前,你需要先安装 `pandas` 和 `plyfile`。这个函数也可以用在数据预处理当中,以便直接使用 `ply` 格式的数据进行训练。
```python
import numpy as np
diff --git a/docs/zh_cn/tutorials/backends_support.md b/docs/zh_cn/tutorials/backends_support.md
index 9690dbca9c..bdcaf15980 100644
--- a/docs/zh_cn/tutorials/backends_support.md
+++ b/docs/zh_cn/tutorials/backends_support.md
@@ -1,4 +1,5 @@
# 教程 7: 后端支持
+
我们支持不同的文件客户端后端:磁盘、Ceph 和 LMDB 等。下面是修改配置使之从 Ceph 加载和保存数据的示例。
## 从 Ceph 读取数据和标注文件
@@ -93,9 +94,8 @@ data = dict(
test=dict(pipeline=test_pipeline, classes=class_names, file_client_args=file_client_args))
```
-
-
## 从 Ceph 读取预训练模型
+
```python
model = dict(
pts_backbone=dict(
@@ -108,6 +108,7 @@ model = dict(
```
## 从 Ceph 读取模型权重文件
+
```python
# replace the path with your checkpoint path on Ceph
load_from = 's3://openmmlab/checkpoints/mmdetection3d/v0.1.0_models/pointpillars/hv_pointpillars_secfpn_6x8_160e_kitti-3d-car/hv_pointpillars_secfpn_6x8_160e_kitti-3d-car_20200620_230614-77663cd6.pth.pth'
@@ -131,6 +132,7 @@ evaluation = dict(interval=1, save_best='bbox', out_dir='s3://openmmlab/mmdetect
```
## 训练日志保存至 Ceph
+
训练后的训练日志会备份到指定的 Ceph 路径。
```python
@@ -140,6 +142,7 @@ log_config = dict(
dict(type='TextLoggerHook', out_dir='s3://openmmlab/mmdetection3d'),
])
```
+
您还可以通过设置 `keep_local = False` 备份到指定的 Ceph 路径后删除本地训练日志。
```python
diff --git a/docs/zh_cn/tutorials/config.md b/docs/zh_cn/tutorials/config.md
index bc44fafad9..329dd90237 100644
--- a/docs/zh_cn/tutorials/config.md
+++ b/docs/zh_cn/tutorials/config.md
@@ -34,15 +34,15 @@
- `{backbone}`: 主干网络种类,例如 `regnet-400mf`、`regnet-1.6gf` 等。
- `{neck}`:模型颈部的种类包括 `fpn`、`secfpn` 等。
- `[norm_setting]`:如无特殊声明,默认使用 `bn` (Batch Normalization),其他类型可以有 `gn` (Group Normalization)、`sbn` (Synchronized Batch Normalization) 等。
- `gn-head`/`gn-neck` 表示 GN 仅应用于网络的头部或颈部,而 `gn-all` 表示 GN 用于整个模型,例如主干网络、颈部和头部。
+ `gn-head`/`gn-neck` 表示 GN 仅应用于网络的头部或颈部,而 `gn-all` 表示 GN 用于整个模型,例如主干网络、颈部和头部。
- `[misc]`:模型中各式各样的设置/插件,例如 `strong-aug` 意味着在训练过程中使用更强的数据增广策略。
- `[batch_per_gpu x gpu]`:每个 GPU 的样本数和 GPU 数量,默认使用 `4x8`。
- `{schedule}`:训练方案,选项是 `1x`、`2x`、`20e` 等。
- `1x` 和 `2x` 分别代表训练 12 和 24 轮。
- `20e` 在级联模型中使用,表示训练 20 轮。
- 对于 `1x`/`2x`,初始学习率在第 8/16 和第 11/22 轮衰减 10 倍;对于 `20e`,初始学习率在第 16 和第 19 轮衰减 10 倍。
+ `1x` 和 `2x` 分别代表训练 12 和 24 轮。
+ `20e` 在级联模型中使用,表示训练 20 轮。
+ 对于 `1x`/`2x`,初始学习率在第 8/16 和第 11/22 轮衰减 10 倍;对于 `20e`,初始学习率在第 16 和第 19 轮衰减 10 倍。
- `{dataset}`:数据集,例如 `nus-3d`、`kitti-3d`、`lyft-3d`、`scannet-3d`、`sunrgbd-3d` 等。
- 当某一数据集存在多种设定时,我们也标记下所使用的类别数量,例如 `kitti-3d-3class` 和 `kitti-3d-car` 分别意味着在 KITTI 的所有三类上和单独车这一类上进行训练。
+ 当某一数据集存在多种设定时,我们也标记下所使用的类别数量,例如 `kitti-3d-3class` 和 `kitti-3d-car` 分别意味着在 KITTI 的所有三类上和单独车这一类上进行训练。
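+
+下面用一个假设的配置文件名来说明各字段的对应关系(仅作示意,该拆解并非官方给出的解释):
+
+```python
+# hv_pointpillars_secfpn_sbn-all_4x8_2x_nus-3d.py
+#   hv_pointpillars -> {model} 及其相关设置(hv 表示 hard voxelization)
+#   secfpn          -> {neck}
+#   sbn-all         -> [norm_setting]:整个模型使用 sbn
+#   4x8             -> [batch_per_gpu x gpu]:每个 GPU 4 个样本,共 8 个 GPU
+#   2x              -> {schedule}:训练 24 轮
+#   nus-3d          -> {dataset}
+# 注:该文件名中没有出现 {backbone} 字段。
+```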
## 弃用的 train_cfg/test_cfg
diff --git a/docs/zh_cn/tutorials/coord_sys_tutorial.md b/docs/zh_cn/tutorials/coord_sys_tutorial.md
index 12491eb1c0..47e03d86d7 100644
--- a/docs/zh_cn/tutorials/coord_sys_tutorial.md
+++ b/docs/zh_cn/tutorials/coord_sys_tutorial.md
@@ -7,43 +7,43 @@ MMDetection3D 使用 3 种不同的坐标系。3D 目标检测领域中不同坐
尽管数据集和采集设备多种多样,但是通过总结 3D 目标检测的工作线,我们可以将坐标系大致分为三类:
- 相机坐标系 -- 大多数相机的坐标系,在该坐标系中 y 轴正方向指向地面,x 轴正方向指向右侧,z 轴正方向指向前方。
- ```
- 上 z 前
- | ^
- | /
- | /
- | /
- |/
- 左 ------ 0 ------> x 右
- |
- |
- |
- |
- v
- y 下
- ```
+ ```
+ 上 z 前
+ | ^
+ | /
+ | /
+ | /
+ |/
+ 左 ------ 0 ------> x 右
+ |
+ |
+ |
+ |
+ v
+ y 下
+ ```
- 激光雷达坐标系 -- 众多激光雷达的坐标系,在该坐标系中 z 轴负方向指向地面,x 轴正方向指向前方,y 轴正方向指向左侧。
- ```
- z 上 x 前
- ^ ^
- | /
- | /
- | /
- |/
- y 左 <------ 0 ------ 右
- ```
+ ```
+ z 上 x 前
+ ^ ^
+ | /
+ | /
+ | /
+ |/
+ y 左 <------ 0 ------ 右
+ ```
- 深度坐标系 -- VoteNet、H3DNet 等模型使用的坐标系,在该坐标系中 z 轴负方向指向地面,x 轴正方向指向右侧,y 轴正方向指向前方。
- ```
- z 上 y 前
- ^ ^
- | /
- | /
- | /
- |/
- 左 ------ 0 ------> x 右
- ```
-
-该教程中的坐标系定义实际上**不仅仅是定义三个轴**。对于形如 ``$$`(x, y, z, dx, dy, dz, r)`$$`` 的框来说,我们的坐标系也定义了如何解释框的尺寸 ``$$`(dx, dy, dz)`$$`` 和转向角 (yaw) 角度 ``$$`r`$$``。
+ ```
+ z 上 y 前
+ ^ ^
+ | /
+ | /
+ | /
+ |/
+ 左 ------ 0 ------> x 右
+ ```
+
+该教程中的坐标系定义实际上**不仅仅是定义三个轴**。对于形如 `` $$`(x, y, z, dx, dy, dz, r)`$$ `` 的框来说,我们的坐标系也定义了如何解释框的尺寸 `` $$`(dx, dy, dz)`$$ `` 和转向角 (yaw) 角度 `` $$`r`$$ ``。
三个坐标系的图示如下:
@@ -55,13 +55,13 @@ MMDetection3D 使用 3 种不同的坐标系。3D 目标检测领域中不同坐
## 转向角 (yaw) 的定义
-请参考[维基百科](https://en.wikipedia.org/wiki/Euler_angles#Tait%E2%80%93Bryan_angles)了解转向角的标准定义。在目标检测中,我们选择一个轴作为重力轴,并在垂直于重力轴的平面 ``$$`\Pi`$$`` 上选取一个参考方向,那么参考方向的转向角为 0,在 ``$$`\Pi`$$`` 上的其他方向有非零的转向角,其角度取决于其与参考方向的角度。
+请参考[维基百科](https://en.wikipedia.org/wiki/Euler_angles#Tait%E2%80%93Bryan_angles)了解转向角的标准定义。在目标检测中,我们选择一个轴作为重力轴,并在垂直于重力轴的平面 `` $$`\Pi`$$ `` 上选取一个参考方向,那么参考方向的转向角为 0,在 `` $$`\Pi`$$ `` 上的其他方向有非零的转向角,其角度取决于其与参考方向的角度。
目前,对于所有支持的数据集,标注不包括俯仰角 (pitch) 和滚动角 (roll),这意味着我们在预测框和计算框之间的重叠时只需考虑转向角 (yaw)。
在 MMDetection3D 中,所有坐标系都是右手坐标系,这意味着如果从重力轴的负方向(轴的正方向指向人眼)看,转向角 (yaw) 沿着逆时针方向增加。
-下图显示,在右手坐标系中,如果我们设定 x 轴正方向为参考方向,那么 y 轴正方向的转向角 (yaw) 为 ``$$`\frac{\pi}{2}`$$``。
+下图显示,在右手坐标系中,如果我们设定 x 轴正方向为参考方向,那么 y 轴正方向的转向角 (yaw) 为 `` $$`\frac{\pi}{2}`$$ ``。
```
z 上 y 前 (yaw=0.5*pi)
@@ -92,9 +92,9 @@ __|____|____|____|______\ x 右
## 框尺寸的定义
-框尺寸的定义与转向角 (yaw) 的定义是分不开的。在上一节中,我们提到如果一个框的转向角 (yaw) 为 0,它的方向就被定义为与 x 轴平行。那么自然地,一个框对应于 x 轴的尺寸应该是 ``$$`dx`$$``。但是,这在某些数据集中并非总是如此(我们稍后会解决这个问题)。
+框尺寸的定义与转向角 (yaw) 的定义是分不开的。在上一节中,我们提到如果一个框的转向角 (yaw) 为 0,它的方向就被定义为与 x 轴平行。那么自然地,一个框对应于 x 轴的尺寸应该是 `` $$`dx`$$ ``。但是,这在某些数据集中并非总是如此(我们稍后会解决这个问题)。
-下图展示了 x 轴和 ``$$`dx`$$``,y 轴和 ``$$`dy`$$`` 对应的含义。
+下图展示了 x 轴和 `` $$`dx`$$ ``,y 轴和 `` $$`dy`$$ `` 对应的含义。
```
y 前
@@ -111,7 +111,7 @@ __|____|____|____|______\ x 右
| dy
```
-注意框的方向总是和 ``$$`dx`$$`` 边平行。
+注意框的方向总是和 `` $$`dx`$$ `` 边平行。
```
y 前
@@ -138,12 +138,12 @@ KITTI 数据集的原始标注是在相机坐标系下的,详见 [get_label_an

-对于每个框来说,尺寸为 ``$$`(w, l, h)`$$``,转向角 (yaw) 的参考方向为 y 轴正方向。更多细节请参考[代码库](https://github.com/traveller59/second.pytorch#concepts)。
+对于每个框来说,尺寸为 `` $$`(w, l, h)`$$ ``,转向角 (yaw) 的参考方向为 y 轴正方向。更多细节请参考[代码库](https://github.com/traveller59/second.pytorch#concepts)。
我们的激光雷达坐标系有两处改变:
- 转向角 (yaw) 被定义为右手而非左手,从而保持一致性;
-- 框的尺寸为 ``$$`(l, w, h)`$$`` 而非 ``$$`(w, l, h)`$$``,由于在 KITTI 数据集中 ``$$`w`$$`` 对应 ``$$`dy`$$``,``$$`l`$$`` 对应 ``$$`dx`$$``。
+- 框的尺寸为 `` $$`(l, w, h)`$$ `` 而非 `` $$`(w, l, h)`$$ ``,由于在 KITTI 数据集中 `` $$`w`$$ `` 对应 `` $$`dy`$$ ``,`` $$`l`$$ `` 对应 `` $$`dx`$$ ``。
### Waymo
@@ -151,7 +151,7 @@ KITTI 数据集的原始标注是在相机坐标系下的,详见 [get_label_an
### NuScenes
-NuScenes 提供了一个评估工具包,其中每个框都被包装成一个 `Box` 实例。`Box` 的坐标系不同于我们的激光雷达坐标系,在 `Box` 坐标系中,前两个表示框尺寸的元素分别对应 ``$$`(dy, dx)`$$`` 或者 ``$$`(w, l)`$$``,和我们的表示方法相反。更多细节请参考 NuScenes [教程](https://github.com/open-mmlab/mmdetection3d/blob/master/docs/zh_cn/datasets/nuscenes_det.md#notes)。
+NuScenes 提供了一个评估工具包,其中每个框都被包装成一个 `Box` 实例。`Box` 的坐标系不同于我们的激光雷达坐标系,在 `Box` 坐标系中,前两个表示框尺寸的元素分别对应 `` $$`(dy, dx)`$$ `` 或者 `` $$`(w, l)`$$ ``,和我们的表示方法相反。更多细节请参考 NuScenes [教程](https://github.com/open-mmlab/mmdetection3d/blob/master/docs/zh_cn/datasets/nuscenes_det.md#notes)。
读者可以参考 [NuScenes 开发工具](https://github.com/nutonomy/nuscenes-devkit/tree/master/python-sdk/nuscenes/eval/detection),了解 [NuScenes 框](https://github.com/nutonomy/nuscenes-devkit/blob/2c6a752319f23910d5f55cc995abc547a9e54142/python-sdk/nuscenes/utils/data_classes.py#L457) 的定义和 [NuScenes 评估](https://github.com/nutonomy/nuscenes-devkit/blob/master/python-sdk/nuscenes/eval/detection/evaluate.py)的过程。
@@ -183,25 +183,25 @@ SUN RGB-D 的原始数据不是点云而是 RGB-D 图像。我们通过反投影
首先,对于点和框的中心点,坐标转换前后满足下列关系:
-- ``$$`x_{LiDAR}=z_{camera}`$$``
-- ``$$`y_{LiDAR}=-x_{camera}`$$``
-- ``$$`z_{LiDAR}=-y_{camera}`$$``
+- `` $$`x_{LiDAR}=z_{camera}`$$ ``
+- `` $$`y_{LiDAR}=-x_{camera}`$$ ``
+- `` $$`z_{LiDAR}=-y_{camera}`$$ ``
然后,框的尺寸转换前后满足下列关系:
-- ``$$`dx_{LiDAR}=dx_{camera}`$$``
-- ``$$`dy_{LiDAR}=dz_{camera}`$$``
-- ``$$`dz_{LiDAR}=dy_{camera}`$$``
+- `` $$`dx_{LiDAR}=dx_{camera}`$$ ``
+- `` $$`dy_{LiDAR}=dz_{camera}`$$ ``
+- `` $$`dz_{LiDAR}=dy_{camera}`$$ ``
最后,转向角 (yaw) 也应该被转换:
-- ``$$`r_{LiDAR}=-\frac{\pi}{2}-r_{camera}`$$``
+- `` $$`r_{LiDAR}=-\frac{\pi}{2}-r_{camera}`$$ ``
详见[此处](https://github.com/open-mmlab/mmdetection3d/blob/master/mmdet3d/core/bbox/structures/box_3d_mode.py)代码了解更多细节。
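+
+上述转换关系可以用如下 numpy 小例子表示(仅作示意,真实实现请以上文链接的 `box_3d_mode.py` 为准;注意某些数据集的实际转换还需要标定矩阵,见下文 Q2,此处只演示公式本身):
+
+```python
+import numpy as np
+
+def cam_box_to_lidar(box_cam):
+    """将相机坐标系下的框 (x, y, z, dx, dy, dz, r) 转换到激光雷达坐标系(示意实现)。"""
+    x, y, z, dx, dy, dz, r = box_cam
+    center = [z, -x, -y]     # x_L = z_C, y_L = -x_C, z_L = -y_C
+    dims = [dx, dz, dy]      # dx_L = dx_C, dy_L = dz_C, dz_L = dy_C
+    yaw = -np.pi / 2 - r     # r_L = -pi/2 - r_C
+    return np.array(center + dims + [yaw])
+
+print(cam_box_to_lidar(np.array([1.0, 2.0, 10.0, 1.6, 1.5, 4.0, 0.3])))
+```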
### 鸟瞰图
-如果 3D 框是 ``$$`(x, y, z, dx, dy, dz, r)`$$``,相机坐标系下框的鸟瞰图是 ``$$`(x, z, dx, dz, -r)`$$``。转向角 (yaw) 符号取反是因为相机坐标系重力轴的正方向指向地面。
+如果 3D 框是 `` $$`(x, y, z, dx, dy, dz, r)`$$ ``,相机坐标系下框的鸟瞰图是 `` $$`(x, z, dx, dz, -r)`$$ ``。转向角 (yaw) 符号取反是因为相机坐标系重力轴的正方向指向地面。
详见[此处](https://github.com/open-mmlab/mmdetection3d/blob/master/mmdet3d/core/bbox/structures/cam_box3d.py)代码了解更多细节。
@@ -223,18 +223,18 @@ SUN RGB-D 的原始数据不是点云而是 RGB-D 图像。我们通过反投影
否。例如在 KITTI 中,从相机坐标系转换为激光雷达坐标系时,我们需要一个校准矩阵。
-#### Q3: 框中转向角 (yaw) ``$$`2\pi`$$`` 的相位差如何影响评估?
+#### Q3: 框中转向角 (yaw) `` $$`2\pi`$$ `` 的相位差如何影响评估?
-对于交并比 (IoU) 计算,转向角 (yaw) 有 ``$$`2\pi`$$`` 的相位差的两个框是相同的,所以不会影响评估。
+对于交并比 (IoU) 计算,转向角 (yaw) 有 `` $$`2\pi`$$ `` 的相位差的两个框是相同的,所以不会影响评估。
-对于角度预测评估,例如 NuScenes 中的 NDS 指标和 KITTI 中的 AOS 指标,会先对预测框的角度进行标准化,因此 ``$$`2\pi`$$`` 的相位差不会改变结果。
+对于角度预测评估,例如 NuScenes 中的 NDS 指标和 KITTI 中的 AOS 指标,会先对预测框的角度进行标准化,因此 `` $$`2\pi`$$ `` 的相位差不会改变结果。
-#### Q4: 框中转向角 (yaw) ``$$`\pi`$$`` 的相位差如何影响评估?
+#### Q4: 框中转向角 (yaw) `` $$`\pi`$$ `` 的相位差如何影响评估?
-对于交并比 (IoU) 计算,转向角 (yaw) 有 ``$$`\pi`$$`` 的相位差的两个框是相同的,所以不会影响评估。
+对于交并比 (IoU) 计算,转向角 (yaw) 有 `` $$`\pi`$$ `` 的相位差的两个框是相同的,所以不会影响评估。
然而,对于角度预测评估,这会导致完全相反的方向。
-考虑一辆汽车,转向角 (yaw) 是汽车前部方向与 x 轴正方向之间的夹角。如果我们将该角度增加 ``$$`\pi`$$``,车前部将变成车后部。
+考虑一辆汽车,转向角 (yaw) 是汽车前部方向与 x 轴正方向之间的夹角。如果我们将该角度增加 `` $$`\pi`$$ ``,车前部将变成车后部。
-对于某些类别,例如障碍物,前后没有区别,因此 ``$$`\pi`$$`` 的相位差不会对角度预测分数产生影响。
+对于某些类别,例如障碍物,前后没有区别,因此 `` $$`\pi`$$ `` 的相位差不会对角度预测分数产生影响。
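+
+下面的小例子用简单的角度标准化说明 2π 与 π 相位差的不同影响(仅作示意):
+
+```python
+import numpy as np
+
+def limit_yaw(angle):
+    """将角度标准化到 [-pi, pi) 区间(示意实现)。"""
+    return (angle + np.pi) % (2 * np.pi) - np.pi
+
+r = 0.3
+print(np.isclose(limit_yaw(r), limit_yaw(r + 2 * np.pi)))  # True:2π 相位差等价
+print(np.isclose(limit_yaw(r), limit_yaw(r + np.pi)))      # False:π 相位差方向相反
+```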
diff --git a/docs/zh_cn/tutorials/customize_dataset.md b/docs/zh_cn/tutorials/customize_dataset.md
index 478c0a30c9..e425f471d7 100644
--- a/docs/zh_cn/tutorials/customize_dataset.md
+++ b/docs/zh_cn/tutorials/customize_dataset.md
@@ -215,62 +215,62 @@ dataset_A_train = dict(
1. 如果待拼接的数据集的类别相同,但标注文件不同,此时可通过下面的方式来实现数据集的拼接:
- ```python
- dataset_A_train = dict(
- type='Dataset_A',
- ann_file = ['anno_file_1', 'anno_file_2'],
- pipeline=train_pipeline
- )
- ```
-
- 如果拼接数据集用于测试或者评估,那么这种拼接方式能够对每个数据集进行分开地测试或者评估,若希望对拼接数据集进行整体的测试或者评估,此时需要设置 `separate_eval=False`,如下所示:
-
- ```python
- dataset_A_train = dict(
- type='Dataset_A',
- ann_file = ['anno_file_1', 'anno_file_2'],
- separate_eval=False,
- pipeline=train_pipeline
- )
- ```
+ ```python
+ dataset_A_train = dict(
+ type='Dataset_A',
+ ann_file = ['anno_file_1', 'anno_file_2'],
+ pipeline=train_pipeline
+ )
+ ```
+
+ 如果拼接数据集用于测试或者评估,那么这种拼接方式能够对每个数据集进行分开地测试或者评估,若希望对拼接数据集进行整体的测试或者评估,此时需要设置 `separate_eval=False`,如下所示:
+
+ ```python
+ dataset_A_train = dict(
+ type='Dataset_A',
+ ann_file = ['anno_file_1', 'anno_file_2'],
+ separate_eval=False,
+ pipeline=train_pipeline
+ )
+ ```
2. 如果待拼接的数据集完全不相同,此时可通过拼接不同数据集的配置的方式实现数据集的拼接,如下所示:
- ```python
- dataset_A_train = dict()
- dataset_B_train = dict()
-
- data = dict(
- imgs_per_gpu=2,
- workers_per_gpu=2,
- train = [
- dataset_A_train,
- dataset_B_train
- ],
- val = dataset_A_val,
- test = dataset_A_test
- )
- ```
+ ```python
+ dataset_A_train = dict()
+ dataset_B_train = dict()
+
+ data = dict(
+ imgs_per_gpu=2,
+ workers_per_gpu=2,
+ train = [
+ dataset_A_train,
+ dataset_B_train
+ ],
+ val = dataset_A_val,
+ test = dataset_A_test
+ )
+ ```
- 如果拼接数据集用于测试或者评估,那么这种拼接方式能够对每个数据集进行分开地测试或者评估。
+ 如果拼接数据集用于测试或者评估,那么这种拼接方式能够对每个数据集进行分开地测试或者评估。
3. 可以通过显式地定义 `ConcatDataset` 来实现数据集的拼接,如下所示:
- ```python
- dataset_A_val = dict()
- dataset_B_val = dict()
-
- data = dict(
- imgs_per_gpu=2,
- workers_per_gpu=2,
- train=dataset_A_train,
- val=dict(
- type='ConcatDataset',
- datasets=[dataset_A_val, dataset_B_val],
- separate_eval=False))
- ```
-
- 其中,`separate_eval=False` 表示将所有的数据集作为一个整体进行评估。
+ ```python
+ dataset_A_val = dict()
+ dataset_B_val = dict()
+
+ data = dict(
+ imgs_per_gpu=2,
+ workers_per_gpu=2,
+ train=dataset_A_train,
+ val=dict(
+ type='ConcatDataset',
+ datasets=[dataset_A_val, dataset_B_val],
+ separate_eval=False))
+ ```
+
+ 其中,`separate_eval=False` 表示将所有的数据集作为一个整体进行评估。
**注意:**
diff --git a/docs/zh_cn/tutorials/customize_runtime.md b/docs/zh_cn/tutorials/customize_runtime.md
index 280ab5c44b..ac9dc98519 100644
--- a/docs/zh_cn/tutorials/customize_runtime.md
+++ b/docs/zh_cn/tutorials/customize_runtime.md
@@ -40,7 +40,7 @@ class MyOptimizer(Optimizer):
- 新建 `mmdet3d/core/optimizer/__init__.py` 文件用于引入。
- 新定义的模块应该在 `mmdet3d/core/optimizer/__init__.py` 中被引入,使得注册器可以找到新模块并注册之:
+ 新定义的模块应该在 `mmdet3d/core/optimizer/__init__.py` 中被引入,使得注册器可以找到新模块并注册之:
```python
from .my_optimizer import MyOptimizer
@@ -114,35 +114,35 @@ class MyOptimizerConstructor(object):
- __使用梯度裁剪 (gradient clip) 来稳定训练过程__:
- 一些模型依赖梯度裁剪技术来裁剪训练中的梯度,以稳定训练过程。举例如下:
+ 一些模型依赖梯度裁剪技术来裁剪训练中的梯度,以稳定训练过程。举例如下:
- ```python
- optimizer_config = dict(
- _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
- ```
+ ```python
+ optimizer_config = dict(
+ _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
+ ```
- 如果您的配置继承了一个已经设置了 `optimizer_config` 的基础配置,那么您可能需要 `_delete_=True` 字段来覆盖基础配置中无用的设置。详见配置文件的[说明文档](https://mmdetection.readthedocs.io/zh_CN/latest/tutorials/config.html)。
+ 如果您的配置继承了一个已经设置了 `optimizer_config` 的基础配置,那么您可能需要 `_delete_=True` 字段来覆盖基础配置中无用的设置。详见配置文件的[说明文档](https://mmdetection.readthedocs.io/zh_CN/latest/tutorials/config.html)。
- __使用动量规划器 (momentum scheduler) 来加速模型收敛__:
- 我们支持用动量规划器来根据学习率更改模型的动量,这样可以使模型更快地收敛。
- 动量规划器通常和学习率规划器一起使用,比如说,如下配置文件在 3D 检测中被用于加速模型收敛。
- 更多细节详见 [CyclicLrUpdater](https://github.com/open-mmlab/mmcv/blob/v1.3.7/mmcv/runner/hooks/lr_updater.py#L358) 和 [CyclicMomentumUpdater](https://github.com/open-mmlab/mmcv/blob/v1.3.7/mmcv/runner/hooks/momentum_updater.py#L225) 的实现。
-
- ```python
- lr_config = dict(
- policy='cyclic',
- target_ratio=(10, 1e-4),
- cyclic_times=1,
- step_ratio_up=0.4,
- )
- momentum_config = dict(
- policy='cyclic',
- target_ratio=(0.85 / 0.95, 1),
- cyclic_times=1,
- step_ratio_up=0.4,
- )
- ```
+ 我们支持用动量规划器来根据学习率更改模型的动量,这样可以使模型更快地收敛。
+ 动量规划器通常和学习率规划器一起使用,比如说,如下配置文件在 3D 检测中被用于加速模型收敛。
+ 更多细节详见 [CyclicLrUpdater](https://github.com/open-mmlab/mmcv/blob/v1.3.7/mmcv/runner/hooks/lr_updater.py#L358) 和 [CyclicMomentumUpdater](https://github.com/open-mmlab/mmcv/blob/v1.3.7/mmcv/runner/hooks/momentum_updater.py#L225) 的实现。
+
+ ```python
+ lr_config = dict(
+ policy='cyclic',
+ target_ratio=(10, 1e-4),
+ cyclic_times=1,
+ step_ratio_up=0.4,
+ )
+ momentum_config = dict(
+ policy='cyclic',
+ target_ratio=(0.85 / 0.95, 1),
+ cyclic_times=1,
+ step_ratio_up=0.4,
+ )
+ ```
## 自定义训练规程
@@ -151,20 +151,20 @@ class MyOptimizerConstructor(object):
- 多项式衰减规程:
- ```python
- lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
- ```
+ ```python
+ lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
+ ```
- 余弦退火规程:
- ```python
- lr_config = dict(
- policy='CosineAnnealing',
- warmup='linear',
- warmup_iters=1000,
- warmup_ratio=1.0 / 10,
- min_lr_ratio=1e-5)
- ```
+ ```python
+ lr_config = dict(
+ policy='CosineAnnealing',
+ warmup='linear',
+ warmup_iters=1000,
+ warmup_ratio=1.0 / 10,
+ min_lr_ratio=1e-5)
+ ```
## 自定义工作流
@@ -238,7 +238,7 @@ class MyHook(Hook):
- 更改 `mmdet3d/core/utils/__init__.py` 来引入之:
- 新定义的模块应在 `mmdet3d/core/utils/__init__.py` 中引入,以使得注册器可以找到新模块并注册之:
+ 新定义的模块应在 `mmdet3d/core/utils/__init__.py` 中引入,以使得注册器可以找到新模块并注册之:
```python
from .my_hook import MyHook
diff --git a/docs/zh_cn/tutorials/data_pipeline.md b/docs/zh_cn/tutorials/data_pipeline.md
index 8e4f940e63..a1767177ad 100644
--- a/docs/zh_cn/tutorials/data_pipeline.md
+++ b/docs/zh_cn/tutorials/data_pipeline.md
@@ -78,100 +78,113 @@ test_pipeline = [
### 数据加载
`LoadPointsFromFile`
+
- 添加:points
`LoadPointsFromMultiSweeps`
+
- 更新:points
`LoadAnnotations3D`
+
- 添加:gt_bboxes_3d, gt_labels_3d, gt_bboxes, gt_labels, pts_instance_mask, pts_semantic_mask, bbox3d_fields, pts_mask_fields, pts_seg_fields
### 预处理
`GlobalRotScaleTrans`
+
- 添加:pcd_trans, pcd_rotation, pcd_scale_factor
-- 更新:points, *bbox3d_fields
+- 更新:points, \*bbox3d_fields
`RandomFlip3D`
+
- 添加:flip, pcd_horizontal_flip, pcd_vertical_flip
-- 更新:points, *bbox3d_fields
+- 更新:points, \*bbox3d_fields
`PointsRangeFilter`
+
- 更新:points
`ObjectRangeFilter`
+
- 更新:gt_bboxes_3d, gt_labels_3d
`ObjectNameFilter`
+
- 更新:gt_bboxes_3d, gt_labels_3d
`PointShuffle`
+
- 更新:points
`PointsRangeFilter`
+
- 更新:points
### 格式化
`DefaultFormatBundle3D`
+
- 更新:points, gt_bboxes_3d, gt_labels_3d, gt_bboxes, gt_labels
`Collect3D`
+
- 添加:img_meta (由 `meta_keys` 指定的键值构成的 img_meta)
- 移除:所有除 `keys` 指定的键值以外的其他键值
### 测试时的数据增强
`MultiScaleFlipAug`
+
- 更新: scale, pcd_scale_factor, flip, flip_direction, pcd_horizontal_flip, pcd_vertical_flip (与这些指定的参数对应的增强后的数据列表)
## 扩展并使用自定义数据集预处理方法
1. 在任意文件中写入新的数据集预处理方法,如 `my_pipeline.py`,该预处理方法的输入和输出均为字典
- ```python
- from mmdet.datasets import PIPELINES
+ ```python
+ from mmdet.datasets import PIPELINES
- @PIPELINES.register_module()
- class MyTransform:
+ @PIPELINES.register_module()
+ class MyTransform:
- def __call__(self, results):
- results['dummy'] = True
- return results
- ```
+ def __call__(self, results):
+ results['dummy'] = True
+ return results
+ ```
2. 导入新的预处理方法类
- ```python
- from .my_pipeline import MyTransform
- ```
+ ```python
+ from .my_pipeline import MyTransform
+ ```
3. 在配置文件中使用该数据集预处理方法
- ```python
- train_pipeline = [
- dict(
- type='LoadPointsFromFile',
- load_dim=5,
- use_dim=5,
- file_client_args=file_client_args),
- dict(
- type='LoadPointsFromMultiSweeps',
- sweeps_num=10,
- file_client_args=file_client_args),
- dict(type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True),
- dict(
- type='GlobalRotScaleTrans',
- rot_range=[-0.3925, 0.3925],
- scale_ratio_range=[0.95, 1.05],
- translation_std=[0, 0, 0]),
- dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5),
- dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range),
- dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
- dict(type='ObjectNameFilter', classes=class_names),
- dict(type='MyTransform'),
- dict(type='PointShuffle'),
- dict(type='DefaultFormatBundle3D', class_names=class_names),
- dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
- ]
- ```
+ ```python
+ train_pipeline = [
+ dict(
+ type='LoadPointsFromFile',
+ load_dim=5,
+ use_dim=5,
+ file_client_args=file_client_args),
+ dict(
+ type='LoadPointsFromMultiSweeps',
+ sweeps_num=10,
+ file_client_args=file_client_args),
+ dict(type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True),
+ dict(
+ type='GlobalRotScaleTrans',
+ rot_range=[-0.3925, 0.3925],
+ scale_ratio_range=[0.95, 1.05],
+ translation_std=[0, 0, 0]),
+ dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5),
+ dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range),
+ dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
+ dict(type='ObjectNameFilter', classes=class_names),
+ dict(type='MyTransform'),
+ dict(type='PointShuffle'),
+ dict(type='DefaultFormatBundle3D', class_names=class_names),
+ dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
+ ]
+ ```
diff --git a/docs/zh_cn/tutorials/model_deployment.md b/docs/zh_cn/tutorials/model_deployment.md
index 4fbc2bbd72..47fdab7941 100644
--- a/docs/zh_cn/tutorials/model_deployment.md
+++ b/docs/zh_cn/tutorials/model_deployment.md
@@ -37,17 +37,17 @@ python ./tools/deploy.py \
### 参数描述
-* `deploy_cfg` : MMDeploy 代码库中用于部署的配置文件路径。
-* `model_cfg` : OpenMMLab 系列代码库中使用的模型配置文件路径。
-* `checkpoint` : OpenMMLab 系列代码库的模型文件路径。
-* `img` : 用于模型转换时使用的点云文件或图像文件路径。
-* `--test-img` : 用于测试模型的图像文件路径。如果没有指定,将设置成 `None`。
-* `--work-dir` : 工作目录,用来保存日志和模型文件。
-* `--calib-dataset-cfg` : 此参数只在 int8 模式下生效,用于校准数据集配置文件。如果没有指定,将被设置成 `None`,并使用模型配置文件中的 'val' 数据集进行校准。
-* `--device` : 用于模型转换的设备。如果没有指定,将被设置成 cpu。
-* `--log-level` : 设置日记的等级,选项包括 `'CRITICAL','FATAL','ERROR','WARN','WARNING','INFO','DEBUG','NOTSET'`。如果没有指定,将被设置成 INFO。
-* `--show` : 是否显示检测的结果。
-* `--dump-info` : 是否输出 SDK 信息。
+- `deploy_cfg` : MMDeploy 代码库中用于部署的配置文件路径。
+- `model_cfg` : OpenMMLab 系列代码库中使用的模型配置文件路径。
+- `checkpoint` : OpenMMLab 系列代码库的模型文件路径。
+- `img` : 用于模型转换时使用的点云文件或图像文件路径。
+- `--test-img` : 用于测试模型的图像文件路径。如果没有指定,将设置成 `None`。
+- `--work-dir` : 工作目录,用来保存日志和模型文件。
+- `--calib-dataset-cfg` : 此参数只在 int8 模式下生效,用于指定校准数据集的配置文件。如果没有指定,将被设置成 `None`,并使用模型配置文件中的 'val' 数据集进行校准。
+- `--device` : 用于模型转换的设备。如果没有指定,将被设置成 cpu。
+- `--log-level` : 设置日志的等级,选项包括 `'CRITICAL','FATAL','ERROR','WARN','WARNING','INFO','DEBUG','NOTSET'`。如果没有指定,将被设置成 INFO。
+- `--show` : 是否显示检测的结果。
+- `--dump-info` : 是否输出 SDK 信息。
### 示例
@@ -110,12 +110,12 @@ python tools/test.py \
## 支持模型列表
-| Model | TorchScript | OnnxRuntime | TensorRT | NCNN | PPLNN | OpenVINO | Model config |
-| -------------------- | :---------: | :---------: | :------: | :---: | :---: | :------: | -------------------------------------------------------------------------------------- |
-| PointPillars | ? | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/pointpillars) |
-| CenterPoint (pillar) | ? | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/centerpoint) |
+| Model | TorchScript | OnnxRuntime | TensorRT | NCNN | PPLNN | OpenVINO | Model config |
+| -------------------- | :---------: | :---------: | :------: | :--: | :---: | :------: | -------------------------------------------------------------------------------------- |
+| PointPillars | ? | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/pointpillars) |
+| CenterPoint (pillar) | ? | Y | Y | N | N | Y | [config](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/centerpoint) |
## 注意
-* MMDeploy 的版本需要 >= 0.4.0。
-* 目前 CenterPoint 仅支持了 pillar 版本的。
+- MMDeploy 的版本需要 >= 0.4.0。
+- 目前 CenterPoint 仅支持了 pillar 版本的。
diff --git a/docs/zh_cn/useful_tools.md b/docs/zh_cn/useful_tools.md
index 39f54da855..8acb5e7570 100644
--- a/docs/zh_cn/useful_tools.md
+++ b/docs/zh_cn/useful_tools.md
@@ -14,26 +14,26 @@ python tools/analysis_tools/analyze_logs.py plot_curve [--keys ${KEYS}] [--title
示例:
-- 绘制出某次运行的分类 loss。
+- 绘制出某次运行的分类 loss。
- ```shell
- python tools/analysis_tools/analyze_logs.py plot_curve log.json --keys loss_cls --legend loss_cls
- ```
+ ```shell
+ python tools/analysis_tools/analyze_logs.py plot_curve log.json --keys loss_cls --legend loss_cls
+ ```
-- 绘制出某次运行的分类和回归 loss,并且保存图片为 pdf 格式。
+- 绘制出某次运行的分类和回归 loss,并且保存图片为 pdf 格式。
- ```shell
- python tools/analysis_tools/analyze_logs.py plot_curve log.json --keys loss_cls loss_bbox --out losses.pdf
- ```
+ ```shell
+ python tools/analysis_tools/analyze_logs.py plot_curve log.json --keys loss_cls loss_bbox --out losses.pdf
+ ```
-- 在同一张图片中比较两次运行的 bbox mAP。
+- 在同一张图片中比较两次运行的 bbox mAP。
- ```shell
- # 根据 Car_3D_moderate_strict 在 KITTI 上评估 PartA2 和 second。
- python tools/analysis_tools/analyze_logs.py plot_curve tools/logs/PartA2.log.json tools/logs/second.log.json --keys KITTI/Car_3D_moderate_strict --legend PartA2 second --mode eval --interval 1
- # 根据 Car_3D_moderate_strict 在 KITTI 上分别对车和 3 类评估 PointPillars。
- python tools/analysis_tools/analyze_logs.py plot_curve tools/logs/pp-3class.log.json tools/logs/pp.log.json --keys KITTI/Car_3D_moderate_strict --legend pp-3class pp --mode eval --interval 2
- ```
+ ```shell
+ # 根据 Car_3D_moderate_strict 在 KITTI 上评估 PartA2 和 second。
+ python tools/analysis_tools/analyze_logs.py plot_curve tools/logs/PartA2.log.json tools/logs/second.log.json --keys KITTI/Car_3D_moderate_strict --legend PartA2 second --mode eval --interval 1
+ # 根据 Car_3D_moderate_strict 在 KITTI 上分别对车和 3 类评估 PointPillars。
+ python tools/analysis_tools/analyze_logs.py plot_curve tools/logs/pp-3class.log.json tools/logs/pp.log.json --keys KITTI/Car_3D_moderate_strict --legend pp-3class pp --mode eval --interval 2
+ ```
您也能计算平均训练速度。
@@ -51,7 +51,7 @@ time std over epochs is 0.0028
average iter time: 1.1959 s/iter
```
-
+
# 可视化
@@ -80,7 +80,6 @@ python tools/test.py ${CONFIG_FILE} ${CKPT_PATH} --eval 'mAP' --eval-options 'sh
python tools/misc/visualize_results.py ${CONFIG_FILE} --result ${RESULTS_PATH} --show-dir ${SHOW_DIR}
```
-

或者您可以使用 3D 可视化软件,例如 [MeshLab](http://www.meshlab.net/) 来打开这些在 `${SHOW_DIR}` 目录下的文件,从而查看 3D 检测输出。具体来说,打开 `***_points.obj` 查看输入点云,打开 `***_pred.obj` 查看预测的 3D 边界框。这允许推理和结果生成在远程服务器中完成,用户可以使用 GUI 在他们的主机上打开它们。
@@ -127,7 +126,7 @@ python tools/misc/browse_dataset.py configs/_base_/datasets/nus-mono3d.py --task

-
+
# 模型部署
@@ -185,7 +184,7 @@ python tools/deployment/test_torchserver.py ${IMAGE_FILE} ${CONFIG_FILE} ${CHECK
python tools/deployment/test_torchserver.py demo/data/kitti/kitti_000008.bin configs/second/hv_second_secfpn_6x8_80e_kitti-3d-car.py checkpoints/hv_second_secfpn_6x8_80e_kitti-3d-car_20200620_230238-393f000c.pth second
```
-
+
# 模型复杂度
@@ -211,7 +210,7 @@ Params: 953.83 k
2. 一些运算操作不计入计算量 (FLOPs),比如说像 GN 和定制的运算操作,详细细节请参考 [`mmcv.cnn.get_model_complexity_info()`](https://github.com/open-mmlab/mmcv/blob/master/mmcv/cnn/utils/flops_counter.py)。
3. 我们现在仅仅支持单模态输入(点云或者图片)的单阶段模型的计算量 (FLOPs) 计算,我们将会在未来支持两阶段和多模态模型的计算。
-
+
# 模型转换
@@ -253,7 +252,7 @@ python tools/model_converters/publish_model.py work_dirs/faster_rcnn/latest.pth
最终的输出文件名将会是 `faster_rcnn_r50_fpn_1x_20190801-{hash id}.pth`。
-
+
# 数据集转换
@@ -266,15 +265,15 @@ python -u tools/data_converter/nuimage_converter.py --data-root ${DATA_ROOT} --v
--out-dir ${OUT_DIR} --nproc ${NUM_WORKERS} --extra-tag ${TAG}
```
-- `--data-root`: 数据集的根目录,默认为 `./data/nuimages`。
-- `--version`: 数据集的版本,默认为 `v1.0-mini`。要获取完整数据集,请使用 `--version v1.0-train v1.0-val v1.0-mini`。
-- `--out-dir`: 注释和语义掩码的输出目录,默认为 `./data/nuimages/annotations/`。
-- `--nproc`: 数据准备的进程数,默认为 `4`。由于图片是并行处理的,更大的进程数目能够减少准备时间。
-- `--extra-tag`: 注释的额外标签,默认为 `nuimages`。这可用于将不同时间处理的不同注释分开以供研究。
+- `--data-root`: 数据集的根目录,默认为 `./data/nuimages`。
+- `--version`: 数据集的版本,默认为 `v1.0-mini`。要获取完整数据集,请使用 `--version v1.0-train v1.0-val v1.0-mini`。
+- `--out-dir`: 注释和语义掩码的输出目录,默认为 `./data/nuimages/annotations/`。
+- `--nproc`: 数据准备的进程数,默认为 `4`。由于图片是并行处理的,更大的进程数目能够减少准备时间。
+- `--extra-tag`: 注释的额外标签,默认为 `nuimages`。这可用于将不同时间处理的不同注释分开以供研究。
更多的数据准备细节参考 [doc](https://mmdetection3d.readthedocs.io/zh_CN/latest/data_preparation.html),nuImages 数据集的细节参考 [README](https://github.com/open-mmlab/mmdetection3d/blob/master/configs/nuimages/README.md/)。
-
+
# 其他内容