Edit the docs that don't reflect the V2 changes (#3225)
* Edit the docs

* Fix mis-imported

* Fix merge conflict

* Fix incorrect information for the action tasks
sungmanc authored Mar 28, 2024
1 parent b55d82c commit b0c8583
Showing 21 changed files with 329 additions and 186 deletions.
3 changes: 0 additions & 3 deletions docs/source/conf.py
@@ -54,9 +54,6 @@
"autosectionlabel.*",
]

-# Add any paths that contain templates here, relative to this directory.
-templates_path = ["_templates"]

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
@@ -25,7 +25,7 @@ Basically, to start the training and obtain a good baseline with the best trade-
(otx) ...$ otx train ... --data_root <path_to_data_root>
-After dataset preparation, the training will be started with the middle-sized template to achieve competitive accuracy preserving fast inference.
+After dataset preparation, the training will be started with the middle-sized recipe to achieve competitive accuracy preserving fast inference.
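
As a sketch only (assuming the V2 CLI's ``--config`` option selects one of the recipe YAMLs referenced later in these docs), a complete invocation might look like:

.. code-block:: shell

    # Illustrative only: the recipe path is one of those listed in the classification docs,
    # and the flags assume the V2 `otx train` interface.
    (otx) ...$ otx train --config src/otx/recipe/classification/multi_class_cls/otx_mobilenet_v3_large.yaml --data_root <path_to_data_root>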


Supported dataset formats for each task:
2 changes: 1 addition & 1 deletion docs/source/guide/explanation/additional_features/hpo.rst
@@ -55,7 +55,7 @@ ASHA also includes a technique called Hyperband, which is used to determine how
How to configure hyper-parameter optimization
*********************************************

-You can configure HPO by modifying the ``hpo_config.yaml`` file. This file contains everything related to HPO, including the hyperparameters to optimize, the HPO algorithm, and more. The ``hpo_config.yaml`` file already exists with default values in the same directory where ``template.yaml`` resides. Here is the default ``hpo_config.yaml`` file for classification:
+You can configure HPO by modifying the ``hpo_config.yaml`` file. This file contains everything related to HPO, including the hyperparameters to optimize, the HPO algorithm, and more. The ``hpo_config.yaml`` file already exists with default values in the same directory where ``configs.yaml`` resides. Here is the default ``hpo_config.yaml`` file for classification:

.. code-block::
@@ -30,7 +30,7 @@ Models
Currently OpenVINO™ Training Extensions supports `X3D <https://arxiv.org/abs/2004.04730>`_ and `MoViNet <https://arxiv.org/pdf/2103.11511.pdf>`_ for action classification.

+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+---------------------+-------------------------+
-| Template ID | Name | Complexity (GFLOPs) | Model size (MB) |
+| Recipe ID | Name | Complexity (GFLOPs) | Model size (MB) |
+========================================================================================================================================================================================+=========+=====================+=========================+
| `Custom_Action_Classification_X3D <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/recipe/action/action_classification/x3d.yaml>`_ | X3D | 2.49 | 3.79 |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+---------------------+-------------------------+
@@ -24,10 +24,10 @@ We support the popular action classification formats, `AVA dataset <http://resea
Models
******

-We support the following ready-to-use model templates for transfer learning:
+We support the following ready-to-use model recipes for transfer learning:

+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------+---------------------+-------------------------+
-| Template ID | Name | Complexity (GFLOPs) | Model size (MB) |
+| Recipe ID | Name | Complexity (GFLOPs) | Model size (MB) |
+=========================================================================================================================================================================================+===============+=====================+=========================+
| `Custom_Action_Detection_X3D_FAST_RCNN <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/recipe/action/action_detection/x3d_fast_rcnn.yaml>`_ | x3d_fast_rcnn | 13.04 | 8.32 |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------+---------------------+-------------------------+
@@ -49,7 +49,7 @@ An example of the annotations format and dataset structure can be found in our `
Models
******

-We use the same model templates as for Multi-class Classification. Please, refer: :ref:`Classification Models <classification_models>`.
+We use the same model recipes as for Multi-class Classification. Please, refer: :ref:`Classification Models <classification_models>`.

To see which models are available for the task, the following command can be executed:
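
By analogy with the other classification pages, the command presumably follows the ``otx find`` pattern; the task name below is an assumption:

.. code-block:: shell

    # Task name assumed for hierarchical classification; `otx find --task ...` matches the other classification docs.
    (otx) ...$ otx find --task H_LABEL_CLS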

@@ -53,10 +53,10 @@ Models
******
.. _classification_models:

-We support the following ready-to-use model templates:
+We support the following ready-to-use model recipes:

+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------+---------------------+-----------------+
-| Template ID | Name | Complexity (GFLOPs) | Model size (MB) |
+| Recipe ID | Name | Complexity (GFLOPs) | Model size (MB) |
+==================================================================================================================================================================================================================+=======================+=====================+=================+
| `Custom_Image_Classification_MobileNet-V3-large-1x <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/recipe/classification/multi_class_cls/otx_mobilenet_v3_large.yaml>`_ | MobileNet-V3-large-1x | 0.44 | 4.29 |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------+---------------------+-----------------+
@@ -65,7 +65,7 @@ We support the following ready-to-use model templates:
| `Custom_Image_Classification_EfficientNet-V2-S <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/recipe/classification/multi_class_cls/otx_efficientnet_v2.yaml>`_ | EfficientNet-V2-S | 5.76 | 20.23 |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------+---------------------+-----------------+

-`EfficientNet-V2-S <https://arxiv.org/abs/2104.00298>`_ has more parameters and Flops and needs more time to train, meanwhile providing superior classification performance. `MobileNet-V3-large-1x <https://arxiv.org/abs/1905.02244>`_ is the best choice when training time and computational cost are in priority, nevertheless, this template provides competitive accuracy as well.
+`EfficientNet-V2-S <https://arxiv.org/abs/2104.00298>`_ has more parameters and Flops and needs more time to train, meanwhile providing superior classification performance. `MobileNet-V3-large-1x <https://arxiv.org/abs/1905.02244>`_ is the best choice when training time and computational cost are in priority, nevertheless, this recipe provides competitive accuracy as well.
`EfficientNet-B0 <https://arxiv.org/abs/1905.11946>`_ consumes more Flops compared to MobileNet, providing better performance on large datasets, but may be not so stable in case of a small amount of training data.

To see which models are available for the task, the following command can be executed:
@@ -74,7 +74,7 @@ To see which models are available for the task, the following command can be exe
(otx) ...$ otx find --task MULTI_CLASS_CLS
-In the table below the top-1 accuracy on some academic datasets using our :ref:`supervised pipeline <mcl_cls_supervised_pipeline>` is presented. The results were obtained on our templates without any changes. We use 224x224 image resolution, for other hyperparameters, please, refer to the related template. We trained each model with single Nvidia GeForce RTX3090.
+In the table below the top-1 accuracy on some academic datasets using our :ref:`supervised pipeline <mcl_cls_supervised_pipeline>` is presented. The results were obtained on our Recipes without any changes. We use 224x224 image resolution, for other hyperparameters, please, refer to the related recipe. We trained each model with single Nvidia GeForce RTX3090.

+-----------------------+-----------------+-----------+-----------+-----------+
| Model name | CIFAR10 |CIFAR100 |flowers* | cars* |
@@ -38,7 +38,7 @@ To see which models are available for the task, the following command can be exe
(otx) ...$ otx find --task MULTI_LABEL_CLS
-In the table below the `mAP <https://en.wikipedia.org/w/index.php?title=Information_retrieval&oldid=793358396#Average_precision>`_ metrics on some academic datasets using our :ref:`supervised pipeline <ml_cls_supervised_pipeline>` are presented. The results were obtained on our templates without any changes (including input resolution, which is 224x224 for all templates). We trained each model with single Nvidia GeForce RTX3090.
+In the table below the `mAP <https://en.wikipedia.org/w/index.php?title=Information_retrieval&oldid=793358396#Average_precision>`_ metrics on some academic datasets using our :ref:`supervised pipeline <ml_cls_supervised_pipeline>` are presented. The results were obtained on our recipes without any changes (including input resolution, which is 224x224 for all recipes). We trained each model with single Nvidia GeForce RTX3090.

+-----------------------+-----------------+-----------+------------------+-----------+
| Model name | Pascal-VOC 2007 | COCO 2014 | Aerial Maritime | Mean mAP |
@@ -54,10 +54,10 @@ Learn more about the formats by following the links above. Here is an example of
Models
******

-We support the following ready-to-use model templates:
+We support the following ready-to-use model recipes:

+------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+-----------------+
-| Template ID | Name | Complexity (GFLOPs) | Model size (MB) |
+| Recipe ID | Name | Complexity (GFLOPs) | Model size (MB) |
+============================================================================================================================================================+=====================+=====================+=================+
| `Custom_Object_Detection_YOLOX <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/recipe/detection/yolox_tiny.yaml>`_ | YOLOX-TINY | 6.5 | 20.4 |
+------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+-----------------+
@@ -96,8 +96,8 @@ Except for COCO, we report AP50 as performance metric.
`BDD100K <https://www.bdd100k.com/>`_ is the largest dataset among we used. 70000 images are used as train images and 10000 images are used for validation.
`Brackish <https://public.roboflow.com/object-detection/brackish-underwater>`_ and `Plantdoc <https://public.roboflow.com/object-detection/plantdoc>`_ are datasets of medium size. They have around 10000 images for train and 1500 images for validation.
`BCCD <https://public.roboflow.com/object-detection/bccd>`_ and `Chess pieces <https://public.roboflow.com/object-detection/chess-full>`_ are datasets of small size. They have around 300 images for train and 100 images for validation.
-We used our own templates without any modification.
-For hyperparameters, please, refer to the related template.
+We used our own recipes without any modification.
+For hyperparameters, please, refer to the related recipe.
We trained each model with a single Nvidia GeForce RTX3090.

+----------------------------+------------------+-----------+-----------+-----------+-----------+--------------+
@@ -50,10 +50,10 @@ For the dataset handling inside OpenVINO™ Training Extensions, we use `Dataset
Models
******

-We support the following ready-to-use model templates:
+We support the following ready-to-use model recipes:

+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------------+-----------------+
-| Template ID | Name | Complexity (GFLOPs) | Model size (MB) |
+| Recipe ID | Name | Complexity (GFLOPs) | Model size (MB) |
+===============================================================================================================================================================================================================+============================+=====================+=================+
| `Custom_Counting_Instance_Segmentation_MaskRCNN_EfficientNetB2B <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/recipe/instance_segmentation/maskrcnn_efficientnetb2b.yaml>`_ | MaskRCNN-EfficientNetB2B | 68.48 | 13.27 |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------------+-----------------+
@@ -72,7 +72,7 @@ On the other hand, MaskRCNN-EfficientNetB2B employs the `EfficientNet-B2 <https:

Recently, we have made updates to MaskRCNN-ConvNeXt, incorporating the `ConvNeXt backbone <https://arxiv.org/abs/2201.03545>`_. Through our experiments, we have observed that this variant achieves better accuracy compared to MaskRCNN-ResNet50 while utilizing less GPU memory. However, it is important to note that the training time and inference duration may slightly increase. If minimizing training time is a significant concern, we recommend considering a switch to MaskRCNN-EfficientNetB2B.

-In the table below the `mAP <https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient>`_ metric on some academic datasets using our :ref:`supervised pipeline <instance_segmentation_supervised_pipeline>` is presented. The results were obtained on our templates without any changes. We use 1024x1024 image resolution, for other hyperparameters, please, refer to the related template. We trained each model with single Nvidia GeForce RTX3090.
+In the table below the `mAP <https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient>`_ metric on some academic datasets using our :ref:`supervised pipeline <instance_segmentation_supervised_pipeline>` is presented. The results were obtained on our recipes without any changes. We use 1024x1024 image resolution, for other hyperparameters, please, refer to the related recipe. We trained each model with single Nvidia GeForce RTX3090.

+---------------------------+--------------+------------+-----------------+
| Model name | ADE20k | Cityscapes | Pascal-VOC 2007 |
