From 5e4cd1127b5ce58cbc8dab89b0dc47a1d1620d70 Mon Sep 17 00:00:00 2001 From: Andrey Zaytsev Date: Thu, 6 May 2021 15:37:13 +0300 Subject: [PATCH] Integrate UAT fixes (#5517) * Added info on DockerHub CI Framework * Feature/azaytsev/change layout (#3295) * Changes according to feedback comments * Replaced @ref's with html links * Fixed links, added a title page for installing from repos and images, fixed formatting issues * Added links * minor fix * Added DL Streamer to the list of components installed by default * Link fixes * Link fixes * ovms doc fix (#2988) * added OpenVINO Model Server * ovms doc fixes Co-authored-by: Trawinski, Dariusz * Updated openvino_docs.xml * Edits to MO Per findings spreadsheet * macOS changes per issue spreadsheet * Fixes from review spreadsheet Mostly IE_DG fixes * Consistency changes * Make doc fixes from last round of review * integrate changes from baychub/master * Update Intro.md * Update Cutting_Model.md * Update Cutting_Model.md * Fixed link to Customize_Model_Optimizer.md Co-authored-by: Trawinski, Dariusz Co-authored-by: baychub --- docs/HOWTO/Custom_Layers_Guide.md | 72 ++++++------------- docs/IE_DG/Bfloat16Inference.md | 2 +- ...Deep_Learning_Inference_Engine_DevGuide.md | 13 ++-- docs/IE_DG/DynamicBatching.md | 2 +- .../IE_DG/Extensibility_DG/AddingNGraphOps.md | 4 +- .../IE_DG/Extensibility_DG/Custom_ONNX_Ops.md | 2 +- docs/IE_DG/Extensibility_DG/Intro.md | 8 +-- docs/IE_DG/Extensibility_DG/VPU_Kernel.md | 5 +- docs/IE_DG/GPU_Kernels_Tuning.md | 9 ++- docs/IE_DG/InferenceEngine_QueryAPI.md | 2 +- docs/IE_DG/Int8Inference.md | 6 +- ...grate_with_customer_application_new_API.md | 39 +++------- docs/IE_DG/Intro_to_Performance.md | 2 +- docs/IE_DG/Introduction.md | 2 +- docs/IE_DG/Memory_primitives.md | 2 +- docs/IE_DG/ONNX_Support.md | 4 +- docs/IE_DG/ShapeInference.md | 2 +- docs/IE_DG/inference_engine_intro.md | 10 +-- docs/IE_DG/network_state_intro.md | 33 ++++----- docs/IE_DG/supported_plugins/GNA.md | 2 +- .../supported_plugins/GPU_RemoteBlob_API.md | 4 +- docs/IE_DG/supported_plugins/HDDL.md | 12 ++-- docs/IE_DG/supported_plugins/HETERO.md | 19 +++-- docs/IE_DG/supported_plugins/MULTI.md | 36 +++++----- docs/IE_DG/supported_plugins/MYRIAD.md | 2 +- .../supported_plugins/Supported_Devices.md | 20 +++--- docs/IE_DG/supported_plugins/VPU.md | 2 +- .../Deep_Learning_Model_Optimizer_DevGuide.md | 2 +- docs/MO_DG/IR_and_opsets.md | 6 +- docs/MO_DG/Known_Issues_Limitations.md | 2 +- .../prepare_model/Config_Model_Optimizer.md | 2 +- .../Model_Optimization_Techniques.md | 14 ++-- .../prepare_model/Model_Optimizer_FAQ.md | 6 +- .../prepare_model/Prepare_Trained_Model.md | 2 +- .../convert_model/Convert_Model_From_Caffe.md | 19 ++--- .../convert_model/Convert_Model_From_Kaldi.md | 14 ++-- .../convert_model/Convert_Model_From_MxNet.md | 8 +-- .../convert_model/Convert_Model_From_ONNX.md | 6 +- .../Convert_Model_From_TensorFlow.md | 56 +++++++-------- .../convert_model/Converting_Model.md | 4 +- .../convert_model/Converting_Model_General.md | 31 ++++---- .../convert_model/Cutting_Model.md | 31 ++++---- .../IR_suitable_for_INT8_inference.md | 8 +-- .../kaldi_specific/Aspire_Tdnn_Model.md | 2 +- .../mxnet_specific/Convert_GluonCV_Models.md | 8 +-- .../Convert_Style_Transfer_From_MXNet.md | 19 ++--- .../onnx_specific/Convert_DLRM.md | 2 +- .../onnx_specific/Convert_GPT2.md | 2 +- .../pytorch_specific/Convert_F3Net.md | 2 +- .../pytorch_specific/Convert_YOLACT.md | 2 +- .../Convert_CRNN_From_Tensorflow.md | 8 +-- .../Convert_DeepSpeech_From_Tensorflow.md | 21 
+++--- .../Convert_EfficientDet_Models.md | 11 +-- .../Convert_GNMT_From_Tensorflow.md | 4 +- .../Convert_NCF_From_Tensorflow.md | 25 +++---- .../Convert_Object_Detection_API_Models.md | 5 +- .../Convert_WideAndDeep_Family_Models.md | 3 + .../Convert_XLNet_From_Tensorflow.md | 7 +- .../Convert_YOLO_From_Tensorflow.md | 20 +++--- .../Customize_Model_Optimizer.md | 8 +-- ...Net_Model_Optimizer_with_New_Primitives.md | 9 ++- ...odel_Optimizer_with_Caffe_Python_Layers.md | 7 +- ...ing_Model_Optimizer_with_New_Primitives.md | 4 +- .../Legacy_Mode_for_Caffe_Custom_Layers.md | 4 +- .../Subgraph_Replacement_Model_Optimizer.md | 2 +- docs/get_started/get_started_dl_workbench.md | 6 +- docs/get_started/get_started_linux.md | 30 +++++++- docs/get_started/get_started_macos.md | 33 +++++++-- docs/get_started/get_started_windows.md | 24 ++++++- docs/index.md | 28 ++++---- docs/install_guides/PAC_Configure_2019RX.md | 2 +- .../install_guides/deployment-manager-tool.md | 5 +- .../installing-openvino-linux.md | 9 +-- .../installing-openvino-macos.md | 41 ++++++----- .../installing-openvino-raspbian.md | 24 ++++--- .../installing-openvino-windows.md | 8 +-- docs/nGraph_DG/intro.md | 6 +- docs/nGraph_DG/nGraphTransformation.md | 2 +- docs/nGraph_DG/nGraph_basic_concepts.md | 4 +- .../dldt_optimization_guide.md | 2 +- docs/ovsa/ovsa_get_started.md | 2 +- .../object_detection_sample_ssd/README.md | 2 +- .../ie_bridges/python/docs/api_overview.md | 13 ++-- .../sample/hello_classification/README.md | 4 +- .../sample/hello_query_device/README.md | 5 +- .../object_detection_sample_ssd/README.md | 2 +- .../Offline_speech_recognition_demo.md | 6 +- .../Speech_libs_and_demos.md | 10 +-- .../samples/speech_sample/README.md | 29 ++++---- .../fluid/modules/gapi/doc/10-hld-overview.md | 2 +- inference-engine/tools/compile_tool/README.md | 6 +- 91 files changed, 513 insertions(+), 494 deletions(-) diff --git a/docs/HOWTO/Custom_Layers_Guide.md b/docs/HOWTO/Custom_Layers_Guide.md index 1de91356304e38..13590e5d2027ef 100644 --- a/docs/HOWTO/Custom_Layers_Guide.md +++ b/docs/HOWTO/Custom_Layers_Guide.md @@ -51,65 +51,45 @@ To see the operations that are supported by each device plugin for the Inference ### Custom Operation Support for the Model Optimizer -Model Optimizer model conversion pipeline is described in details in "Model Conversion Pipeline" section on the -[Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md). -It is recommended to read that article first for a better understanding of the following material. +Model Optimizer model conversion pipeline is described in detail in "Model Conversion Pipeline" section of [Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md). It is best to read that article first for a better understanding of the following material. -Model Optimizer provides extensions mechanism to support new operations and implement custom model transformations to -generate optimized IR. This mechanism is described in the "Model Optimizer Extensions" section on the +Model Optimizer provides an extensions mechanism to support new operations and implement custom model transformations to generate optimized IR. This mechanism is described in the "Model Optimizer Extensions" section of [Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md). 
-Two types of the Model Optimizer extensions should be implemented to support custom operation at minimum: -1. Operation class for a new operation. This class stores information about the operation, its attributes, shape -inference function, attributes to be saved to an IR and some others internally used attributes. Refer to the -"Model Optimizer Operation" section on the -[Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md) for the -detailed instruction on how to implement it. +Two types of the Model Optimizer extensions should be implemented to support custom operations, at a minimum: +1. Operation class for a new operation. This class stores information about the operation, its attributes, shape inference function, attributes to be saved to an IR and some others internally used attributes. Refer to the "Model Optimizer Operation" section of [Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md) for detailed instructions on how to implement it. 2. Operation attributes extractor. The extractor is responsible for parsing framework-specific representation of the operation and uses corresponding operation class to update graph node attributes with necessary attributes of the -operation. Refer to the "Operation Extractor" section on the -[Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md) for the -detailed instruction on how to implement it. +operation. Refer to the "Operation Extractor" section of +[Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md) for detailed instructions on how to implement it. -> **NOTE:** In some cases you may need to implement some transformation to support the operation. This topic is covered -> in the "Graph Transformation Extensions" section on the -> [Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md). +> **NOTE:** In some cases you may need to implement some transformation to support the operation. This topic is covered in the "Graph Transformation Extensions" section of [Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md). ## Custom Operations Extensions for the Inference Engine -Inference Engine provides extensions mechanism to support new operations. This mechanism is described in the -[Inference Engine Extensibility Mechanism](../IE_DG/Extensibility_DG/Intro.md). +Inference Engine provides extensions mechanism to support new operations. This mechanism is described in [Inference Engine Extensibility Mechanism](../IE_DG/Extensibility_DG/Intro.md). -Each device plugin includes a library of optimized implementations to execute known operations which must be extended to -execute a custom operation. The custom operation extension is implemented according to the target device: +Each device plugin includes a library of optimized implementations to execute known operations which must be extended to execute a custom operation. The custom operation extension is implemented according to the target device: - Custom Operation CPU Extension - A compiled shared library (`.so` or `.dll`) needed by the CPU Plugin for executing the custom operation on a CPU. Refer to the [How to Implement Custom CPU Operations](../IE_DG/Extensibility_DG/CPU_Kernel.md) for more details. 
- Custom Operation GPU Extension - - OpenCL source code (.cl) for the custom operation kernel that will be compiled to execute on the GPU along with a - operation description file (.xml) needed by the GPU Plugin for the custom operation kernel. Refer to the - [How to Implement Custom GPU Operations](../IE_DG/Extensibility_DG/GPU_Kernel.md) for more details. + - OpenCL source code (.cl) for the custom operation kernel that will be compiled to execute on the GPU along with an operation description file (.xml) needed by the GPU Plugin for the custom operation kernel. Refer to the [How to Implement Custom GPU Operations](../IE_DG/Extensibility_DG/GPU_Kernel.md) for more details. - Custom Operation VPU Extension - - OpenCL source code (.cl) for the custom operation kernel that will be compiled to execute on the VPU along with a - operation description file (.xml) needed by the VPU Plugin for the custom operation kernel. Refer to the - [How to Implement Custom Operations for VPU](../IE_DG/Extensibility_DG/VPU_Kernel.md) for more details. + - OpenCL source code (.cl) for the custom operation kernel that will be compiled to execute on the VPU along with an operation description file (.xml) needed by the VPU Plugin for the custom operation kernel. Refer to [How to Implement Custom Operations for VPU](../IE_DG/Extensibility_DG/VPU_Kernel.md) for more details. -Also, it is necessary to implement nGraph custom operation according to the -[Custom nGraph Operation](../IE_DG/Extensibility_DG/AddingNGraphOps.md) so the Inference Engine can read an IR with this -operation and correctly infer output tensors shape and type. +Also, it is necessary to implement nGraph custom operation according to [Custom nGraph Operation](../IE_DG/Extensibility_DG/AddingNGraphOps.md) so the Inference Engine can read an IR with this +operation and correctly infer output tensor shape and type. ## Enabling Magnetic Resonance Image Reconstruction Model -This chapter provides a step-by-step instruction on how to enable the magnetic resonance image reconstruction model -implemented in the [repository](https://github.com/rmsouza01/Hybrid-CS-Model-MRI/) using a custom operation on CPU. The -example is prepared for a model generated from the repository with hash `2ede2f96161ce70dcdc922371fe6b6b254aafcc8`. +This chapter provides step-by-step instructions on how to enable the magnetic resonance image reconstruction model implemented in the [repository](https://github.com/rmsouza01/Hybrid-CS-Model-MRI/) using a custom operation on CPU. The example is prepared for a model generated from the repository with hash `2ede2f96161ce70dcdc922371fe6b6b254aafcc8`. ### Download and Convert the Model to a Frozen TensorFlow\* Model Format -The original pre-trained model is provided in the hdf5 format which is not supported by OpenVINO directly and needs to -be converted to TensorFlow\* frozen model format first. +The original pre-trained model is provided in the hdf5 format which is not supported by OpenVINO directly and needs to be converted to TensorFlow\* frozen model format first. -1. Download repository `https://github.com/rmsouza01/Hybrid-CS-Model-MRI`:
```bash git clone https://github.com/rmsouza01/Hybrid-CS-Model-MRI git checkout 2ede2f96161ce70dcdc922371fe6b6b254aafcc8 @@ -231,15 +211,11 @@ model. The implementation of the Model Optimizer operation should be saved to `m The attribute `inverse` is a flag specifying type of the FFT to apply: forward or inverse. -See the "Model Optimizer Operation" section on the -[Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md) for the -detailed instruction on how to implement the operation. +See the "Model Optimizer Operation" section of [Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md) for detailed instructions on how to implement the operation. Now it is necessary to implement extractor for the "IFFT2D" operation according to the -"Operation Extractor" section on the -[Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md). The -following snippet provides two extractors: one for "IFFT2D", another one for "FFT2D", however only on of them is used -in this example. The implementation should be saved to the file `mo_extensions/front/tf/FFT_ext.py`. +"Operation Extractor" section of [Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md). The +following snippet provides two extractors: one for "IFFT2D", another one for "FFT2D", however only on of them is used in this example. The implementation should be saved to the file `mo_extensions/front/tf/FFT_ext.py`. @snippet FFT_ext.py fft_ext:extractor @@ -255,8 +231,7 @@ consumed with the "Complex" operation to produce a tensor of complex numbers. Th operations can be removed so the "FFT" operation will get a real value tensor encoding complex numbers. To achieve this we implement the front phase transformation which searches for a pattern of two "StridedSlice" operations with specific attributes producing data to "Complex" operation and removes it from the graph. Refer to the -"Pattern-Defined Front Phase Transformations" section on the -[Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md) for more +"Pattern-Defined Front Phase Transformations" section of [Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md) for more information on how this type of transformation works. The code snippet should be saved to the file `mo_extensions/front/tf/Complex.py`. @@ -284,7 +259,7 @@ Now it is possible to convert the model using the following command line: .//mo.py --input_model /wnet_20.pb -b 1 --extensions mo_extensions/ ``` -The sub-graph corresponding to the originally non-supported one is depicted on the image below: +The sub-graph corresponding to the originally non-supported one is depicted in the image below: ![Converted sub-graph](img/converted_subgraph.png) @@ -293,8 +268,7 @@ The sub-graph corresponding to the originally non-supported one is depicted on t ### Inference Engine Extension Implementation Now it is necessary to implement the extension for the CPU plugin with operation "FFT" introduced previously. The code -below is based on the template extension described on the -[Inference Engine Extensibility Mechanism](../IE_DG/Extensibility_DG/Intro.md). +below is based on the template extension described in [Inference Engine Extensibility Mechanism](../IE_DG/Extensibility_DG/Intro.md). 
#### CMake Build File The first step is to create a CMake configuration file which builds the extension. The content of the "CMakeLists.txt" @@ -334,7 +308,7 @@ The last step is to create an extension library "extension.cpp" and "extension.h operation for the CPU plugin. The code of the library is described in the [Extension Library](../IE_DG/Extensibility_DG/Extension.md). ### Building and Running the Custom Extension -In order to build the extension run the following:
+To build the extension, run the following:
```bash mkdir build && cd build source /opt/intel/openvino_2021/bin/setupvars.sh diff --git a/docs/IE_DG/Bfloat16Inference.md b/docs/IE_DG/Bfloat16Inference.md index 136607af8ad435..0461c6ee2b7a3b 100644 --- a/docs/IE_DG/Bfloat16Inference.md +++ b/docs/IE_DG/Bfloat16Inference.md @@ -15,7 +15,7 @@ Preserving the exponent bits keeps BF16 to the same range as the FP32 (~1e-38 to Truncated mantissa leads to occasionally less precision, but according to [investigations](https://cloud.google.com/blog/products/ai-machine-learning/bfloat16-the-secret-to-high-performance-on-cloud-tpus), neural networks are more sensitive to the size of the exponent than the mantissa size. Also, in lots of models, precision is needed close to zero but not so much at the maximum range. Another useful feature of BF16 is possibility to encode INT8 in BF16 without loss of accuracy, because INT8 range completely fits in BF16 mantissa field. It reduces data flow in conversion from INT8 input image data to BF16 directly without intermediate representation in FP32, or in combination of [INT8 inference](Int8Inference.md) and BF16 layers. -See the [Intel's site](https://software.intel.com/sites/default/files/managed/40/8b/bf16-hardware-numerics-definition-white-paper.pdf) for more bfloat16 format details. +See the ["BFLOAT16 – Hardware Numerics Definition" white paper"](https://software.intel.com/sites/default/files/managed/40/8b/bf16-hardware-numerics-definition-white-paper.pdf) for more bfloat16 format details. There are two ways to check if CPU device can support bfloat16 computations for models: 1. Query the instruction set via system `lscpu | grep avx512_bf16` or `cat /proc/cpuinfo | grep avx512_bf16`. diff --git a/docs/IE_DG/Deep_Learning_Inference_Engine_DevGuide.md b/docs/IE_DG/Deep_Learning_Inference_Engine_DevGuide.md index 89997e0f0ce5a1..5fc2b3f910255f 100644 --- a/docs/IE_DG/Deep_Learning_Inference_Engine_DevGuide.md +++ b/docs/IE_DG/Deep_Learning_Inference_Engine_DevGuide.md @@ -1,11 +1,10 @@ # Inference Engine Developer Guide {#openvino_docs_IE_DG_Deep_Learning_Inference_Engine_DevGuide} -> **NOTE:** [Intel® System Studio](https://software.intel.com/en-us/system-studio) is an all-in-one, cross-platform tool suite, purpose-built to simplify system bring-up and improve system and IoT device application performance on Intel® platforms. If you are using the Intel® Distribution of OpenVINO™ with Intel® System Studio, go to [Get Started with Intel® System Studio](https://software.intel.com/en-us/articles/get-started-with-openvino-and-intel-system-studio-2019). +> **NOTE:** [Intel® System Studio](https://software.intel.com/content/www/us/en/develop/tools/oneapi/commercial-base-iot.html) (click "Intel® System Studio Users" tab) is an all-in-one, cross-platform tool suite, purpose-built to simplify system bring-up and improve system and IoT device application performance on Intel® platforms. If you are using the Intel® Distribution of OpenVINO™ with Intel® System Studio, go to [Get Started with Intel® System Studio](https://software.intel.com/en-us/articles/get-started-with-openvino-and-intel-system-studio-2019). -This Guide provides an overview of the Inference Engine describing the typical workflow for performing -inference of a pre-trained and optimized deep learning model and a set of sample applications. +This Guide provides an overview of the Inference Engine describing the typical workflow for performing inference of a pre-trained and optimized deep learning model and a set of sample applications. 
-> **NOTE:** Before you perform inference with the Inference Engine, your models should be converted to the Inference Engine format using the Model Optimizer or built directly in run-time using nGraph API. To learn about how to use Model Optimizer, refer to the [Model Optimizer Developer Guide](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md). To learn about the pre-trained and optimized models delivered with the OpenVINO™ toolkit, refer to [Pre-Trained Models](@ref omz_models_group_intel). +> **NOTE:** Before you perform inference with the Inference Engine, your models should be converted to the Inference Engine format using the Model Optimizer or built directly in runtime using nGraph API. To learn about how to use Model Optimizer, refer to the [Model Optimizer Developer Guide](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md). To learn about the pre-trained and optimized models delivered with the OpenVINO™ toolkit, refer to [Pre-Trained Models](@ref omz_models_group_intel). After you have used the Model Optimizer to create an Intermediate Representation (IR), use the Inference Engine to infer the result for a given input data. @@ -22,7 +21,7 @@ For complete API Reference, see the [Inference Engine API References](./api_refe Inference Engine uses a plugin architecture. Inference Engine plugin is a software component that contains complete implementation for inference on a certain Intel® hardware device: CPU, GPU, VPU, etc. Each plugin implements the unified API and provides additional hardware-specific APIs. ## Modules in the Inference Engine component -### Core Inference Engine Libraries ### +### Core Inference Engine Libraries Your application must link to the core Inference Engine libraries: * Linux* OS: @@ -39,7 +38,7 @@ This library contains the classes to: * Manipulate network information (InferenceEngine::CNNNetwork) * Execute and pass inputs and outputs (InferenceEngine::ExecutableNetwork and InferenceEngine::InferRequest) -### Plugin Libraries to Read a Network Object ### +### Plugin Libraries to Read a Network Object Starting from 2020.4 release, Inference Engine introduced a concept of `CNNNetwork` reader plugins. Such plugins can be automatically dynamically loaded by Inference Engine in runtime depending on file format: * Linux* OS: @@ -49,7 +48,7 @@ Starting from 2020.4 release, Inference Engine introduced a concept of `CNNNetwo - `inference_engine_ir_reader.dll` to read a network from IR - `inference_engine_onnx_reader.dll` to read a network from ONNX model format -### Device-Specific Plugin Libraries ### +### Device-Specific Plugin Libraries For each supported target device, Inference Engine provides a plugin — a DLL/shared library that contains complete implementation for inference on this particular device. The following plugins are available: diff --git a/docs/IE_DG/DynamicBatching.md b/docs/IE_DG/DynamicBatching.md index a05c218b6193e3..67475f0b83f3ba 100644 --- a/docs/IE_DG/DynamicBatching.md +++ b/docs/IE_DG/DynamicBatching.md @@ -1,7 +1,7 @@ Using Dynamic Batching {#openvino_docs_IE_DG_DynamicBatching} ====================== -Dynamic Batching feature allows you+ to dynamically change batch size for inference calls +Dynamic Batching feature allows you to dynamically change batch size for inference calls within preset batch size limit. This feature might be useful when batch size is unknown beforehand, and using extra large batch size is undesired or impossible due to resource limitations. 
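For illustration only, a minimal sketch of the Dynamic Batching flow described above might look like the fragment below. The model path, device, and batch values are placeholders, and the configuration keys are assumed to be the ones declared in `ie_plugin_config.hpp`:

```cpp
#include <inference_engine.hpp>
#include <map>
#include <string>

int main() {
    InferenceEngine::Core core;
    // Placeholder IR path produced by the Model Optimizer
    auto network = core.ReadNetwork("model.xml");
    // The batch size set on the network acts as the upper limit for dynamic batching
    network.setBatchSize(8);

    // Enable dynamic batching for this load
    std::map<std::string, std::string> config = {
        {InferenceEngine::PluginConfigParams::KEY_DYN_BATCH_ENABLED,
         InferenceEngine::PluginConfigParams::YES}};
    auto executable = core.LoadNetwork(network, "CPU", config);

    auto request = executable.CreateInferRequest();
    // ... fill input blobs with the currently valid items ...
    request.SetBatch(4);  // actual batch for this call, must not exceed the preset limit
    request.Infer();
    return 0;
}
```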
diff --git a/docs/IE_DG/Extensibility_DG/AddingNGraphOps.md b/docs/IE_DG/Extensibility_DG/AddingNGraphOps.md index e98edf7e8f0acc..d3b0714ea4496c 100644 --- a/docs/IE_DG/Extensibility_DG/AddingNGraphOps.md +++ b/docs/IE_DG/Extensibility_DG/AddingNGraphOps.md @@ -1,10 +1,10 @@ # Custom nGraph Operation {#openvino_docs_IE_DG_Extensibility_DG_AddingNGraphOps} -Inference Engine Extension API enables you to register operation sets (opsets) with custom nGraph operations to support models with operations which OpenVINO™ does not support out-of-the-box. +Inference Engine Extension API allows you to register operation sets (opsets) with custom nGraph operations to support models with operations which OpenVINO™ does not support out-of-the-box. ## Operation Class -To add your custom nGraph operation, create a new class that extends `ngraph::Op`, which is in turn derived from `ngraph::Node`, the base class for all graph operations in nGraph. Follow the steps below: +To add your custom nGraph operation, create a new class that extends `ngraph::Op`, which is in turn derived from `ngraph::Node`, the base class for all graph operations in nGraph. Follow the steps below to add a custom nGraph operation: 1. Add the `NGRAPH_RTTI_DECLARATION` and `NGRAPH_RTTI_DEFINITION` macros which define a `NodeTypeInfo` object that identifies the type of the operation to the graph users and helps with dynamic type resolution. The type info of an nGraph operation currently consists of a string identifier and a version number, but this may change in the future. diff --git a/docs/IE_DG/Extensibility_DG/Custom_ONNX_Ops.md b/docs/IE_DG/Extensibility_DG/Custom_ONNX_Ops.md index e0cdb7cc584070..252d67df81f99e 100644 --- a/docs/IE_DG/Extensibility_DG/Custom_ONNX_Ops.md +++ b/docs/IE_DG/Extensibility_DG/Custom_ONNX_Ops.md @@ -46,7 +46,7 @@ Here, the `register_operator` function is called in the constructor of Extension The example below demonstrates how to unregister an operator from the destructor of Extension: @snippet template_extension/extension.cpp extension:dtor -> **NOTE**: It is mandatory to unregister a custom ONNX operator if it is defined in a dynamic shared library. +> **REQUIRED**: It is mandatory to unregister a custom ONNX operator if it is defined in a dynamic shared library. ## Requirements for Building with CMake diff --git a/docs/IE_DG/Extensibility_DG/Intro.md b/docs/IE_DG/Extensibility_DG/Intro.md index 525462043411a4..aa2e7d87ba38ea 100644 --- a/docs/IE_DG/Extensibility_DG/Intro.md +++ b/docs/IE_DG/Extensibility_DG/Intro.md @@ -21,16 +21,14 @@ Inference Engine Extension dynamic library contains the following components: - Enables the creation of `ngraph::Function` with unsupported operations. - Provides a shape inference mechanism for custom operations. -> **NOTE**: This documentation is written based on the `Template extension`, which demonstrates extension -development details. Find the complete code of the `Template extension`, which is fully compilable and up-to-date, -at `/docs/template_extension`. +> **NOTE**: This documentation is written based on the `Template extension`, which demonstrates extension development details. Find the complete code of the `Template extension`, which is fully compilable and up-to-date, at `/docs/template_extension`. ## Execution Kernels The Inference Engine workflow involves the creation of custom kernels and either custom or existing operations. -An _Operation_ is a network building block implemented in the training framework, for example, `Convolution` in Caffe*. 
-A _Kernel_ is defined as the corresponding implementation in the Inference Engine. +An _operation_ is a network building block implemented in the training framework, for example, `Convolution` in Caffe*. +A _kernel_ is defined as the corresponding implementation in the Inference Engine. Refer to the [Model Optimizer Extensibility](../../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md) for details on how a mapping between framework operations and Inference Engine kernels is registered. diff --git a/docs/IE_DG/Extensibility_DG/VPU_Kernel.md b/docs/IE_DG/Extensibility_DG/VPU_Kernel.md index ec102b1b51cfdc..033097598317bf 100644 --- a/docs/IE_DG/Extensibility_DG/VPU_Kernel.md +++ b/docs/IE_DG/Extensibility_DG/VPU_Kernel.md @@ -17,8 +17,7 @@ OpenCL support is provided by ComputeAorta*, and is distributed under a license The OpenCL toolchain for the Intel® Neural Compute Stick 2 supports offline compilation only, so first compile OpenCL C code using the standalone `clc` compiler. You can find the compiler binary at `/deployment_tools/tools/cl_compiler`. -> **NOTE:** By design, custom OpenCL layers support any OpenCL kernels written with 1.2 version assumed. It also supports half float -extension and is optimized for this type, because it is a native type for Intel® Movidius™ VPUs. +> **NOTE:** By design, custom OpenCL layers support any OpenCL kernels written with 1.2 version assumed. It also supports half float extension and is optimized for this type, because it is a native type for Intel® Movidius™ VPUs. 1. Prior to running a compilation, make sure that the following variables are set: * `SHAVE_MA2X8XLIBS_DIR=/deployment_tools/tools/cl_compiler/lib/` @@ -224,7 +223,7 @@ Here is a short list of optimization tips: annotate the code with pragmas as appropriate. The `ocl_grn` version with `#‍pragma unroll 4` is up to 50% faster, most of which comes from unrolling the first loop, because LLVM, in general, is better in scheduling 3-stage loops (load-compute-store), while the fist loop `variance += (float)(src_data[c*H*W + y*W + x] * src_data[c*H*W + y*W + x]);` is only 2-stage (load-compute). Pay attention to unrolling such cases first. Unrolling factor is loop-dependent. Choose the smallest number that -still improves performance as an optimum between the kernel size and execution speed. For this specific kernel, changing the unroll factor from `4`to `6` results in the same performance, so unrolling factor equal to 4 is an optimum. For Intel® Neural Compute Stick 2, unrolling is conjugated with the automatic software pipelining for load, store, and compute stages: +still improves performance as an optimum between the kernel size and execution speed. For this specific kernel, changing the unroll factor from `4` to `6` results in the same performance, so unrolling factor equal to 4 is an optimum. For Intel® Neural Compute Stick 2, unrolling is conjugated with the automatic software pipelining for load, store, and compute stages: ```cpp __kernel void ocl_grn(__global const half* restrict src_data, __global half* restrict dst_data, int C, float bias) { diff --git a/docs/IE_DG/GPU_Kernels_Tuning.md b/docs/IE_DG/GPU_Kernels_Tuning.md index 4bbe315e42c2f3..5bb6a8334b2372 100644 --- a/docs/IE_DG/GPU_Kernels_Tuning.md +++ b/docs/IE_DG/GPU_Kernels_Tuning.md @@ -10,11 +10,10 @@ tuning for new kind of models, hardwares or drivers. ## Tuned data -GPU tuning data is saved in JSON format. -File's content is composed of 2 types of attributes and 1 type of value: -1. 
Execution units number - this attribute splits the content into different EU sections. -2. Hash - hashed tuned kernel data. -Key: Array with kernel name and kernel's mode index. +GPU tuning data is saved in JSON format. The file is composed of 2 types of attributes and 1 type of value: +* Execution units number (attribute): splits the content into different EU sections +* Hash (attribute): hashed tuned kernel data +* Key (value): Array with kernel name and kernel's mode index ## Usage diff --git a/docs/IE_DG/InferenceEngine_QueryAPI.md b/docs/IE_DG/InferenceEngine_QueryAPI.md index 60497bbebdf362..8588e00e5ceb62 100644 --- a/docs/IE_DG/InferenceEngine_QueryAPI.md +++ b/docs/IE_DG/InferenceEngine_QueryAPI.md @@ -57,7 +57,7 @@ For documentation about common configuration keys, refer to `ie_plugin_config.hp @snippet snippets/InferenceEngine_QueryAPI2.cpp part2 -A returned value looks as follows: `Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz`. +A returned value appears as follows: `Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz`. > **NOTE**: All metrics have specific type, which is specified during metric instantiation. The list of common device-agnostic metrics can be found in `ie_plugin_config.hpp`. Device specific metrics (for example, for `HDDL`, `MYRIAD` devices) can be found in corresponding plugin folders. diff --git a/docs/IE_DG/Int8Inference.md b/docs/IE_DG/Int8Inference.md index 1f580bbd4e2c1a..917c7836de293b 100644 --- a/docs/IE_DG/Int8Inference.md +++ b/docs/IE_DG/Int8Inference.md @@ -27,7 +27,7 @@ Let's explore quantized [TensorFlow* implementation of ResNet-50](https://github ```sh ./downloader.py --name resnet-50-tf --precisions FP16-INT8 ``` -After that you should quantize model by [Model Quantizer](@ref omz_tools_downloader) tool. +After that you should quantize model by the [Model Quantizer](@ref omz_tools_downloader) tool. ```sh ./quantizer.py --model_dir public/resnet-50-tf --dataset_dir --precisions=FP16-INT8 ``` @@ -35,7 +35,7 @@ The simplest way to infer the model and collect performance counters is [C++ Ben ```sh ./benchmark_app -m resnet-50-tf.xml -d CPU -niter 1 -api sync -report_type average_counters -report_folder pc_report_dir ``` -If you infer the model in the OpenVINO™ CPU plugin and collect performance counters, all operations (except last not quantized SoftMax) are executed in INT8 precision. +If you infer the model with the OpenVINO™ CPU plugin and collect performance counters, all operations (except last not quantized SoftMax) are executed in INT8 precision. ## Low-Precision 8-bit Integer Inference Workflow @@ -46,7 +46,7 @@ For 8-bit integer computations, a model must be quantized. Quantized models can When you pass the quantized IR to the OpenVINO™ plugin, the plugin automatically recognizes it as a quantized model and performs 8-bit inference. Note, if you pass a quantized model to another plugin that does not support 8-bit inference but supports all operations from the model, the model is inferred in precision that this plugin supports. -2. *Run-time stage*. This stage is an internal procedure of the OpenVINO™ plugin. During this stage, the quantized model is loaded to the plugin. The plugin uses `Low Precision Transformation` component to update the model to infer it in low precision: +2. *Runtime stage*. This stage is an internal procedure of the OpenVINO™ plugin. During this stage, the quantized model is loaded to the plugin. 
The plugin uses `Low Precision Transformation` component to update the model to infer it in low precision: - Update `FakeQuantize` layers to have quantized output tensors in low precision range and add dequantization layers to compensate the update. Dequantization layers are pushed through as many layers as possible to have more layers in low precision. After that, most layers have quantized input tensors in low precision range and can be inferred in low precision. Ideally, dequantization layers should be fused in the next `FakeQuantize` layer. - Weights are quantized and stored in `Constant` layers. diff --git a/docs/IE_DG/Integrate_with_customer_application_new_API.md b/docs/IE_DG/Integrate_with_customer_application_new_API.md index 27cc6b7e32ed53..9e35f483717433 100644 --- a/docs/IE_DG/Integrate_with_customer_application_new_API.md +++ b/docs/IE_DG/Integrate_with_customer_application_new_API.md @@ -105,34 +105,21 @@ methods: @snippet snippets/Integrate_with_customer_application_new_API.cpp part7 6) **Prepare input**. You can use one of the following options to prepare input: - * **Optimal way for a single network.** Get blobs allocated by an infer request using `InferenceEngine::InferRequest::GetBlob()` - and feed an image and the input data to the blobs. In this case, input data must be aligned (resized manually) with a - given blob size and have a correct color format. + * **Optimal way for a single network.** Get blobs allocated by an infer request using `InferenceEngine::InferRequest::GetBlob()` and feed an image and the input data to the blobs. In this case, input data must be aligned (resized manually) with a given blob size and have a correct color format. @snippet snippets/Integrate_with_customer_application_new_API.cpp part8 - * **Optimal way for a cascade of networks (output of one network is input for another).** Get output blob from the first - request using `InferenceEngine::InferRequest::GetBlob()` and set it as input for the second request using - `InferenceEngine::InferRequest::SetBlob()`. + * **Optimal way for a cascade of networks (output of one network is input for another).** Get output blob from the first request using `InferenceEngine::InferRequest::GetBlob()` and set it as input for the second request using `InferenceEngine::InferRequest::SetBlob()`. @snippet snippets/Integrate_with_customer_application_new_API.cpp part9 - * **Optimal way to handle ROI (a ROI object located inside of input of one network is input for another).** It is - possible to re-use shared input by several networks. You do not need to allocate separate input blob for a network if - it processes a ROI object located inside of already allocated input of a previous network. For instance, when first - network detects objects on a video frame (stored as input blob) and second network accepts detected bounding boxes - (ROI inside of the frame) as input. - In this case, it is allowed to re-use pre-allocated input blob (used by first network) by second network and just crop - ROI without allocation of new memory using `InferenceEngine::make_shared_blob()` with passing of - `InferenceEngine::Blob::Ptr` and `InferenceEngine::ROI` as parameters. + * **Optimal way to handle ROI (a ROI object located inside of input of one network is input for another).** It is possible to re-use shared input by several networks. You do not need to allocate separate input blob for a network if it processes a ROI object located inside of already allocated input of a previous network. 
For instance, when first network detects objects on a video frame (stored as input blob) and second network accepts detected bounding boxes (ROI inside of the frame) as input. In this case, it is allowed to re-use pre-allocated input blob (used by first network) by second network and just crop ROI without allocation of new memory using `InferenceEngine::make_shared_blob()` with passing of `InferenceEngine::Blob::Ptr` and `InferenceEngine::ROI` as parameters. @snippet snippets/Integrate_with_customer_application_new_API.cpp part10 - Make sure that shared input is kept valid during execution of each network. Otherwise, ROI blob may be corrupted if the - original input blob (that ROI is cropped from) has already been rewritten. +Make sure that shared input is kept valid during execution of each network. Otherwise, ROI blob may be corrupted if the original input blob (that ROI is cropped from) has already been rewritten. - * Allocate input blobs of the appropriate types and sizes, feed an image and the input data to the blobs, and call - `InferenceEngine::InferRequest::SetBlob()` to set these blobs for an infer request: + * Allocate input blobs of the appropriate types and sizes, feed an image and the input data to the blobs, and call `InferenceEngine::InferRequest::SetBlob()` to set these blobs for an infer request: @snippet snippets/Integrate_with_customer_application_new_API.cpp part11 @@ -140,7 +127,7 @@ methods: > **NOTE:** > -> * `SetBlob()` method compares precision and layout of an input blob with ones defined on step 3 and +> * The `SetBlob()` method compares precision and layout of an input blob with the ones defined in step 3 and > throws an exception if they do not match. It also compares a size of the input blob with input > size of the read network. But if input was configured as resizable, you can set an input blob of > any size (for example, any ROI blob). Input resize will be invoked automatically using resize @@ -154,8 +141,7 @@ methods: > corresponding values of the read network. No pre-processing will happen for this blob. If you > call `GetBlob()` after `SetBlob()`, you will get the blob you set in `SetBlob()`. -7) **Do inference** by calling the `InferenceEngine::InferRequest::StartAsync` and `InferenceEngine::InferRequest::Wait` -methods for asynchronous request: +7) **Do inference** by calling the `InferenceEngine::InferRequest::StartAsync` and `InferenceEngine::InferRequest::Wait` methods for asynchronous request: @snippet snippets/Integrate_with_customer_application_new_API.cpp part12 @@ -164,12 +150,10 @@ or by calling the `InferenceEngine::InferRequest::Infer` method for synchronous @snippet snippets/Integrate_with_customer_application_new_API.cpp part13 `StartAsync` returns immediately and starts inference without blocking main thread, `Infer` blocks - main thread and returns when inference is completed. -Call `Wait` for waiting result to become available for asynchronous request. + main thread and returns when inference is completed. Call `Wait` for waiting result to become available for asynchronous request. There are three ways to use it: -* specify maximum duration in milliseconds to block for. The method is blocked until the specified timeout has elapsed, -or the result becomes available, whichever comes first. +* specify maximum duration in milliseconds to block for. The method is blocked until the specified timeout has elapsed, or the result becomes available, whichever comes first. 
* `InferenceEngine::InferRequest::WaitMode::RESULT_READY` - waits until inference result becomes available * `InferenceEngine::InferRequest::WaitMode::STATUS_ONLY` - immediately returns request status.It does not block or interrupts current thread. @@ -182,8 +166,7 @@ While request is ongoing, all its methods except `InferenceEngine::InferRequest: exception. 8) Go over the output blobs and **process the results**. -Note that casting `Blob` to `TBlob` via `std::dynamic_pointer_cast` is not recommended way, -better to access data via `buffer()` and `as()` methods as follows: +Note that casting `Blob` to `TBlob` via `std::dynamic_pointer_cast` is not the recommended way. It's better to access data via the `buffer()` and `as()` methods as follows: @snippet snippets/Integrate_with_customer_application_new_API.cpp part14 @@ -217,7 +200,7 @@ add_executable(${PROJECT_NAME} src/main.cpp) target_link_libraries(${PROJECT_NAME} PRIVATE ${InferenceEngine_LIBRARIES} ${OpenCV_LIBS} ${NGRAPH_LIBRARIES}) ``` 3. **To build your project** using CMake with the default build tools currently available on your machine, execute the following commands: -> **NOTE**: Make sure **Set the Environment Variables** step in [OpenVINO Installation](../../inference-engine/samples/hello_nv12_input_classification/README.md) document is applied to your terminal, otherwise `InferenceEngine_DIR` and `OpenCV_DIR` variables won't be configured properly to pass `find_package` calls. +> **NOTE**: Make sure you set environment variables first by running `/bin/setupvars.sh` (or setupvars.bat for Windows)`. Otherwise the `InferenceEngine_DIR` and `OpenCV_DIR` variables won't be configured properly to pass `find_package` calls. ```sh cd build/ cmake ../project diff --git a/docs/IE_DG/Intro_to_Performance.md b/docs/IE_DG/Intro_to_Performance.md index 6dbdd35cef47da..66fcf48c34f3c5 100644 --- a/docs/IE_DG/Intro_to_Performance.md +++ b/docs/IE_DG/Intro_to_Performance.md @@ -14,7 +14,7 @@ You can find more information, including preferred data types for specific devic ## Lowering Inference Precision Default optimization is used for CPU and implies that inference is made with lower precision if it is possible on a given platform to reach better performance with acceptable range of accuracy. -This approach is used for CPU device if platform supports the AVX512_BF16 instruction. In this case, a regular float32 model is converted to [bfloat16](Bfloat16Inference.md) internal representation and inference is provided with bfloat16 layers usage. +This approach can be used for CPU devices where the platform supports the AVX512_BF16 instruction. In this case, a regular float32 model is converted to [bfloat16](Bfloat16Inference.md) internal representation and inference is provided with bfloat16 layers usage. Below is the example command line to disable this feature on the CPU device with the AVX512_BF16 instruction and execute regular float32. ``` $ benchmark_app -m -enforcebf16=false diff --git a/docs/IE_DG/Introduction.md b/docs/IE_DG/Introduction.md index 6d3d5be66c608b..1682ec7466e6a4 100644 --- a/docs/IE_DG/Introduction.md +++ b/docs/IE_DG/Introduction.md @@ -92,7 +92,7 @@ Refer to a dedicated description about [Intermediate Representation and Operatio ## nGraph Integration OpenVINO toolkit is powered by nGraph capabilities for Graph construction API, Graph transformation engine and Reshape. -nGraph Function is used as an intermediate representation for a model in the run-time underneath the CNNNetwork API. 
+nGraph Function is used as an intermediate representation for a model in the runtime underneath the CNNNetwork API. The conventional representation for CNNNetwork is still available if requested for backward compatibility when some conventional API methods are used. Please refer to the [Overview of nGraph](../nGraph_DG/nGraph_dg.md) describing the details of nGraph representation. diff --git a/docs/IE_DG/Memory_primitives.md b/docs/IE_DG/Memory_primitives.md index a6fed433d3c765..507757ee650ae8 100644 --- a/docs/IE_DG/Memory_primitives.md +++ b/docs/IE_DG/Memory_primitives.md @@ -8,7 +8,7 @@ Using this class you can read and write memory, get information about the memory The right way to create Blob objects with a specific layout is to use constructors with InferenceEngine::TensorDesc.
-InferenceEngige::TensorDesc tdesc(FP32, {1, 3, 227, 227}, InferenceEngine::Layout::NCHW);
+InferenceEngine::TensorDesc tdesc(FP32, {1, 3, 227, 227}, InferenceEngine::Layout::NCHW);
 InferenceEngine::Blob::Ptr blob = InferenceEngine::make_shared_blob<float>(tdesc);
 
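Building on the constructor shown above, a short illustrative continuation (not part of the original page) is sketched below; the `allocate()` and `buffer()` calls reflect common Inference Engine usage for writing data into a freshly created blob:

```cpp
#include <inference_engine.hpp>

// Describe a 1x3x227x227 FP32 tensor in NCHW layout, as in the snippet above
InferenceEngine::TensorDesc tdesc(InferenceEngine::Precision::FP32,
                                  {1, 3, 227, 227},
                                  InferenceEngine::Layout::NCHW);
auto blob = InferenceEngine::make_shared_blob<float>(tdesc);
blob->allocate();                           // reserve memory before accessing it
float* data = blob->buffer().as<float*>();  // raw pointer into the blob memory
data[0] = 0.0f;                             // write input values here
```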
diff --git a/docs/IE_DG/ONNX_Support.md b/docs/IE_DG/ONNX_Support.md index 80afe82df44e32..5b85b9185f0c24 100644 --- a/docs/IE_DG/ONNX_Support.md +++ b/docs/IE_DG/ONNX_Support.md @@ -40,8 +40,8 @@ The described mechanism is the only possibility to read weights from external fi * `const std::string& binPath` * `const Blob::CPtr& weights` -You can find more details about external data mechanism in [ONNX documentation](https://github.com/onnx/onnx/blob/master/docs/ExternalData.md). -To convert a model to use external data feature, you can use [ONNX helpers functions](https://github.com/onnx/onnx/blob/master/onnx/external_data_helper.py). +You can find more details about the external data mechanism in [ONNX documentation](https://github.com/onnx/onnx/blob/master/docs/ExternalData.md). +To convert a model to use the external data feature, you can use [ONNX helper functions](https://github.com/onnx/onnx/blob/master/onnx/external_data_helper.py). **Unsupported types of tensors:** diff --git a/docs/IE_DG/ShapeInference.md b/docs/IE_DG/ShapeInference.md index 93b27c621b50ce..dcc4b5c3f8837b 100644 --- a/docs/IE_DG/ShapeInference.md +++ b/docs/IE_DG/ShapeInference.md @@ -34,7 +34,7 @@ If a model has a hard-coded batch dimension, use `InferenceEngine::CNNNetwork::s Inference Engine takes three kinds of a model description as an input, which are converted into an `InferenceEngine::CNNNetwork` object: 1. [Intermediate Representation (IR)](../MO_DG/IR_and_opsets.md) through `InferenceEngine::Core::ReadNetwork` 2. [ONNX model](../IE_DG/OnnxImporterTutorial.md) through `InferenceEngine::Core::ReadNetwork` -3. [nGraph::Function](../nGraph_DG/nGraph_dg.md) through the constructor of `InferenceEngine::CNNNetwork` +3. [nGraph function](../nGraph_DG/nGraph_dg.md) through the constructor of `InferenceEngine::CNNNetwork` `InferenceEngine::CNNNetwork` keeps an `ngraph::Function` object with the model description internally. The object should have fully defined input shapes to be successfully loaded to the Inference Engine plugins. diff --git a/docs/IE_DG/inference_engine_intro.md b/docs/IE_DG/inference_engine_intro.md index 717813cdf764dd..847c0a59e354d9 100644 --- a/docs/IE_DG/inference_engine_intro.md +++ b/docs/IE_DG/inference_engine_intro.md @@ -100,18 +100,18 @@ The common workflow contains the following steps: 3. **Prepare inputs and outputs format** - After loading the network, specify input and output precision and the layout on the network. For these specification, use the `InferenceEngine::CNNNetwork::getInputsInfo()` and `InferenceEngine::CNNNetwork::getOutputsInfo()`. -4. Pass per device loading configurations specific to this device (`InferenceEngine::Core::SetConfig`), and register extensions to this device (`InferenceEngine::Core::AddExtension`). +4. **Pass per device loading configurations** specific to this device (`InferenceEngine::Core::SetConfig`) and register extensions to this device (`InferenceEngine::Core::AddExtension`). -4. **Compile and Load Network to device** - Use the `InferenceEngine::Core::LoadNetwork()` method with specific device (e.g. `CPU`, `GPU`, etc.) to compile and load the network on the device. Pass in the per-target load configuration for this compilation and load operation. +5. **Compile and Load Network to device** - Use the `InferenceEngine::Core::LoadNetwork()` method with specific device (e.g. `CPU`, `GPU`, etc.) to compile and load the network on the device. Pass in the per-target load configuration for this compilation and load operation. -5. 
**Set input data** - With the network loaded, you have an `InferenceEngine::ExecutableNetwork` object. Use this object to create an `InferenceEngine::InferRequest` in which you signal the input buffers to use for input and output. Specify a device-allocated memory and copy it into the device memory directly, or tell the device to use your application memory to save a copy. +6. **Set input data** - With the network loaded, you have an `InferenceEngine::ExecutableNetwork` object. Use this object to create an `InferenceEngine::InferRequest` in which you signal the input buffers to use for input and output. Specify a device-allocated memory and copy it into the device memory directly, or tell the device to use your application memory to save a copy. -6. **Execute** - With the input and output memory now defined, choose your execution mode: +7. **Execute** - With the input and output memory now defined, choose your execution mode: * Synchronously - `InferenceEngine::InferRequest::Infer()` method. Blocks until inference is completed. * Asynchronously - `InferenceEngine::InferRequest::StartAsync()` method. Check status with the `InferenceEngine::InferRequest::Wait()` method (0 timeout), wait, or specify a completion callback. -7. **Get the output** - After inference is completed, get the output memory or read the memory you provided earlier. Do this with the `InferenceEngine::InferRequest::GetBlob()` method. +8. **Get the output** - After inference is completed, get the output memory or read the memory you provided earlier. Do this with the `InferenceEngine::IInferRequest::GetBlob()` method. Further Reading diff --git a/docs/IE_DG/network_state_intro.md b/docs/IE_DG/network_state_intro.md index e55b081a9dd97a..778cc2c29b3df4 100644 --- a/docs/IE_DG/network_state_intro.md +++ b/docs/IE_DG/network_state_intro.md @@ -7,7 +7,7 @@ This section describes how to work with stateful networks in OpenVINO toolkit, s The section additionally provides small examples of stateful network and code to infer it. -## What is a stateful network +## What is a Stateful Network Several use cases require processing of data sequences. When length of a sequence is known and small enough, we can process it with RNN like networks that contain a cycle inside. But in some cases, like online speech recognition of time series @@ -21,7 +21,7 @@ The section additionally provides small examples of stateful network and code to OpenVINO also contains special API to simplify work with networks with states. State is automatically saved between inferences, and there is a way to reset state when needed. You can also read state or set it to some new value between inferences. -## OpenVINO state representation +## OpenVINO State Representation OpenVINO contains a special abstraction `Variable` to represent a state in a network. There are two operations to work with the state: * `Assign` to save value in state @@ -30,14 +30,13 @@ The section additionally provides small examples of stateful network and code to You can find more details on these operations in [ReadValue specification](../ops/infrastructure/ReadValue_3.md) and [Assign specification](../ops/infrastructure/Assign_3.md). -## Examples of representation of a network with states +## Examples of Representation of a Network with States + +To get a model with states ready for inference, you can convert a model from another framework to IR with Model Optimizer or create an nGraph function (details can be found in [Build nGraph Function section](../nGraph_DG/build_function.md)). 
Let's represent the following graph in both forms: -To get a model with states ready for inference, you can convert a model from another framework to IR with Model Optimizer or create an nGraph function -(details can be found in [Build nGraph Function section](../nGraph_DG/build_function.md)). -Let's represent the following graph in both forms: ![state_network_example] -### Example of IR with state +### Example of IR with State The `bin` file for this graph should contain float 0 in binary form. Content of `xml` is the following. @@ -150,7 +149,7 @@ The `bin` file for this graph should contain float 0 in binary form. Content of ``` -### Example of creating model nGraph API +### Example of Creating Model nGraph API ```cpp #include @@ -182,8 +181,7 @@ sink from `ngraph::Function` after deleting the node from graph with the `delete ## OpenVINO state API - Inference Engine has the `InferRequest::QueryState` method to get the list of states from a network and `IVariableState` interface to operate with states. Below you can find brief description of methods and the workable example of how to use this interface. - is below and next section contains small workable example how this interface can be used. + Inference Engine has the `InferRequest::QueryState` method to get the list of states from a network and `IVariableState` interface to operate with states. Below you can find brief description of methods and the workable example of how to use this interface. * `std::string GetName() const` returns name(variable_id) of according Variable @@ -194,7 +192,7 @@ sink from `ngraph::Function` after deleting the node from graph with the `delete * `Blob::CPtr GetState() const` returns current value of state -## Example of stateful network inference +## Example of Stateful Network Inference Let's take an IR from the previous section example. The example below demonstrates inference of two independent sequences of data. State should be reset between these sequences. @@ -211,7 +209,7 @@ Decsriptions can be found in [Samples Overview](./Samples_Overview.md) [state_network_example]: ./img/state_network_example.png -## LowLatency transformation +## LowLatency Transformation If the original framework does not have a special API for working with states, after importing the model, OpenVINO representation will not contain Assign/ReadValue layers. For example, if the original ONNX model contains RNN operations, IR will contain TensorIterator operations and the values will be obtained only after the execution of whole TensorIterator primitive, intermediate values from each iteration will not be available. To be able to work with these intermediate values of each iteration and receive them with a low latency after each infer request, a special LowLatency transformation was introduced. @@ -221,15 +219,14 @@ LowLatency transformation changes the structure of the network containing [Tenso After applying the transformation, ReadValue operations can receive other operations as an input, as shown in the picture above. These inputs should set the initial value for initialization of ReadValue operations. However, such initialization is not supported in the current State API implementation. Input values are ignored and the initial values for the ReadValue operations are set to zeros unless otherwise specified by the user via [State API](#openvino-state-api). -### Steps to apply LowLatency transformation +### Steps to apply LowLatency Transformation -1. Get CNNNetwork. Any way is acceptable: +1. Get CNNNetwork. 
Either way is acceptable:
-	* [from IR or ONNX model](Integrate_with_customer_application_new_API.md#integration-steps)
+	* [from IR or ONNX model](./Integrate_with_customer_application_new_API.md)
 	* [from nGraph Function](../nGraph_DG/build_function.md)

-2. [Reshape](ShapeInference) CNNNetwork network if necessary
-**Necessary case:** the sequence_lengths dimension of input > 1, it means the TensorIterator layer will have number_iterations > 1. We should reshape the inputs of the network to set sequence_dimension exactly to 1.
+2. [Reshape](ShapeInference.md) the CNNNetwork if necessary. **Necessary case:** the sequence_lengths dimension of the input is greater than 1, which means the TensorIterator layer will have number_iterations > 1. Reshape the inputs of the network to set sequence_dimension to exactly 1.

 Usually, the following exception, which occurs after applying a transform when trying to infer the network in a plugin, indicates the need to apply reshape feature:
 `C++ exception with description "Function is incorrect. Assign and ReadValue operations must be used in pairs in the network."`
 This means that there are several pairs of Assign/ReadValue operations with the same variable_id in the network, operations were inserted into each iteration of the TensorIterator.

@@ -280,7 +277,7 @@
 InferenceEngine::LowLatency(cnnNetwork);

 4. Use state API. See sections [OpenVINO state API](#openvino-state-api), [Example of stateful network inference](#example-of-stateful-network-inference).

-### Known limitations
+### Known Limitations
 1. Parameters connected directly to ReadValues (States) after the transformation is applied are not allowed. Unnecessary parameters may remain on the graph after applying the transformation. The automatic handling of this case inside the transformation is not possible now. Such Parameters should be removed manually from `ngraph::Function` or replaced with a Constant.

diff --git a/docs/IE_DG/supported_plugins/GNA.md b/docs/IE_DG/supported_plugins/GNA.md
index f47297571840a4..7503766fa42d89 100644
--- a/docs/IE_DG/supported_plugins/GNA.md
+++ b/docs/IE_DG/supported_plugins/GNA.md
@@ -73,7 +73,7 @@ Limitations include:

 #### Experimental Support for 2D Convolutions

-The Intel® GNA hardware natively supports only 1D convolution.
+The Intel® GNA hardware natively supports only 1D convolutions.

 However, 2D convolutions can be mapped to 1D when a convolution kernel moves in a single direction. GNA Plugin performs such a transformation for Kaldi `nnet1` convolution. From this perspective, the Intel® GNA hardware convolution operation accepts an `NHWC` input and produces an `NHWC` output. Because OpenVINO™ only supports the `NCHW` layout, you may need to insert `Permute` layers before or after convolutions.
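To make the LowLatency steps and the state API calls described in `network_state_intro.md` above easier to follow end to end, here is a minimal C++ sketch that strings them together. It is an illustration only, not text from the patched pages: the model path, the `CPU` device choice, and the `ie_transformations.hpp` header location are assumptions, and input/output blob handling is omitted.

```cpp
#include <inference_engine.hpp>
#include <ie_transformations.hpp>  // assumed header that declares InferenceEngine::LowLatency

int main() {
    InferenceEngine::Core core;

    // 1. Get a CNNNetwork, here from an IR produced by the Model Optimizer (path is an example).
    InferenceEngine::CNNNetwork network = core.ReadNetwork("model.xml");

    // 2. Reshape if necessary so that the sequence dimension equals 1 (model-specific, omitted here).

    // 3. Apply the LowLatency transformation so states become visible after each infer request.
    InferenceEngine::LowLatency(network);

    // Load the transformed network and create an infer request.
    InferenceEngine::ExecutableNetwork execNet = core.LoadNetwork(network, "CPU");
    InferenceEngine::InferRequest request = execNet.CreateInferRequest();

    // 4. Use the state API: reset all states before the first data sequence ...
    for (auto &&state : request.QueryState()) {
        state.Reset();
    }

    // ... run inference element by element (input filling omitted) ...
    request.Infer();

    // ... and reset the states again before an independent second sequence.
    // GetName(), SetState(), and GetState() are also available to identify, seed, or read a state.
    for (auto &&state : request.QueryState()) {
        state.Reset();
    }
    return 0;
}
```

The only call above that is specific to this workflow is `InferenceEngine::LowLatency(network)`; everything else is the ordinary load-and-infer sequence, which is what makes the state API straightforward to retrofit into an existing application.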
diff --git a/docs/IE_DG/supported_plugins/GPU_RemoteBlob_API.md b/docs/IE_DG/supported_plugins/GPU_RemoteBlob_API.md index 227ce101723283..c24faa3541f6b2 100644 --- a/docs/IE_DG/supported_plugins/GPU_RemoteBlob_API.md +++ b/docs/IE_DG/supported_plugins/GPU_RemoteBlob_API.md @@ -44,8 +44,8 @@ To request the internal context of the given `ExecutableNetwork`, use the `GetCo ## Shared Blob User-Side Wrappers -The classes that implement the `RemoteBlob` interface both are wrappers for native API -memory handles (which can be obtained from them at any moment) and act just like regular OpenVINO™ +The classes that implement the `RemoteBlob` interface are both wrappers for native API +memory handles (which can be obtained from them at any time) and act just like regular OpenVINO™ `Blob` objects. Once you obtain the context, you can use it to compile a new `ExecutableNetwork` or create `RemoteBlob` diff --git a/docs/IE_DG/supported_plugins/HDDL.md b/docs/IE_DG/supported_plugins/HDDL.md index 9154f1d3f3039a..5108e303594938 100644 --- a/docs/IE_DG/supported_plugins/HDDL.md +++ b/docs/IE_DG/supported_plugins/HDDL.md @@ -1,16 +1,12 @@ # HDDL Plugin {#openvino_docs_IE_DG_supported_plugins_HDDL} -## Introducing HDDL Plugin +## Introducing the HDDL Plugin -The Inference Engine HDDL plugin is developed for inference of neural networks on Intel® Vision Accelerator Design with Intel® Movidius™ VPUs which is designed for use cases those require large throughput of deep learning inference. It provides dozens amount of throughput as MYRIAD Plugin. +The Inference Engine HDDL plugin was developed for inference with neural networks on Intel® Vision Accelerator Design with Intel® Movidius™ VPUs. It is designed for use cases that require large throughput for deep learning inference, up to dozens of times more than the MYRIAD Plugin. -## Installation on Linux* OS +## Configuring the HDDL Plugin -For installation instructions, refer to the [Installation Guide for Linux\*](VPU.md). - -## Installation on Windows* OS - -For installation instructions, refer to the [Installation Guide for Windows\*](Supported_Devices.md). +To configure your Intel® Vision Accelerator Design With Intel® Movidius™ on supported OSs, refer to the Steps for Intel® Vision Accelerator Design with Intel® Movidius™ VPUs section in the installation guides for [Linux](../../install_guides/installing-openvino-linux.md) or [Windows](../../install_guides/installing-openvino-windows.md). ## Supported networks diff --git a/docs/IE_DG/supported_plugins/HETERO.md b/docs/IE_DG/supported_plugins/HETERO.md index 9b5f69ce687e95..f2b7521457e294 100644 --- a/docs/IE_DG/supported_plugins/HETERO.md +++ b/docs/IE_DG/supported_plugins/HETERO.md @@ -1,12 +1,12 @@ Heterogeneous Plugin {#openvino_docs_IE_DG_supported_plugins_HETERO} ======= -## Introducing Heterogeneous Plugin +## Introducing the Heterogeneous Plugin The heterogeneous plugin enables computing for inference on one network on several devices. 
-Purposes to execute networks in heterogeneous mode -* To utilize accelerators power and calculate heaviest parts of network on accelerator and execute not supported layers on fallback devices like CPU -* To utilize all available hardware more efficiently during one inference +The purposes of executing networks in heterogeneous mode: +* Utilize the power of accelerators to calculate heaviest parts of the network and execute unsupported layers on fallback devices like the CPU +* Utilize all available hardware more efficiently during one inference The execution through heterogeneous plugin can be divided to two independent steps: * Setting of affinity to layers @@ -14,14 +14,13 @@ The execution through heterogeneous plugin can be divided to two independent ste These steps are decoupled. The setting of affinity can be done automatically using fallback policy or in manual mode. -The fallback automatic policy means greedy behavior and assigns all layers which can be executed on certain device on that device follow priorities. -Automatic policy does not take into account such plugin peculiarities as inability to infer some layers without other special layers placed before of after that layers. It is plugin responsibility to solve such cases. If device plugin does not support subgraph topology constructed by Hetero plugin affinity should be set manually. +The fallback automatic policy causes "greedy" behavior and assigns all layers that can be executed on certain device according to the priorities you specify (for example, `HETERO:GPU,CPU`). +Automatic policy does not take into account plugin peculiarities such as the inability to infer some layers without other special layers placed before or after that layer. The plugin is responsible for solving such cases. If the device plugin does not support the subgraph topology constructed by the Hetero plugin, then you should set affinity manually. Some of the topologies are not friendly to heterogeneous execution on some devices or cannot be executed in such mode at all. -Example of such networks might be networks having activation layers which are not supported on primary device. -If transmitting of data from one part of network to another part in heterogeneous mode takes relatively much time, -then it is not much sense to execute them in heterogeneous mode on these devices. -In this case you can define heaviest part manually and set affinity thus way to avoid sending of data back and forth many times during one inference. +Examples of such networks are networks having activation layers which are not supported on primary device. +If transmitting data from one part of a network to another part in heterogeneous mode takes more time than in normal mode, it may not make sense to execute them in heterogeneous mode. +In this case, you can define heaviest part manually and set the affinity to avoid sending data back and forth many times during one inference. ## Annotation of Layers per Device and Default Fallback Policy Default fallback policy decides which layer goes to which device automatically according to the support in dedicated plugins (FPGA, GPU, CPU, MYRIAD). 
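To make the automatic fallback policy above concrete, the hedged C++ sketch below loads a network on `HETERO:GPU,CPU`, so layers the GPU plugin supports stay on the GPU and the rest fall back to the CPU. The model path is an example, error handling is omitted, and the optional `QueryNetwork` call is shown only as one way to inspect device support before relying on the automatic policy.

```cpp
#include <inference_engine.hpp>

#include <iostream>

int main() {
    InferenceEngine::Core core;
    InferenceEngine::CNNNetwork network = core.ReadNetwork("model.xml");  // example path

    // Optional: see which layers the accelerator plugin reports as supported.
    InferenceEngine::QueryNetworkResult res = core.QueryNetwork(network, "GPU");
    std::cout << "Layers the GPU plugin can execute: " << res.supportedLayersMap.size() << std::endl;

    // Greedy automatic fallback: GPU gets every layer it supports, CPU takes the rest.
    InferenceEngine::ExecutableNetwork execNet = core.LoadNetwork(network, "HETERO:GPU,CPU");

    InferenceEngine::InferRequest request = execNet.CreateInferRequest();
    request.Infer();  // input filling omitted
    return 0;
}
```

Reordering the device list (for example, `HETERO:CPU,GPU`) changes the priorities, which is the main knob the heterogeneous plugin offers when the default split is not what you want; manual per-layer affinities remain available for the cases the section above describes.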
diff --git a/docs/IE_DG/supported_plugins/MULTI.md b/docs/IE_DG/supported_plugins/MULTI.md index f20443ca4c2006..ac161db91478ce 100644 --- a/docs/IE_DG/supported_plugins/MULTI.md +++ b/docs/IE_DG/supported_plugins/MULTI.md @@ -1,48 +1,44 @@ # Multi-Device Plugin {#openvino_docs_IE_DG_supported_plugins_MULTI} -## Introducing Multi-Device Execution +## Introducing the Multi-Device Plugin -Multi-Device plugin automatically assigns inference requests to available computational devices to execute the requests in parallel. -Potential gains are as follows +The Multi-Device plugin automatically assigns inference requests to available computational devices to execute the requests in parallel. Potential gains are as follows: * Improved throughput that multiple devices can deliver (compared to single-device execution) * More consistent performance, since the devices can now share the inference burden (so that if one device is becoming too busy, another device can take more of the load) -Notice that with multi-device the application logic left unchanged, so you don't need to explicitly load the network to every device, -create and balance the inference requests and so on. From the application point of view, this is just another device that handles the actual machinery. +Notice that with multi-device the application logic is left unchanged, so you don't need to explicitly load the network to every device, create and balance the inference requests and so on. From the application point of view, this is just another device that handles the actual machinery. The only thing that is required to leverage performance is to provide the multi-device (and hence the underlying devices) with enough inference requests to crunch. -For example if you were processing 4 cameras on the CPU (with 4 inference requests), you may now want to process more cameras (with more requests in flight) -to keep CPU+GPU busy via multi-device. +For example, if you were processing 4 cameras on the CPU (with 4 inference requests), you may now want to process more cameras (with more requests in flight) to keep CPU+GPU busy via multi-device. -The "setup" of multi-device can be described in three major steps: +The "setup" of Multi-Device can be described in three major steps: * First is configuration of each device as usual (e.g. via conventional SetConfig method) * Second is loading of a network to the Multi-Device plugin created on top of (prioritized) list of the configured devices. This is the only change that you need in your application. * Finally, just like with any other ExecutableNetwork (resulted from LoadNetwork) you just create as many requests as needed to saturate the devices. These steps are covered below in details. -## Defining and Configuring the Multi-Device -Following the OpenVINO notions of "devices", the multi-device has a "MULTI" name. -The only configuration option for the multi-device is prioritized list of devices to use: +## Defining and Configuring the Multi-Device plugin +Following the OpenVINO notions of "devices", the Multi-Device has a "MULTI" name. 
+The only configuration option for the Multi-Device plugin is a prioritized list of devices to use:

| Parameter name | Parameter values | Default | Description |
| :--- | :--- | :--- | :----------------------------------------------------------------------------------------------------------------------------|
| "MULTI_DEVICE_PRIORITIES" | comma-separated device names with no spaces| N/A | Prioritized list of devices |

-You can use name of the configuration directly as a string, or use MultiDeviceConfigParams::KEY_MULTI_DEVICE_PRIORITIES from the multi/multi_device_config.hpp that defines the same string.
+You can use the name of the configuration directly as a string, or use `MultiDeviceConfigParams::KEY_MULTI_DEVICE_PRIORITIES` from `multi/multi_device_config.hpp`, which defines the same string.

Basically, there are three ways to specify the devices to be use by the "MULTI":

@snippet snippets/MULTI0.cpp part0

-Notice that the priorities of the devices can be changed in real-time for the executable network:
+Notice that the priorities of the devices can be changed in real time for the executable network:

@snippet snippets/MULTI1.cpp part1

-Finally, there is a way to specify number of requests that the multi-device will internally keep for each device.
-Say if your original app was running 4 cameras with 4 inference requests now you would probably want to share these 4 requests between 2 devices used in the MULTI. The easiest way is to specify a number of requests for each device using parentheses: "MULTI:CPU(2),GPU(2)" and use the same 4 requests in your app. However, such an explicit configuration is not performance portable and hence not recommended. Instead, the better way is to configure the individual devices and query the resulting number of requests to be used in the application level (see [Configuring the Individual Devices and Creating the Multi-Device On Top](#configuring-the-individual-devices-and-creating-the-multi-device-on-top)).
+Finally, there is a way to specify the number of requests that the multi-device will internally keep for each device. Suppose your original app was running 4 cameras with 4 inference requests. You would probably want to share these 4 requests between 2 devices used in the MULTI. The easiest way is to specify a number of requests for each device using parentheses: "MULTI:CPU(2),GPU(2)" and use the same 4 requests in your app. However, such an explicit configuration is not performance-portable and hence not recommended. Instead, the better way is to configure the individual devices and query the resulting number of requests to be used at the application level (see [Configuring the Individual Devices and Creating the Multi-Device On Top](#configuring-the-individual-devices-and-creating-the-multi-device-on-top)).

## Enumerating Available Devices
-Inference Engine now features a dedicated API to enumerate devices and their capabilities. See [Hello Query Device C++ Sample](../../../inference-engine/samples/hello_query_device/README.md). This is example output of the sample (truncated to the devices' names only):
+Inference Engine now features a dedicated API to enumerate devices and their capabilities. See [Hello Query Device C++ Sample](../../../inference-engine/samples/hello_query_device/README.md). This is example output from the sample (truncated to the devices' names only):

```sh
./hello_query_device
@@ -55,12 +51,12 @@
Available devices:
...
Device: HDDL ``` -Simple programmatic way to enumerate the devices and use with the multi-device is as follows: +A simple programmatic way to enumerate the devices and use with the multi-device is as follows: @snippet snippets/MULTI2.cpp part2 -Beyond trivial "CPU", "GPU", "HDDL" and so on, when multiple instances of a device are available the names are more qualified. -For example this is how two Intel® Movidius™ Myriad™ X sticks are listed with the hello_query_sample: +Beyond the trivial "CPU", "GPU", "HDDL" and so on, when multiple instances of a device are available the names are more qualified. +For example, this is how two Intel® Movidius™ Myriad™ X sticks are listed with the hello_query_sample: ``` ... Device: MYRIAD.1.2-ma2480 @@ -78,7 +74,7 @@ As discussed in the first section, you shall configure each individual device as @snippet snippets/MULTI4.cpp part4 -Alternatively, you can combine all the individual device settings into single config and load that, allowing the multi-device plugin to parse and apply that to the right devices. See code example in the next section. +Alternatively, you can combine all the individual device settings into single config and load that, allowing the Multi-Device plugin to parse and apply that to the right devices. See code example in the next section. Notice that while the performance of accelerators combines really well with multi-device, the CPU+GPU execution poses some performance caveats, as these devices share the power, bandwidth and other resources. For example it is recommended to enable the GPU throttling hint (which save another CPU thread for the CPU inference). See section of the [Using the multi-device with OpenVINO samples and benchmarking the performance](#using-the-multi-device-with-openvino-samples-and-benchmarking-the-performance) below. diff --git a/docs/IE_DG/supported_plugins/MYRIAD.md b/docs/IE_DG/supported_plugins/MYRIAD.md index df9c68da5035a8..8983f20a92535b 100644 --- a/docs/IE_DG/supported_plugins/MYRIAD.md +++ b/docs/IE_DG/supported_plugins/MYRIAD.md @@ -2,7 +2,7 @@ ## Introducing MYRIAD Plugin -The Inference Engine MYRIAD plugin is developed for inference of neural networks on Intel® Neural Compute Stick 2. +The Inference Engine MYRIAD plugin has been developed for inference of neural networks on Intel® Neural Compute Stick 2. ## Installation on Linux* OS diff --git a/docs/IE_DG/supported_plugins/Supported_Devices.md b/docs/IE_DG/supported_plugins/Supported_Devices.md index 514b4bd58a70a7..ed8cabec076f03 100644 --- a/docs/IE_DG/supported_plugins/Supported_Devices.md +++ b/docs/IE_DG/supported_plugins/Supported_Devices.md @@ -21,7 +21,7 @@ Devices similar to the ones we have used for benchmarking can be accessed using ## Supported Configurations The Inference Engine can inference models in different formats with various input and output formats. -This chapter provides supported and optimal configurations for each plugin. +This page shows supported and optimal configurations for each plugin. ### Terminology @@ -36,17 +36,19 @@ This chapter provides supported and optimal configurations for each plugin. | U16 format | 2-byte unsigned integer format | | U8 format | 1-byte unsigned integer format | -NHWC, NCHW - Image data layout. Refers to the representation of batches of images. -NCDHW - Images sequence data layout. +NHWC, NCHW, and NCDHW refer to the representation of batches of images. +* NHWC and NCHW refer to image data layout. +* NCDHW refers to image sequence data layout. 
-* N - Number of images in a batch -* D - Depth. Depend on model it could be spatial or time dimension -* H - Number of pixels in the vertical dimension -* W - Number of pixels in the horizontal dimension -* C - Number of channels +Abbreviations in the support tables are as follows: +* N: Number of images in a batch +* D: Depth. Depend on model it could be spatial or time dimension +* H: Number of pixels in the vertical dimension +* W: Number of pixels in the horizontal dimension +* C: Number of channels CHW, NC, C - Tensor memory layout. -For example, the CHW value at index (c,h,w) is physically located at index (c\*H+h)\*W+w, for others by analogy +For example, the CHW value at index (c,h,w) is physically located at index (c\*H+h)\*W+w, for others by analogy. ### Supported Model Formats diff --git a/docs/IE_DG/supported_plugins/VPU.md b/docs/IE_DG/supported_plugins/VPU.md index 189a23b5a94f20..40cf1ad543bc61 100644 --- a/docs/IE_DG/supported_plugins/VPU.md +++ b/docs/IE_DG/supported_plugins/VPU.md @@ -45,7 +45,7 @@ Certain layers can be merged into Convolution, ReLU, and Eltwise layers accordin > **NOTE**: Application of these rules depends on tensor sizes and resources available. -Layers can be joined when the two conditions below are met: +Layers can be joined only when the two conditions below are met: - Layers are located on topologically independent branches. - Layers can be executed simultaneously on the same hardware units. diff --git a/docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md b/docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md index d21ab41bd5efd5..2aed66ba719934 100644 --- a/docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md +++ b/docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md @@ -68,7 +68,7 @@ Model Optimizer produces an Intermediate Representation (IR) of the network, whi > **NOTE:** > [Intel® System Studio](https://software.intel.com/en-us/system-studio) is an all-in-one, cross-platform tool suite, purpose-built to simplify system bring-up and improve system and IoT device application performance on Intel® platforms. If you are using the Intel® Distribution of OpenVINO™ with Intel® System Studio, go to [Get Started with Intel® System Studio](https://software.intel.com/en-us/articles/get-started-with-openvino-and-intel-system-studio-2019). -## Table of Content +## Table of Contents * [Preparing and Optimizing your Trained Model with Model Optimizer](prepare_model/Prepare_Trained_Model.md) * [Configuring Model Optimizer](prepare_model/Config_Model_Optimizer.md) diff --git a/docs/MO_DG/IR_and_opsets.md b/docs/MO_DG/IR_and_opsets.md index e6a36b3009cdf4..e7cb01f1fc5e19 100644 --- a/docs/MO_DG/IR_and_opsets.md +++ b/docs/MO_DG/IR_and_opsets.md @@ -36,12 +36,12 @@ OpenVINO™ toolkit introduces its own format of graph representation and its ow A graph is represented with two files: an XML file and a binary file. This representation is commonly referred to as the *Intermediate Representation* or *IR*. -XML file describes a network topology using `` tag for an operation node and `` tag is for a data-flow connection. +The XML file describes a network topology using a `` tag for an operation node and an `` tag for a data-flow connection. Each operation has a fixed number of attributes that define operation flavor used for a node. For example, `Convolution` operation has such attributes as `dilation`, `stride`, `pads_begin` and `pads_end`. -XML file doesn't have big constant values, like convolution weights. 
-Instead, it refers to a part of accompanying binary file that stores such values in a binary format. +The XML file doesn't have big constant values, like convolution weights. +Instead, it refers to a part of the accompanying binary file that stores such values in a binary format. Here is an example of a small IR XML file that corresponds to a graph from the previous section: diff --git a/docs/MO_DG/Known_Issues_Limitations.md b/docs/MO_DG/Known_Issues_Limitations.md index 075cbc6e7c333b..ec8897d06c6250 100644 --- a/docs/MO_DG/Known_Issues_Limitations.md +++ b/docs/MO_DG/Known_Issues_Limitations.md @@ -25,7 +25,7 @@ Possible workarounds: LD_PRELOAD= ``` This eliminates multiple loadings of libiomp, and makes all the components use this specific version of OpenMP. -* Alternatively, you can set KMP_DUPLICATE_LIB_OK=TRUE. However, performance degradation or results incorrectness may occur in this case. +* Alternatively, you can set KMP_DUPLICATE_LIB_OK=TRUE. However, performance degradation or incorrect results may occur in this case. ## Old proto compiler breaks protobuf library diff --git a/docs/MO_DG/prepare_model/Config_Model_Optimizer.md b/docs/MO_DG/prepare_model/Config_Model_Optimizer.md index b5b9853b35cb64..cced8949e54b3d 100644 --- a/docs/MO_DG/prepare_model/Config_Model_Optimizer.md +++ b/docs/MO_DG/prepare_model/Config_Model_Optimizer.md @@ -156,7 +156,7 @@ pip3 install -r requirements_onnx.txt These procedures require: * Access to GitHub and the ability to use git commands -* Microsoft Visual Studio\* 2013 for Win64\* +* Microsoft Visual Studio\* 2013 for Win64\* (if using Windows\*) * C/C++ Model Optimizer uses the protobuf library to load trained Caffe models. diff --git a/docs/MO_DG/prepare_model/Model_Optimization_Techniques.md b/docs/MO_DG/prepare_model/Model_Optimization_Techniques.md index 60e61207f24d75..f2ae32a69242d3 100644 --- a/docs/MO_DG/prepare_model/Model_Optimization_Techniques.md +++ b/docs/MO_DG/prepare_model/Model_Optimization_Techniques.md @@ -6,7 +6,7 @@ Optimization offers methods to accelerate inference with the convolution neural ## Linear Operations Fusing -Many convolution neural networks includes `BatchNormalization` and `ScaleShift` layers (for example, Resnet\*, Inception\*) that can be presented as a sequence of linear operations: additions and multiplications. For example ScaleShift layer can be presented as Mul → Add sequence. These layers can be fused into previous `Convolution` or `FullyConnected` layers, except that case when Convolution comes after Add operation (due to Convolution paddings). +Many convolution neural networks includes `BatchNormalization` and `ScaleShift` layers (for example, Resnet\*, Inception\*) that can be presented as a sequence of linear operations: additions and multiplications. For example ScaleShift layer can be presented as Mul → Add sequence. These layers can be fused into previous `Convolution` or `FullyConnected` layers, except when Convolution comes after an Add operation (due to Convolution paddings). ### Usage @@ -16,11 +16,11 @@ In the Model Optimizer, this optimization is turned on by default. To disable it This optimization method consists of three stages: -1. `BatchNormalization` and `ScaleShift` decomposition: on this stage, `BatchNormalization` layer is decomposed to `Mul → Add → Mul → Add` sequence, and `ScaleShift` layer is decomposed to `Mul → Add` layers sequence. +1. 
`BatchNormalization` and `ScaleShift` decomposition: in this stage, `BatchNormalization` layer is decomposed to `Mul → Add → Mul → Add` sequence, and `ScaleShift` layer is decomposed to `Mul → Add` layers sequence. -2. **Linear operations merge**: on this stage we merge sequences of `Mul` and `Add` operations to the single `Mul → Add` instance. - For example, if we have `BatchNormalization → ScaleShift` sequence in our topology, it is replaced with `Mul → Add` (by the first stage). On the next stage, the latter will be replaced with `ScaleShift` layer in case if we have no available `Convolution` or `FullyConnected` layer to fuse into (next). -3. **Linear operations fusion**: on this stage, the tool fuses `Mul` and `Add` operations to `Convolution` or `FullyConnected` layers. Notice that it searches for `Convolution` and `FullyConnected` layers both backward and forward in the graph (except for `Add` operation that cannot be fused to `Convolution` layer in forward direction). +2. **Linear operations merge**: in this stage, the `Mul` and `Add` operations are merged into a single `Mul → Add` instance. + For example, if there is a `BatchNormalization → ScaleShift` sequence in the topology, it is replaced with `Mul → Add` in the first stage. In the next stage, the latter is replaced with a `ScaleShift` layer if there is no available `Convolution` or `FullyConnected` layer to fuse into next. +3. **Linear operations fusion**: in this stage, the tool fuses `Mul` and `Add` operations to `Convolution` or `FullyConnected` layers. Notice that it searches for `Convolution` and `FullyConnected` layers both backward and forward in the graph (except for `Add` operation that cannot be fused to `Convolution` layer in forward direction). ### Usage Examples @@ -36,11 +36,11 @@ ResNet optimization is a specific optimization that applies to Caffe ResNet topo ### Optimization Description -On the picture below, you can see the original and optimized parts of a Caffe ResNet50 model. The main idea of this optimization is to move the stride that is greater than 1 from Convolution layers with the kernel size = 1 to upper Convolution layers. In addition, the Model Optimizer adds a Pooling layer to align the input shape for a Eltwise layer, if it was changed during the optimization. +In the picture below, you can see the original and optimized parts of a Caffe ResNet50 model. The main idea of this optimization is to move the stride that is greater than 1 from Convolution layers with the kernel size = 1 to upper Convolution layers. In addition, the Model Optimizer adds a Pooling layer to align the input shape for a Eltwise layer, if it was changed during the optimization. ![ResNet50 blocks (original and optimized) from Netscope*](../img/optimizations/resnet_optimization.png) -In this example, the stride from the `res3a_branch1` and `res3a_branch2a` Convolution layers moves to the `res2c_branch2b` Convolution layer. Also to align the input shape for `res2c` Eltwise, the optimization inserts the Pooling layer with kernel size = 1 and stride = 2. +In this example, the stride from the `res3a_branch1` and `res3a_branch2a` Convolution layers moves to the `res2c_branch2b` Convolution layer. In addition, to align the input shape for `res2c` Eltwise, the optimization inserts the Pooling layer with kernel size = 1 and stride = 2. 
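For readers who want the arithmetic behind the Linear Operations Fusing stages described above, the identities below spell it out. This is the standard algebra for folding `BatchNormalization`/`ScaleShift` into a preceding `Convolution` or `FullyConnected` layer, stated here for reference rather than as a description of Model Optimizer internals:

\f[
y = \gamma \, \frac{x - \mu}{\sqrt{\sigma^2 + \epsilon}} + \beta = s \, x + t,
\qquad s = \frac{\gamma}{\sqrt{\sigma^2 + \epsilon}},
\qquad t = \beta - s \, \mu
\f]
\f[
W' = s \, W, \qquad b' = s \, b + t
\f]

Here the per-channel multiplier s and addend t come from the decomposition stage, and W', b' are the weights and bias of the layer that absorbs them during the fusion stage, which is why the fused graph contains no separate `Mul`/`Add` nodes.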
* * * diff --git a/docs/MO_DG/prepare_model/Model_Optimizer_FAQ.md b/docs/MO_DG/prepare_model/Model_Optimizer_FAQ.md index f04d413bd1aab1..f9aef04a0a9561 100644 --- a/docs/MO_DG/prepare_model/Model_Optimizer_FAQ.md +++ b/docs/MO_DG/prepare_model/Model_Optimizer_FAQ.md @@ -174,7 +174,7 @@ Model Optimizer tried to infer a specified layer via the Caffe\* framework, howe #### 13. What does the message "Cannot infer shapes due to exception in Caffe" mean? -Model Optimizer tried to infer a custom layer via the Caffe\* framework, however an error occurred, meaning that the model could not be inferred using the Caffe. It might happen if you try to convert the model with some noise weights and biases resulting in problems with layers with dynamic shapes. You should write your own extension for every custom layer you topology might have. For more details, refer to [Extending Model Optimizer with New Primitives](customize_model_optimizer/Extending_Model_Optimizer_with_New_Primitives.md). +Model Optimizer tried to infer a custom layer via the Caffe\* framework, but an error occurred, meaning that the model could not be inferred using Caffe. This might happen if you try to convert the model with some noise weights and biases that result in problems with layers that have dynamic shapes. You should write your own extension for every custom layer you topology might have. For more details, refer to [Model Optimizer Extensibility](customize_model_optimizer/Customize_Model_Optimizer.md). #### 14. What does the message "Cannot infer shape for node {} because there is no Caffe available. Please register python infer function for op or use Caffe for shape inference" mean? @@ -200,7 +200,7 @@ You might have specified negative values with `--mean_file_offsets`. Only positi `--scale` sets a scaling factor for all channels. `--scale_values` sets a scaling factor per each channel. Using both of them simultaneously produces ambiguity, so you must use only one of them. For more information, refer to the Using Framework-Agnostic Conversion Parameters: for Converting a Caffe* Model, Converting a TensorFlow* Model, Converting an MXNet* Model. -#### 20. What does the message "Cannot find prototxt file: for Caffe please specify --input_proto - a protobuf file that stores topology and --input_model that stores pretrained weights" mean? +#### 20. What does the message "Cannot find prototxt file: for Caffe please specify --input_proto - a protobuf file that stores topology and --input_model that stores pre-trained weights" mean? Model Optimizer cannot find a `.prototxt` file for a specified model. By default, it must be located in the same directory as the input model with the same name (except extension). If any of these conditions is not satisfied, use `--input_proto` to specify the path to the `.prototxt` file. @@ -258,7 +258,7 @@ This error occurs when the `SubgraphMatch._add_output_node` function is called m #### 35. What does the message "Unsupported match kind.... Match kinds "points" or "scope" are supported only" mean? -While using configuration file to implement a TensorFlow\* front replacement extension, an incorrect match kind was used. Only `points` or `scope` match kinds are supported. Please, refer to [Sub-Graph Replacement in the Model Optimizer](customize_model_optimizer/Subgraph_Replacement_Model_Optimizer.md) for more details. +While using configuration file to implement a TensorFlow\* front replacement extension, an incorrect match kind was used. Only `points` or `scope` match kinds are supported. 
Please refer to [Model Optimizer Extensibility](customize_model_optimizer/Customize_Model_Optimizer.md) for more details. #### 36. What does the message "Cannot write an event file for the TensorBoard to directory" mean? diff --git a/docs/MO_DG/prepare_model/Prepare_Trained_Model.md b/docs/MO_DG/prepare_model/Prepare_Trained_Model.md index f0dca5283f8b0e..a74d1b789a2f34 100644 --- a/docs/MO_DG/prepare_model/Prepare_Trained_Model.md +++ b/docs/MO_DG/prepare_model/Prepare_Trained_Model.md @@ -25,7 +25,7 @@ However, if you use a topology with layers that are not recognized by the Model ## Model Optimizer Directory Structure -After installation with OpenVINO™ toolkit or Intel® Deep Learning Deployment Toolkit, the Model Optimizer folder has the following structure: +After installation with OpenVINO™ toolkit or Intel® Deep Learning Deployment Toolkit, the Model Optimizer folder has the following structure (some directories omitted for clarity): ``` |-- model_optimizer |-- extensions diff --git a/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_Caffe.md b/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_Caffe.md index 06ae438d9cd3c6..4c257d1689ea23 100644 --- a/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_Caffe.md +++ b/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_Caffe.md @@ -38,10 +38,10 @@ A summary of the steps for optimizing and deploying a model that was trained wit To convert a Caffe\* model: -1. Go to the `/deployment_tools/model_optimizer` directory. -2. Use the `mo.py` script to simply convert a model with the path to the input model `.caffemodel` file: +1. Go to the `$INTEL_OPENVINO_DIR/deployment_tools/model_optimizer` directory. +2. Use the `mo.py` script to simply convert a model, specifying the path to the input model `.caffemodel` file and the path to an output directory with write permissions: ```sh -python3 mo.py --input_model .caffemodel +python3 mo.py --input_model .caffemodel --output_dir ``` Two groups of parameters are available to convert your model: @@ -91,15 +91,16 @@ Caffe*-specific parameters: #### Command-Line Interface (CLI) Examples Using Caffe\*-Specific Parameters -* Launching the Model Optimizer for the [bvlc_alexnet.caffemodel](https://github.com/BVLC/caffe/tree/master/models/bvlc_alexnet) with a specified `prototxt` file. This is needed when the name of the Caffe\* model and the `.prototxt` file are different or are placed in different directories. Otherwise, it is enough to provide only the path to the input `model.caffemodel` file. +* Launching the Model Optimizer for the [bvlc_alexnet.caffemodel](https://github.com/BVLC/caffe/tree/master/models/bvlc_alexnet) with a specified `prototxt` file. This is needed when the name of the Caffe\* model and the `.prototxt` file are different or are placed in different directories. Otherwise, it is enough to provide only the path to the input `model.caffemodel` file. You must have write permissions for the output directory. + ```sh -python3 mo.py --input_model bvlc_alexnet.caffemodel --input_proto bvlc_alexnet.prototxt +python3 mo.py --input_model bvlc_alexnet.caffemodel --input_proto bvlc_alexnet.prototxt --output_dir ``` -* Launching the Model Optimizer for the [bvlc_alexnet.caffemodel](https://github.com/BVLC/caffe/tree/master/models/bvlc_alexnet) with a specified `CustomLayersMapping` file. This is the legacy method of quickly enabling model conversion if your model has custom layers. This requires system Caffe\* on the computer. 
To read more about this, see [Legacy Mode for Caffe* Custom Layers](../customize_model_optimizer/Legacy_Mode_for_Caffe_Custom_Layers.md). +* Launching the Model Optimizer for the [bvlc_alexnet.caffemodel](https://github.com/BVLC/caffe/tree/master/models/bvlc_alexnet) with a specified `CustomLayersMapping` file. This is the legacy method of quickly enabling model conversion if your model has custom layers. This requires the Caffe\* system on the computer. To read more about this, see [Legacy Mode for Caffe* Custom Layers](../customize_model_optimizer/Legacy_Mode_for_Caffe_Custom_Layers.md). Optional parameters without default values and not specified by the user in the `.prototxt` file are removed from the Intermediate Representation, and nested parameters are flattened: ```sh -python3 mo.py --input_model bvlc_alexnet.caffemodel -k CustomLayersMapping.xml --disable_omitting_optional --enable_flattening_nested_params +python3 mo.py --input_model bvlc_alexnet.caffemodel -k CustomLayersMapping.xml --disable_omitting_optional --enable_flattening_nested_params --output_dir ``` This example shows a multi-input model with input layers: `data`, `rois` ``` @@ -121,9 +122,9 @@ layer { } ``` -* Launching the Model Optimizer for a multi-input model with two inputs and providing a new shape for each input in the order they are passed to the Model Optimizer. In particular, for data, set the shape to `1,3,227,227`. For rois, set the shape to `1,6,1,1`: +* Launching the Model Optimizer for a multi-input model with two inputs and providing a new shape for each input in the order they are passed to the Model Optimizer along with a writable output directory. In particular, for data, set the shape to `1,3,227,227`. For rois, set the shape to `1,6,1,1`: ```sh -python3 mo.py --input_model /path-to/your-model.caffemodel --input data,rois --input_shape (1,3,227,227),[1,6,1,1] +python3 mo.py --input_model /path-to/your-model.caffemodel --input data,rois --input_shape (1,3,227,227),[1,6,1,1] --output_dir ``` ## Custom Layer Definition diff --git a/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_Kaldi.md b/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_Kaldi.md index 98bf2b78c3ae42..23fbad2ee08e13 100644 --- a/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_Kaldi.md +++ b/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_Kaldi.md @@ -34,9 +34,9 @@ A summary of the steps for optimizing and deploying a model that was trained wit To convert a Kaldi\* model: 1. Go to the `/deployment_tools/model_optimizer` directory. -2. Use the `mo.py` script to simply convert a model with the path to the input model `.nnet` or `.mdl` file: +2. 
Use the `mo.py` script to simply convert a model with the path to the input model `.nnet` or `.mdl` file and to an output directory where you have write permissions: ```sh -python3 mo.py --input_model .nnet +python3 mo.py --input_model .nnet --output_dir ``` Two groups of parameters are available to convert your model: @@ -58,14 +58,14 @@ Kaldi-specific parameters: ### Examples of CLI Commands -* To launch the Model Optimizer for the wsj_dnn5b_smbr model with the specified `.nnet` file: +* To launch the Model Optimizer for the wsj_dnn5b_smbr model with the specified `.nnet` file and an output directory where you have write permissions: ```sh -python3 mo.py --input_model wsj_dnn5b_smbr.nnet +python3 mo.py --input_model wsj_dnn5b_smbr.nnet --output_dir ``` -* To launch the Model Optimizer for the wsj_dnn5b_smbr model with existing file that contains counts for the last layer with biases: +* To launch the Model Optimizer for the wsj_dnn5b_smbr model with existing file that contains counts for the last layer with biases and a writable output directory: ```sh -python3 mo.py --input_model wsj_dnn5b_smbr.nnet --counts wsj_dnn5b_smbr.counts +python3 mo.py --input_model wsj_dnn5b_smbr.nnet --counts wsj_dnn5b_smbr.counts --output_dir _ ``` * The Model Optimizer normalizes сounts in the following way: \f[ @@ -81,7 +81,7 @@ python3 mo.py --input_model wsj_dnn5b_smbr.nnet --counts wsj_dnn5b_smbr.counts * If you want to remove the last SoftMax layer in the topology, launch the Model Optimizer with the `--remove_output_softmax` flag. ```sh -python3 mo.py --input_model wsj_dnn5b_smbr.nnet --counts wsj_dnn5b_smbr.counts --remove_output_softmax +python3 mo.py --input_model wsj_dnn5b_smbr.nnet --counts wsj_dnn5b_smbr.counts --remove_output_softmax --output_dir _ ``` The Model Optimizer finds the last layer of the topology and removes this layer only if it is a SoftMax layer. diff --git a/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_MxNet.md b/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_MxNet.md index 32b0c5fe95a05a..4b8c1816e8b318 100644 --- a/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_MxNet.md +++ b/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_MxNet.md @@ -1,4 +1,4 @@ -# Converting a MXNet* Model {#openvino_docs_MO_DG_prepare_model_convert_model_Convert_Model_From_MxNet} +# Converting an MXNet* Model {#openvino_docs_MO_DG_prepare_model_convert_model_Convert_Model_From_MxNet} A summary of the steps for optimizing and deploying a model that was trained with the MXNet\* framework: @@ -46,9 +46,9 @@ A summary of the steps for optimizing and deploying a model that was trained wit To convert an MXNet\* model: 1. Go to the `/deployment_tools/model_optimizer` directory. -2. To convert an MXNet\* model contained in a `model-file-symbol.json` and `model-file-0000.params`, run the Model Optimizer launch script `mo.py`, specifying a path to the input model file: +2. 
To convert an MXNet\* model contained in a `model-file-symbol.json` and `model-file-0000.params`, run the Model Optimizer launch script `mo.py`, specifying a path to the input model file and a path to an output directory with write permissions:
```sh
-python3 mo_mxnet.py --input_model model-file-0000.params
+python3 mo_mxnet.py --input_model model-file-0000.params --output_dir 
```

Two groups of parameters are available to convert your model:

@@ -67,7 +67,7 @@ MXNet-specific parameters:
   --nd_prefix_name
                         Prefix name for args.nd and argx.nd files
   --pretrained_model_name
-                        Name of a pretrained MXNet model without extension and epoch
+                        Name of a pre-trained MXNet model without extension and epoch
                         number. This model will be merged with args.nd and argx.nd
                         files
   --save_params_from_nd
diff --git a/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_ONNX.md b/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_ONNX.md
index 561a7f84cf85a3..79f740b55ecdd4 100644
--- a/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_ONNX.md
+++ b/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_ONNX.md
@@ -25,7 +25,7 @@
 | GPT-2 | [model archive](https://github.com/onnx/models/blob/master/text/machine_comprehension/gpt-2/model/gpt2-10.tar.gz) |
 | YOLOv3 | [model archive](https://github.com/onnx/models/blob/master/vision/object_detection_segmentation/yolov3/model/yolov3-10.tar.gz) |

-Listed models are built with the operation set version 8 except the GPT-2 model. Models that are upgraded to higher operation set versions may not be supported.
+Listed models are built with the operation set version 8 except the GPT-2 model (which uses version 10). Models that are upgraded to higher operation set versions may not be supported.

 ## Supported PaddlePaddle* Models via ONNX Conversion
 Starting from the R5 release, the OpenVINO™ toolkit officially supports public PaddlePaddle* models via ONNX conversion.
@@ -60,9 +60,9 @@ The Model Optimizer process assumes you have an ONNX model that was directly dow
 To convert an ONNX\* model:
 1. Go to the `/deployment_tools/model_optimizer` directory.
-2. Use the `mo.py` script to simply convert a model with the path to the input model `.nnet` file:
+2. Use the `mo.py` script to simply convert a model with the path to the input model `.onnx` file and an output directory where you have write permissions:
```sh
-python3 mo.py --input_model .onnx
+python3 mo.py --input_model .onnx --output_dir 
```

 There are no ONNX\* specific parameters, so only [framework-agnostic parameters](Converting_Model_General.md) are available to convert your model.
diff --git a/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_TensorFlow.md b/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_TensorFlow.md
index 275b8e786d0b09..c4721cdead07ee 100644
--- a/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_TensorFlow.md
+++ b/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_TensorFlow.md
@@ -35,7 +35,7 @@
 Detailed information on how to convert models from the TensorFlow 1 Detection Model Zoo is available in the [Converting TensorFlow Object Detection API Models](tf_specific/Convert_Object_Detection_API_Models.md) chapter. The table below contains models from the Object Detection Models zoo that are supported.
@@ -68,7 +68,7 @@
 Detailed information on how to convert models from the TensorFlow 2 Detection Model Zoo is available in the [Converting TensorFlow Object Detection API Models](tf_specific/Convert_Object_Detection_API_Models.md) chapter.
The table below contains models from the Object Detection Models zoo that are supported. @@ -154,13 +154,13 @@ Where `HEIGHT` and `WIDTH` are the input images height and width for which the m | YOLOv4 | [Repo](https://github.com/Ma-Dan/keras-yolo4) | | STN | [Repo](https://github.com/oarriaga/STN.keras) | -* YOLO topologies from DarkNet* can be converted using [instruction](tf_specific/Convert_YOLO_From_Tensorflow.md), -* FaceNet topologies can be converted using [instruction](tf_specific/Convert_FaceNet_From_Tensorflow.md). -* CRNN topologies can be converted using [instruction](tf_specific/Convert_CRNN_From_Tensorflow.md). -* NCF topologies can be converted using [instruction](tf_specific/Convert_NCF_From_Tensorflow.md) -* [GNMT](https://github.com/tensorflow/nmt) topology can be converted using [instruction](tf_specific/Convert_GNMT_From_Tensorflow.md) -* [BERT](https://github.com/google-research/bert) topology can be converted using [this instruction](tf_specific/Convert_BERT_From_Tensorflow.md). -* [XLNet](https://github.com/zihangdai/xlnet) topology can be converted using [this instruction](tf_specific/Convert_XLNet_From_Tensorflow.md). +* YOLO topologies from DarkNet* can be converted using [these instructions](tf_specific/Convert_YOLO_From_Tensorflow.md). +* FaceNet topologies can be converted using [these instructions](tf_specific/Convert_FaceNet_From_Tensorflow.md). +* CRNN topologies can be converted using [these instructions](tf_specific/Convert_CRNN_From_Tensorflow.md). +* NCF topologies can be converted using [these instructions](tf_specific/Convert_NCF_From_Tensorflow.md). +* [GNMT](https://github.com/tensorflow/nmt) topology can be converted using [these instructions](tf_specific/Convert_GNMT_From_Tensorflow.md). +* [BERT](https://github.com/google-research/bert) topology can be converted using [these instructions](tf_specific/Convert_BERT_From_Tensorflow.md). +* [XLNet](https://github.com/zihangdai/xlnet) topology can be converted using [these instructions](tf_specific/Convert_XLNet_From_Tensorflow.md). @@ -176,18 +176,18 @@ There are three ways to store non-frozen TensorFlow models and load them to the If you do not have an inference graph file, refer to [Freezing Custom Models in Python](#freeze-the-tensorflow-model). - To convert such TensorFlow model: + To convert such a TensorFlow model: 1. Go to the `/deployment_tools/model_optimizer` directory - 2. Run the `mo_tf.py` script with the path to the checkpoint file to convert a model: + 2. Run the `mo_tf.py` script with the path to the checkpoint file to convert a model and an output directory where you have write permissions: * If input model is in `.pb` format:
```sh -python3 mo_tf.py --input_model .pb --input_checkpoint +python3 mo_tf.py --input_model .pb --input_checkpoint --output_dir ``` * If input model is in `.pbtxt` format:
```sh -python3 mo_tf.py --input_model .pbtxt --input_checkpoint --input_model_is_text +python3 mo_tf.py --input_model .pbtxt --input_checkpoint --input_model_is_text --output_dir ``` 2. MetaGraph: @@ -201,9 +201,9 @@ python3 mo_tf.py --input_model .pbtxt --input_checkpoint /deployment_tools/model_optimizer` directory - 2. Run the `mo_tf.py` script with a path to the MetaGraph `.meta` file to convert a model:
+ 2. Run the `mo_tf.py` script with a path to the MetaGraph `.meta` file and a writable output directory to convert a model:
```sh -python3 mo_tf.py --input_meta_graph .meta +python3 mo_tf.py --input_meta_graph .meta --output_dir ``` 3. SavedModel format of TensorFlow 1.x and 2.x versions: @@ -213,9 +213,9 @@ python3 mo_tf.py --input_meta_graph .meta To convert such TensorFlow model: 1. Go to the `/deployment_tools/model_optimizer` directory - 2. Run the `mo_tf.py` script with a path to the SavedModel directory to convert a model:
+ 2. Run the `mo_tf.py` script with a path to the SavedModel directory and a writable output directory to convert a model:
```sh -python3 mo_tf.py --saved_model_dir +python3 mo_tf.py --saved_model_dir --output_dir ``` You can convert TensorFlow 1.x SavedModel format in the environment that has a 1.x or 2.x version of TensorFlow. However, TensorFlow 2.x SavedModel format strictly requires the 2.x version of TensorFlow. @@ -252,9 +252,9 @@ Where: To convert a TensorFlow model: 1. Go to the `/deployment_tools/model_optimizer` directory -2. Use the `mo_tf.py` script to simply convert a model with the path to the input model `.pb` file: +2. Use the `mo_tf.py` script to simply convert a model with the path to the input model `.pb` file and a writable output directory: ```sh -python3 mo_tf.py --input_model .pb +python3 mo_tf.py --input_model .pb --output_dir ``` Two groups of parameters are available to convert your model: @@ -306,29 +306,29 @@ TensorFlow*-specific parameters: #### Command-Line Interface (CLI) Examples Using TensorFlow\*-Specific Parameters -* Launching the Model Optimizer for Inception V1 frozen model when model file is a plain text protobuf: +* Launching the Model Optimizer for Inception V1 frozen model when model file is a plain text protobuf, specifying a writable output directory: ```sh -python3 mo_tf.py --input_model inception_v1.pbtxt --input_model_is_text -b 1 +python3 mo_tf.py --input_model inception_v1.pbtxt --input_model_is_text -b 1 --output_dir ``` -* Launching the Model Optimizer for Inception V1 frozen model and update custom sub-graph replacement file `transform.json` with information about input and output nodes of the matched sub-graph. For more information about this feature, refer to [Sub-Graph Replacement in the Model Optimizer](../customize_model_optimizer/Subgraph_Replacement_Model_Optimizer.md). +* Launching the Model Optimizer for Inception V1 frozen model and update custom sub-graph replacement file `transform.json` with information about input and output nodes of the matched sub-graph, specifying a writable output directory. For more information about this feature, refer to [Sub-Graph Replacement in the Model Optimizer](../customize_model_optimizer/Subgraph_Replacement_Model_Optimizer.md). ```sh -python3 mo_tf.py --input_model inception_v1.pb -b 1 --tensorflow_custom_operations_config_update transform.json +python3 mo_tf.py --input_model inception_v1.pb -b 1 --tensorflow_custom_operations_config_update transform.json --output_dir ``` * Launching the Model Optimizer for Inception V1 frozen model and use custom sub-graph replacement file `transform.json` for model conversion. For more information about this feature, refer to [Sub-Graph Replacement in the Model Optimizer](../customize_model_optimizer/Subgraph_Replacement_Model_Optimizer.md). ```sh -python3 mo_tf.py --input_model inception_v1.pb -b 1 --transformations_config transform.json +python3 mo_tf.py --input_model inception_v1.pb -b 1 --transformations_config transform.json --output_dir ``` * Launching the Model Optimizer for Inception V1 frozen model and dump information about the graph to TensorBoard log dir `/tmp/log_dir` ```sh -python3 mo_tf.py --input_model inception_v1.pb -b 1 --tensorboard_logdir /tmp/log_dir +python3 mo_tf.py --input_model inception_v1.pb -b 1 --tensorboard_logdir /tmp/log_dir --output_dir ``` * Launching the Model Optimizer for a model with custom TensorFlow operations (refer to the [TensorFlow* documentation](https://www.tensorflow.org/extend/adding_an_op)) implemented in C++ and compiled into the shared library `my_custom_op.so`. 
Model Optimizer falls back to TensorFlow to infer output shape of operations implemented in the library if a custom TensorFlow operation library is provided. If it is not provided, a custom operation with an inference function is needed. For more information about custom operations, refer to the [Extending the Model Optimizer with New Primitives](../customize_model_optimizer/Extending_Model_Optimizer_with_New_Primitives.md). ```sh -python3 mo_tf.py --input_model custom_model.pb --tensorflow_custom_layer_libraries ./my_custom_op.so +python3 mo_tf.py --input_model custom_model.pb --tensorflow_custom_layer_libraries ./my_custom_op.so --output_dir ``` @@ -343,9 +343,9 @@ Below are the instructions on how to convert each of them. A model in the SavedModel format consists of a directory with a `saved_model.pb` file and two subfolders: `variables` and `assets`. To convert such a model: 1. Go to the `/deployment_tools/model_optimizer` directory. -2. Run the `mo_tf.py` script with a path to the SavedModel directory: +2. Run the `mo_tf.py` script with a path to the SavedModel directory and a writable output directory: ```sh -python3 mo_tf.py --saved_model_dir +python3 mo_tf.py --saved_model_dir --output_dir ``` TensorFlow* 2 SavedModel format strictly requires the 2.x version of TensorFlow installed in the diff --git a/docs/MO_DG/prepare_model/convert_model/Converting_Model.md b/docs/MO_DG/prepare_model/convert_model/Converting_Model.md index 2df7773b8ad57d..ed6451a76322d6 100644 --- a/docs/MO_DG/prepare_model/convert_model/Converting_Model.md +++ b/docs/MO_DG/prepare_model/convert_model/Converting_Model.md @@ -1,9 +1,9 @@ # Converting a Model to Intermediate Representation (IR) {#openvino_docs_MO_DG_prepare_model_convert_model_Converting_Model} Use the mo.py script from the `/deployment_tools/model_optimizer` directory to run the Model Optimizer and convert the model to the Intermediate Representation (IR). -The simplest way to convert a model is to run mo.py with a path to the input model file: +The simplest way to convert a model is to run mo.py with a path to the input model file and an output directory where you have write permissions: ```sh -python3 mo.py --input_model INPUT_MODEL +python3 mo.py --input_model INPUT_MODEL --output_dir ``` > **NOTE**: Some models require using additional arguments to specify conversion parameters, such as `--scale`, `--scale_values`, `--mean_values`, `--mean_file`. To learn about when you need to use these parameters, refer to [Converting a Model Using General Conversion Parameters](Converting_Model_General.md). 
diff --git a/docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md b/docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md index 82bcd133bb815f..2d267cda3e7172 100644 --- a/docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md +++ b/docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md @@ -1,11 +1,12 @@ # Converting a Model Using General Conversion Parameters {#openvino_docs_MO_DG_prepare_model_convert_model_Converting_Model_General} -To simply convert a model trained by any supported framework, run the Model Optimizer launch script ``mo.py`` with -specifying a path to the input model file: +To simply convert a model trained by any supported framework, run the Model Optimizer launch script ``mo.py`` specifying a path to the input model file and an output directory where you have write permissions: ```sh -python3 mo.py --input_model INPUT_MODEL +python3 mo.py --input_model INPUT_MODEL --output_dir ``` +The script is in `$INTEL_OPENVINO_DIR/deployment_tools/model_optimizer/`. The output directory must have write permissions, so you can run mo.py from the output directory or specify an output path with the `--output_dir` option. + > **NOTE:** The color channel order (RGB or BGR) of an input data should match the channel order of the model training dataset. If they are different, perform the `RGB<->BGR` conversion specifying the command-line parameter: `--reverse_input_channels`. Otherwise, inference results may be incorrect. For details, refer to [When to Reverse Input Channels](#when_to_reverse_input_channels). To adjust the conversion process, you can also use the general (framework-agnostic) parameters: @@ -157,7 +158,7 @@ If both mean and scale values are specified, the mean is subtracted first and th There is no a universal recipe for determining the mean/scale values for a particular model. The steps below could help to determine them: * Read the model documentation. Usually the documentation describes mean/scale value if the pre-processing is required. * Open the example script/application executing the model and track how the input data is read and passed to the framework. -* Open the model in a visualization tool and check for layers performing subtraction or multiplication (like `Sub`, `Mul`, `ScaleShift`, `Eltwise` etc) of the input data. If such layers exist, the pre-processing is most probably the part of the model. +* Open the model in a visualization tool and check for layers performing subtraction or multiplication (like `Sub`, `Mul`, `ScaleShift`, `Eltwise` etc) of the input data. If such layers exist, pre-processing is probably part of the model. ## When to Specify Input Shapes There are situations when the input data shape for the model is not fixed, like for the fully-convolutional neural networks. In this case, for example, TensorFlow\* models contain `-1` values in the `shape` attribute of the `Placeholder` operation. Inference Engine does not support input layers with undefined size, so if the input shapes are not defined in the model, the Model Optimizer fails to convert the model. The solution is to provide the input shape(s) using the `--input` or `--input_shape` command line parameter for all input(s) of the model or provide the batch size using the `-b` command line parameter if the model contains just one input with undefined batch size only. In the latter case, the `Placeholder` shape for the TensorFlow\* model looks like this `[-1, 224, 224, 3]`. 
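As a one-line summary of the mean/scale parameters discussed earlier on this page (and of the note that the mean is subtracted before the scale is applied), the input pre-processing these parameters describe amounts to the following per-channel operation, where the mean comes from `--mean_values`/`--mean_file` and the scale from `--scale`/`--scale_values`; this restates the documented behavior rather than adding to it:

\f[
x' = \frac{x - \mathrm{mean}}{\mathrm{scale}}
\f]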
@@ -173,55 +174,55 @@ Resulting Intermediate Representation will not be resizable with the help of Inf Launch the Model Optimizer for the Caffe bvlc_alexnet model with debug log level: ```sh -python3 mo.py --input_model bvlc_alexnet.caffemodel --log_level DEBUG +python3 mo.py --input_model bvlc_alexnet.caffemodel --log_level DEBUG --output_dir ``` Launch the Model Optimizer for the Caffe bvlc_alexnet model with the output IR called `result.*` in the specified `output_dir`: ```sh -python3 mo.py --input_model bvlc_alexnet.caffemodel --model_name result --output_dir /../../models/ +python3 mo.py --input_model bvlc_alexnet.caffemodel --model_name result --output_dir /../../models/ ``` Launch the Model Optimizer for the Caffe bvlc_alexnet model with one input with scale values: ```sh -python3 mo.py --input_model bvlc_alexnet.caffemodel --scale_values [59,59,59] +python3 mo.py --input_model bvlc_alexnet.caffemodel --scale_values [59,59,59] --output_dir ``` Launch the Model Optimizer for the Caffe bvlc_alexnet model with multiple inputs with scale values: ```sh -python3 mo.py --input_model bvlc_alexnet.caffemodel --input data,rois --scale_values [59,59,59],[5,5,5] +python3 mo.py --input_model bvlc_alexnet.caffemodel --input data,rois --scale_values [59,59,59],[5,5,5] --output_dir ``` Launch the Model Optimizer for the Caffe bvlc_alexnet model with multiple inputs with scale and mean values specified for the particular nodes: ```sh -python3 mo.py --input_model bvlc_alexnet.caffemodel --input data,rois --mean_values data[59,59,59] --scale_values rois[5,5,5] +python3 mo.py --input_model bvlc_alexnet.caffemodel --input data,rois --mean_values data[59,59,59] --scale_values rois[5,5,5] --output_dir ``` Launch the Model Optimizer for the Caffe bvlc_alexnet model with specified input layer, overridden input shape, scale 5, batch 8 and specified name of an output operation: ```sh -python3 mo.py --input_model bvlc_alexnet.caffemodel --input "data[1 3 224 224]" --output pool5 -s 5 -b 8 +python3 mo.py --input_model bvlc_alexnet.caffemodel --input "data[1 3 224 224]" --output pool5 -s 5 -b 8 --output_dir ``` Launch the Model Optimizer for the Caffe bvlc_alexnet model with disabled fusing for linear operations to Convolution and grouped convolutions: ```sh -python3 mo.py --input_model bvlc_alexnet.caffemodel --disable_fusing --disable_gfusing +python3 mo.py --input_model bvlc_alexnet.caffemodel --disable_fusing --disable_gfusing --output_dir ``` Launch the Model Optimizer for the Caffe bvlc_alexnet model with reversed input channels order between RGB and BGR, specified mean values to be used for the input image per channel and specified data type for input tensor values: ```sh -python3 mo.py --input_model bvlc_alexnet.caffemodel --reverse_input_channels --mean_values [255,255,255] --data_type FP16 +python3 mo.py --input_model bvlc_alexnet.caffemodel --reverse_input_channels --mean_values [255,255,255] --data_type FP16 --output_dir ``` Launch the Model Optimizer for the Caffe bvlc_alexnet model with extensions listed in specified directories, specified mean_images binaryproto. file For more information about extensions, please refer to [this](../customize_model_optimizer/Extending_Model_Optimizer_with_New_Primitives.md) page. 
```sh -python3 mo.py --input_model bvlc_alexnet.caffemodel --extensions /home/,/some/other/path/ --mean_file /path/to/binaryproto +python3 mo.py --input_model bvlc_alexnet.caffemodel --extensions /home/,/some/other/path/ --mean_file /path/to/binaryproto --output_dir ``` Launch the Model Optimizer for TensorFlow* FaceNet* model with a placeholder freezing value. It replaces the placeholder with a constant layer that contains the passed value. For more information about FaceNet conversion, please refer to [this](tf_specific/Convert_FaceNet_From_Tensorflow.md) page ```sh -python3 mo.py --input_model FaceNet.pb --input "phase_train->False" +python3 mo.py --input_model FaceNet.pb --input "phase_train->False" --output_dir ``` Launch the Model Optimizer for any model with a placeholder freezing tensor of values. @@ -231,5 +232,5 @@ Tensor here is represented in square brackets with each value separated from ano If data type is set in the model, this tensor will be reshaped to a placeholder shape and casted to placeholder data type. Otherwise, it will be casted to data type passed to `--data_type` parameter (by default, it is FP32). ```sh -python3 mo.py --input_model FaceNet.pb --input "placeholder_layer_name->[0.1 1.2 2.3]" +python3 mo.py --input_model FaceNet.pb --input "placeholder_layer_name->[0.1 1.2 2.3]" --output_dir ``` diff --git a/docs/MO_DG/prepare_model/convert_model/Cutting_Model.md b/docs/MO_DG/prepare_model/convert_model/Cutting_Model.md index a4bb4e98017276..d86368a9f708f5 100644 --- a/docs/MO_DG/prepare_model/convert_model/Cutting_Model.md +++ b/docs/MO_DG/prepare_model/convert_model/Cutting_Model.md @@ -37,10 +37,13 @@ In the TensorBoard, it looks the following way together with some predecessors: ![TensorBoard with predecessors](../../img/inception_v1_std_output.png) -Convert this model: +Convert this model and put the results in a writable output directory: ```sh -python3 mo.py --input_model=inception_v1.pb -b 1 +${INTEL_OPENVINO_DIR}/deployment_tools/model_optimizer +python3 mo.py --input_model inception_v1.pb -b 1 --output_dir ``` +(The other examples on this page assume that you first cd to the `model_optimizer` directory and add the `--output_dir` argument with a directory where you have write permissions.) + The output `.xml` file with an Intermediate Representation contains the `Input` layer among other layers in the model: ```xml @@ -78,9 +81,9 @@ The last layer in the model is `InceptionV1/Logits/Predictions/Reshape_1`, which ``` Due to automatic identification of inputs and outputs, you do not need to provide the `--input` and `--output` options to convert the whole model. The following commands are equivalent for the Inception V1 model: ```sh -python3 mo.py --input_model=inception_v1.pb -b 1 +python3 mo.py --input_model inception_v1.pb -b 1 --output_dir -python3 mo.py --input_model=inception_v1.pb -b 1 --input=input --output=InceptionV1/Logits/Predictions/Reshape_1 +python3 mo.py --input_model inception_v1.pb -b 1 --input input --output InceptionV1/Logits/Predictions/Reshape_1 --output_dir ``` The Intermediate Representations are identical for both conversions. The same is true if the model has multiple inputs and/or outputs. @@ -96,7 +99,7 @@ If you want to cut your model at the end, you have the following options: 1. 
The following command cuts off the rest of the model after the `InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu`, making this node the last in the model: ```sh -python3 mo.py --input_model=inception_v1.pb -b 1 --output=InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu +python3 mo.py --input_model inception_v1.pb -b 1 --output=InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu --output_dir ``` The resulting Intermediate Representation has three layers: ```xml @@ -140,7 +143,7 @@ python3 mo.py --input_model=inception_v1.pb -b 1 --output=InceptionV1/InceptionV 2. The following command cuts the edge that comes from 0 output port of the `InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu` and the rest of the model, making this node the last one in the model: ```sh -python3 mo.py --input_model=inception_v1.pb -b 1 --output=InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu:0 +python3 mo.py --input_model inception_v1.pb -b 1 --output InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu:0 --output_dir ``` The resulting Intermediate Representation has three layers, which are the same as in the previous case: ```xml @@ -184,7 +187,7 @@ python3 mo.py --input_model=inception_v1.pb -b 1 --output=InceptionV1/InceptionV 3. The following command cuts the edge that comes to 0 input port of the `InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu` and the rest of the model including `InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu`, deleting this node and making the previous node `InceptionV1/InceptionV1/Conv2d_1a_7x7/Conv2D` the last in the model: ```sh -python3 mo.py --input_model=inception_v1.pb -b 1 --output=0:InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu +python3 mo.py --input_model inception_v1.pb -b 1 --output=0:InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu --output_dir ``` The resulting Intermediate Representation has two layers, which are the same as the first two layers in the previous case: ```xml @@ -222,7 +225,7 @@ If you want to go further and cut the beginning of the model, leaving only the ` 1. You can use the following command line, where `--input` and `--output` specify the same node in the graph: ```sh -python3 mo.py --input_model=inception_v1.pb -b 1 --output=InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu --input=InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu +python3 mo.py --input_model=inception_v1.pb -b 1 --output InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu --input InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu --output_dir ``` The resulting Intermediate Representation looks as follows: ```xml @@ -254,7 +257,7 @@ Even though `--input_shape` is not specified in the command line, the shapes for 2. You can cut edge incoming to layer by port number. To specify incoming port use notation `--input=port:input_node`. So, to cut everything before `ReLU` layer, cut edge incoming in port 0 of `InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu` node: ```sh -python3 mo.py --input_model=inception_v1.pb -b 1 --input=0:InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu --output=InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu +python3 mo.py --input_model inception_v1.pb -b 1 --input 0:InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu --output InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu --output_dir ``` The resulting Intermediate Representation looks as follows: ```xml @@ -286,7 +289,7 @@ Even though `--input_shape` is not specified in the command line, the shapes for 3. You can cut edge outcoming from layer by port number. To specify outcoming port use notation `--input=input_node:port`. 
So, to cut everything before `ReLU` layer, cut edge from `InceptionV1/InceptionV1/Conv2d_1a_7x7/BatchNorm/batchnorm/add_1` node to `ReLU`: ```sh -python3 mo.py --input_model=inception_v1.pb -b 1 --input=InceptionV1/InceptionV1/Conv2d_1a_7x7/BatchNorm/batchnorm/add_1:0 --output=InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu +python3 mo.py --input_model inception_v1.pb -b 1 --input InceptionV1/InceptionV1/Conv2d_1a_7x7/BatchNorm/batchnorm/add_1:0 --output InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu --output_dir ``` The resulting Intermediate Representation looks as follows: ```xml @@ -317,7 +320,7 @@ python3 mo.py --input_model=inception_v1.pb -b 1 --input=InceptionV1/InceptionV1 The input shape can be overridden with `--input_shape`. In this case, the shape is applied to the node referenced in `--input`, not to the original `Placeholder` in the model. For example, this command line ```sh -python3 mo.py --input_model=inception_v1.pb --input_shape=[1,5,10,20] --output=InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu --input=InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu +python3 mo.py --input_model inception_v1.pb --input_shape=[1,5,10,20] --output InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu --input InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu --output_dir ``` gives the following shapes in the `Input` and `ReLU` layers: @@ -366,14 +369,14 @@ There are operations that contain more than one input ports. In the example cons Following this behavior, the Model Optimizer creates an `Input` layer for port 0 only, leaving port 1 as a constant. So the result of: ```sh -python3 mo.py --input_model=inception_v1.pb -b 1 --input=InceptionV1/InceptionV1/Conv2d_1a_7x7/convolution +python3 mo.py --input_model inception_v1.pb -b 1 --input InceptionV1/InceptionV1/Conv2d_1a_7x7/convolution --output_dir ``` is identical to the result of conversion of the model as a whole, because this convolution is the first executable operation in Inception V1. Different behavior occurs when `--input_shape` is also used as an attempt to override the input shape: ```sh -python3 mo.py --input_model=inception_v1.pb--input=InceptionV1/InceptionV1/Conv2d_1a_7x7/convolution --input_shape=[1,224,224,3] +python3 mo.py --input_model inception_v1.pb --input InceptionV1/InceptionV1/Conv2d_1a_7x7/convolution --input_shape [1,224,224,3] --output_dir ``` An error occurs (for more information, see FAQ #30): ```sh @@ -385,5 +388,5 @@ In this case, when `--input_shape` is specified and the node contains multiple i The correct command line is: ```sh -python3 mo.py --input_model=inception_v1.pb --input=0:InceptionV1/InceptionV1/Conv2d_1a_7x7/convolution --input_shape=[1,224,224,3] +python3 mo.py --input_model inception_v1.pb --input 0:InceptionV1/InceptionV1/Conv2d_1a_7x7/convolution --input_shape=[1,224,224,3] --output_dir ``` diff --git a/docs/MO_DG/prepare_model/convert_model/IR_suitable_for_INT8_inference.md b/docs/MO_DG/prepare_model/convert_model/IR_suitable_for_INT8_inference.md index 50b0020ee2f161..eda5d768c47fed 100644 --- a/docs/MO_DG/prepare_model/convert_model/IR_suitable_for_INT8_inference.md +++ b/docs/MO_DG/prepare_model/convert_model/IR_suitable_for_INT8_inference.md @@ -5,7 +5,7 @@ Inference Engine CPU plugin can infer models in the 8-bit integer (INT8) precision. For details, refer to [INT8 inference on the CPU](../../../IE_DG/Int8Inference.md). -Intermediate Representation (IR) should be specifically formed to be suitable for the INT8 inference. +Intermediate Representation (IR) should be specifically formed to be suitable for INT8 inference.
Such an IR is called an INT8 IR and you can generate it in two ways: - [Quantize model with the Post-Training Optimization tool](@ref pot_README) - Use the Model Optimizer for TensorFlow\* pre-TFLite models (`.pb` model file with `FakeQuantize*` operations) @@ -18,11 +18,11 @@ To execute the `Convolution` operation in INT8 on CPU, both data and weight inpu ![](../../img/expanded_int8_Convolution_weights.png) INT8 IR is also suitable for FP32 and FP16 inference if a chosen plugin supports all operations of the IR, because the only difference between an INT8 IR and FP16 or FP32 IR is the existence of `FakeQuantize` in the INT8 IR. -Plugins with the INT8 inference support recognize these sub-graphs and quantize them during the inference time. -Plugins without the INT8 support execute all operations, including `FakeQuantize`, as is in the FP32 or FP16 precision. +Plugins with INT8 inference support recognize these sub-graphs and quantize them during the inference time. +Plugins without INT8 support execute all operations, including `FakeQuantize`, as is in the FP32 or FP16 precision. Accordingly, the presence of FakeQuantize operations in the IR is a recommendation for a plugin on how to quantize particular operations in the model. -If capable, a plugin accepts the recommendation and performs the INT8 inference, otherwise the plugin ignores the recommendation and executes a model in the floating-point precision. +If capable, a plugin accepts the recommendation and performs INT8 inference, otherwise the plugin ignores the recommendation and executes a model in the floating-point precision. ## Compressed INT8 Weights diff --git a/docs/MO_DG/prepare_model/convert_model/kaldi_specific/Aspire_Tdnn_Model.md b/docs/MO_DG/prepare_model/convert_model/kaldi_specific/Aspire_Tdnn_Model.md index b4e0ea06651ccf..f6709865b5cfaf 100644 --- a/docs/MO_DG/prepare_model/convert_model/kaldi_specific/Aspire_Tdnn_Model.md +++ b/docs/MO_DG/prepare_model/convert_model/kaldi_specific/Aspire_Tdnn_Model.md @@ -1,7 +1,7 @@ # Convert Kaldi* ASpIRE Chain Time Delay Neural Network (TDNN) Model to the Intermediate Representation {#openvino_docs_MO_DG_prepare_model_convert_model_kaldi_specific_Aspire_Tdnn_Model} You can [download a pre-trained model](https://kaldi-asr.org/models/1/0001_aspire_chain_model.tar.gz) -for the ASpIRE Chain Time Delay Neural Network (TDNN) from the Kaldi* project official web-site. +for the ASpIRE Chain Time Delay Neural Network (TDNN) from the Kaldi* project official website. ## Convert ASpIRE Chain TDNN Model to IR diff --git a/docs/MO_DG/prepare_model/convert_model/mxnet_specific/Convert_GluonCV_Models.md b/docs/MO_DG/prepare_model/convert_model/mxnet_specific/Convert_GluonCV_Models.md index ae65c1b226155f..45d3db1842343c 100644 --- a/docs/MO_DG/prepare_model/convert_model/mxnet_specific/Convert_GluonCV_Models.md +++ b/docs/MO_DG/prepare_model/convert_model/mxnet_specific/Convert_GluonCV_Models.md @@ -2,7 +2,7 @@ This document provides the instructions and examples on how to use Model Optimizer to convert [GluonCV SSD and YOLO-v3 models](https://gluon-cv.mxnet.io/model_zoo/detection.html) to IR. -1. Choose the topology available from the [GluonCV Moodel Zoo](https://gluon-cv.mxnet.io/model_zoo/detection.html) and export to the MXNet format using the GluonCV API. For example, for the `ssd_512_mobilenet1.0` topology: +1. Choose the topology available from the [GluonCV Model Zoo](https://gluon-cv.mxnet.io/model_zoo/detection.html) and export to the MXNet format using the GluonCV API. 
For example, for the `ssd_512_mobilenet1.0` topology: ```python from gluoncv import model_zoo, data, utils from gluoncv.utils import export_block @@ -13,14 +13,14 @@ As a result, you will get an MXNet model representation in `ssd_512_mobilenet1.0 2. Run the Model Optimizer tool specifying the `--enable_ssd_gluoncv` option. Make sure the `--input_shape` parameter is set to the input shape layout of your model (NHWC or NCHW). The examples below illustrates running the Model Optimizer for the SSD and YOLO-v3 models trained with the NHWC layout and located in the ``: * **For GluonCV SSD topologies:** ```sh -python3 mo_mxnet.py --input_model /ssd_512_mobilenet1.0.params --enable_ssd_gluoncv --input_shape [1,512,512,3] --input data +python3 mo_mxnet.py --input_model /ssd_512_mobilenet1.0.params --enable_ssd_gluoncv --input_shape [1,512,512,3] --input data --output_dir ``` * **For YOLO-v3 topology:** * To convert the model: ```sh - python3 mo_mxnet.py --input_model /yolo3_mobilenet1.0_voc-0000.params --input_shape [1,255,255,3] + python3 mo_mxnet.py --input_model /yolo3_mobilenet1.0_voc-0000.params --input_shape [1,255,255,3] --output_dir ``` * To convert the model with replacing the subgraph with RegionYolo layers: ```sh - python3 mo_mxnet.py --input_model /models/yolo3_mobilenet1.0_voc-0000.params --input_shape [1,255,255,3] --transformations_config "mo/extensions/front/mxnet/yolo_v3_mobilenet1_voc.json" + python3 mo_mxnet.py --input_model /models/yolo3_mobilenet1.0_voc-0000.params --input_shape [1,255,255,3] --transformations_config "mo/extensions/front/mxnet/yolo_v3_mobilenet1_voc.json" --output_dir ``` diff --git a/docs/MO_DG/prepare_model/convert_model/mxnet_specific/Convert_Style_Transfer_From_MXNet.md b/docs/MO_DG/prepare_model/convert_model/mxnet_specific/Convert_Style_Transfer_From_MXNet.md index f6a9f189750598..f0ec23d5a9f631 100644 --- a/docs/MO_DG/prepare_model/convert_model/mxnet_specific/Convert_Style_Transfer_From_MXNet.md +++ b/docs/MO_DG/prepare_model/convert_model/mxnet_specific/Convert_Style_Transfer_From_MXNet.md @@ -11,6 +11,8 @@ To use the style transfer sample from OpenVINO™, follow the steps below as sudo apt-get install python-tk ``` +Installing python-tk step is needed only for Linux, as it is included by default in Python\* for Windows\*. + 2. Install Python\* requirements: ```sh pip3 install --user mxnet @@ -64,32 +66,31 @@ arg_dict.update(args) 6. Use `arg_dict` instead of `args` as a parameter of the `decoder.bind()` function. Replace the line:
```py -self.deco_executor = decoder.bind(ctx=mx.cpu(), args=args, aux_states=auxs) +self.deco_executor = decoder.bind(ctx=mx.gpu(), args=args, aux_states=auxs) ``` with the following:
```py self.deco_executor = decoder.bind(ctx=mx.cpu(), args=arg_dict, aux_states=auxs) ``` -7. Replace all `mx.gpu` with `mx.cpu` in the `decoder.bind()` function. -8. To save the result model as a `.json` file, add the following code to the end of the `generate()` function in the `Maker` class:
+7. To save the result model as a `.json` file, add the following code to the end of the `generate()` function in the `Maker` class:
```py self.vgg_executor._symbol.save('{}-symbol.json'.format('vgg19')) self.deco_executor._symbol.save('{}-symbol.json'.format('nst_vgg19')) ``` -9. Save and close the `make_image.py` file. +8. Save and close the `make_image.py` file. -#### 5. Run the sample with a decoder model according to the instructions from the `README.md` file in the cloned repository. +#### 5. Run the sample with a decoder model according to the instructions from the `README.md` file in the `fast_mrf_cnn` directory of the cloned repository. For example, to run the sample with the pre-trained decoder weights from the `models` folder and output shape, use the following code:
```py import make_image maker = make_image.Maker('models/13', (1024, 768)) maker.generate('output.jpg', '../images/tubingen.jpg') ``` -Where `'models/13'` string is composed of the following sub-strings: -* `'models/'` - path to the folder that contains .nd files with pre-trained styles weights and `'13'` -* Decoder prefix: the repository contains a default decoder, which is the 13_decoder. +Where the `models/13` string is composed of the following substrings: +* `models/`: path to the folder that contains .nd files with pre-trained styles weights +* `13`: prefix pointing to 13_decoder, which is the default decoder for the repository -You can choose any style from [collection of pre-trained weights](https://pan.baidu.com/s/1skMHqYp). The `generate()` function generates `nst_vgg19-symbol.json` and `vgg19-symbol.json` files for the specified shape. In the code, it is [1024 x 768] for a 4:3 ratio, and you can specify another, for example, [224,224] for a square ratio. +You can choose any style from [collection of pre-trained weights](https://pan.baidu.com/s/1skMHqYp). (On the Chinese-language page, click the down arrow next to a size in megabytes. Then wait for an overlay box to appear, and click the blue button in it to download.) The `generate()` function generates `nst_vgg19-symbol.json` and `vgg19-symbol.json` files for the specified shape. In the code, it is [1024 x 768] for a 4:3 ratio, and you can specify another, for example, [224,224] for a square ratio. #### 6. Run the Model Optimizer to generate an Intermediate Representation (IR): diff --git a/docs/MO_DG/prepare_model/convert_model/onnx_specific/Convert_DLRM.md b/docs/MO_DG/prepare_model/convert_model/onnx_specific/Convert_DLRM.md index 2d12ee0a1e7b02..341ad42c955b5b 100644 --- a/docs/MO_DG/prepare_model/convert_model/onnx_specific/Convert_DLRM.md +++ b/docs/MO_DG/prepare_model/convert_model/onnx_specific/Convert_DLRM.md @@ -4,7 +4,7 @@ These instructions are applicable only to the DLRM converted to the ONNX* file format from the [facebookresearch/dlrm model](https://github.com/facebookresearch/dlrm). -**Step 1**. Save trained Pytorch* model to ONNX* format. If you training model using [script provided in model repository](https://github.com/facebookresearch/dlrm/blob/master/dlrm_s_pytorch.py) just add `--save-onnx` flag to the command line parameters and you'll get `dlrm_s_pytorch.onnx` file containing model serialized in ONNX* format. +**Step 1**. Save trained Pytorch* model to ONNX* format. If you train the model using the [script provided in model repository](https://github.com/facebookresearch/dlrm/blob/master/dlrm_s_pytorch.py), just add the `--save-onnx` flag to the command line parameters and you'll get the `dlrm_s_pytorch.onnx` file containing the model serialized in ONNX* format. **Step 2**. 
To generate the Intermediate Representation (IR) of the model, change your current working directory to the Model Optimizer installation directory and run the Model Optimizer with the following parameters: ```sh diff --git a/docs/MO_DG/prepare_model/convert_model/onnx_specific/Convert_GPT2.md b/docs/MO_DG/prepare_model/convert_model/onnx_specific/Convert_GPT2.md index c2117ee516877b..cd9c49f46f416b 100644 --- a/docs/MO_DG/prepare_model/convert_model/onnx_specific/Convert_GPT2.md +++ b/docs/MO_DG/prepare_model/convert_model/onnx_specific/Convert_GPT2.md @@ -13,5 +13,5 @@ To download the model and sample test data, click **Download** on [https://githu To generate the Intermediate Representation (IR) of the model GPT-2, run the Model Optimizer with the following parameters: ```sh -python3 mo.py --input_model gpt2-10.onnx --input_shape [X,Y,Z] +python3 mo.py --input_model gpt2-10.onnx --input_shape [X,Y,Z] --output_dir ``` diff --git a/docs/MO_DG/prepare_model/convert_model/pytorch_specific/Convert_F3Net.md b/docs/MO_DG/prepare_model/convert_model/pytorch_specific/Convert_F3Net.md index a3f244fa55f3ad..ffb16eb5f7cc5f 100644 --- a/docs/MO_DG/prepare_model/convert_model/pytorch_specific/Convert_F3Net.md +++ b/docs/MO_DG/prepare_model/convert_model/pytorch_specific/Convert_F3Net.md @@ -4,7 +4,7 @@ ## Download and Convert the Model to ONNX* -To download the pretrained model or train the model yourself, refer to the +To download the pre-trained model or train the model yourself, refer to the [instruction](https://github.com/weijun88/F3Net/blob/master/README.md) in the F3Net model repository. Firstly, convert the model to ONNX\* format. Create and run the script with the following content in the `src` directory of the model repository: diff --git a/docs/MO_DG/prepare_model/convert_model/pytorch_specific/Convert_YOLACT.md b/docs/MO_DG/prepare_model/convert_model/pytorch_specific/Convert_YOLACT.md index ce0e582875c4df..ed072ac64f45a4 100644 --- a/docs/MO_DG/prepare_model/convert_model/pytorch_specific/Convert_YOLACT.md +++ b/docs/MO_DG/prepare_model/convert_model/pytorch_specific/Convert_YOLACT.md @@ -136,7 +136,7 @@ git clone https://github.com/dbolya/yolact git checkout 57b8f2d95e62e2e649b382f516ab41f949b57239 ``` -**Step 2**. Download a pretrained model, for example `yolact_base_54_800000.pth`. +**Step 2**. Download a pre-trained model, for example `yolact_base_54_800000.pth`. **Step 3**. Export the model to ONNX* format. diff --git a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_CRNN_From_Tensorflow.md b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_CRNN_From_Tensorflow.md index 906ca8c1e4d3cc..efb930a4e1e571 100644 --- a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_CRNN_From_Tensorflow.md +++ b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_CRNN_From_Tensorflow.md @@ -9,7 +9,7 @@ have another implementation of CRNN model, you can convert it to IR in similar w **To convert this model to the IR:** **Step 1.** Clone this GitHub repository and checkout the commit: - 1. Clone reposirory: + 1. Clone repository: ```sh git clone https://github.com/MaybeShewill-CV/CRNN_Tensorflow.git ``` @@ -18,7 +18,7 @@ have another implementation of CRNN model, you can convert it to IR in similar w git checkout 64f1f1867bffaacfeacc7a80eebf5834a5726122 ``` -**Step 2.** Train the model using framework or use the pretrained checkpoint provided in this repository. +**Step 2.** Train the model using framework or use the pre-trained checkpoint provided in this repository. 
**Step 3.** Create an inference graph: 1. Go to the `CRNN_Tensorflow` directory with the cloned repository: @@ -31,7 +31,7 @@ cd path/to/CRNN_Tensorflow export PYTHONPATH="${PYTHONPATH}:/path/to/CRNN_Tensorflow/" ``` * For Windows\* OS add `/path/to/CRNN_Tensorflow/` to the `PYTHONPATH` environment variable in settings. - 3. Open the `tools/demo_shadownet.py` script. After `saver.restore(sess=sess, save_path=weights_path)` line, add the following code: + 3. Open the `tools/test_shadownet.py` script. After `saver.restore(sess=sess, save_path=weights_path)` line, add the following code: ```python from tensorflow.python.framework import graph_io frozen = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, ['shadow/LSTMLayers/transpose_time_major']) @@ -39,7 +39,7 @@ graph_io.write_graph(frozen, '.', 'frozen_graph.pb', as_text=False) ``` 4. Run the demo with the following command: ```sh -python tools/demo_shadownet.py --image_path data/test_images/test_01.jpg --weights_path model/shadownet/shadownet_2017-10-17-11-47-46.ckpt-199999 +python tools/test_shadownet.py --image_path data/test_images/test_01.jpg --weights_path model/shadownet/shadownet_2017-10-17-11-47-46.ckpt-199999 ``` If you want to use your checkpoint, replace the path in the `--weights_path` parameter with a path to your checkpoint. 5. In the `CRNN_Tensorflow` directory, you will find the inference CRNN graph `frozen_graph.pb`. You can use this graph with the OpenVINO™ toolkit diff --git a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_DeepSpeech_From_Tensorflow.md b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_DeepSpeech_From_Tensorflow.md index 4b8bd1e40484f8..74833cf3ad3332 100644 --- a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_DeepSpeech_From_Tensorflow.md +++ b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_DeepSpeech_From_Tensorflow.md @@ -34,20 +34,15 @@ Pre-trained frozen model file is `output_graph.pb`. As you can see, the frozen model still has two variables: `previous_state_c` and `previous_state_h`. It means that the model keeps training those variables at each inference. -At the first inference of this graph, the variables are initialized by zero tensors. After executing the -`lstm_fused_cell` nodes, cell state and hidden state, which are the results of the `BlockLSTM` execution, -are assigned to these two variables. +At the first inference of this graph, the variables are initialized by zero tensors. After executing the `lstm_fused_cell` nodes, cell state and hidden state, which are the results of the `BlockLSTM` execution, are assigned to these two variables. -With each inference of the DeepSpeech graph, initial cell state and hidden state data for `BlockLSTM` is taken -from previous inference from variables. Outputs (cell state and hidden state) of `BlockLSTM` are reassigned -to the same variables. +With each inference of the DeepSpeech graph, initial cell state and hidden state data for `BlockLSTM` is taken from previous inference from variables. Outputs (cell state and hidden state) of `BlockLSTM` are reassigned to the same variables. It helps the model to remember the context of the words that it takes as input. ## Convert the TensorFlow* DeepSpeech Model to IR -The Model Optimizer assumes that the output model is for inference only. That is why you should cut those variables off and -resolve keeping cell and hidden states on the application level. +The Model Optimizer assumes that the output model is for inference only. 
That is why you should cut those variables off and resolve keeping cell and hidden states on the application level. There are certain limitations for the model conversion: - Time length (`time_len`) and sequence length (`seq_len`) are equal. @@ -55,11 +50,11 @@ There are certain limitations for the model conversion: To generate the DeepSpeech Intermediate Representation (IR), provide the TensorFlow DeepSpeech model to the Model Optimizer with the following parameters: ```sh -python3 ./mo_tf.py ---input_model path_to_model/output_graph.pb \ ---freeze_placeholder_with_value input_lengths->[16] \ ---input input_node,previous_state_h/read,previous_state_c/read \ ---input_shape [1,16,19,26],[1,2048],[1,2048] \ +python3 ./mo_tf.py \ +--input_model path_to_model/output_graph.pb \ +--freeze_placeholder_with_value input_lengths->[16] \ +--input input_node,previous_state_h/read,previous_state_c/read \ +--input_shape [1,16,19,26],[1,2048],[1,2048] \ --output raw_logits,lstm_fused_cell/GatherNd,lstm_fused_cell/GatherNd_1 \ --disable_nhwc_to_nchw ``` diff --git a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_EfficientDet_Models.md b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_EfficientDet_Models.md index 6362f018132c6c..b78ec640cba19c 100644 --- a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_EfficientDet_Models.md +++ b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_EfficientDet_Models.md @@ -1,6 +1,6 @@ # Converting EfficientDet Models from TensorFlow {#openvino_docs_MO_DG_prepare_model_convert_model_tf_specific_Convert_EfficientDet_Models} -This tutorial explains how to convert detection EfficientDet\* public models to the Intermediate Representation (IR). +This tutorial explains how to convert EfficientDet\* public object detection models to the Intermediate Representation (IR). ## Convert EfficientDet Model to IR @@ -24,10 +24,11 @@ git checkout 96e1fee 3. Install required dependencies:
```sh python3 -m pip install --upgrade pip -python3 -m pip install -r automl/efficientdet/requirements.txt +python3 -m pip install -r requirements.txt +python3 -m pip install --upgrade tensorflow-model-optimization ``` 4. Download and extract the model checkpoint [efficientdet-d4.tar.gz](https://storage.googleapis.com/cloud-tpu-checkpoints/efficientdet/coco2/efficientdet-d4.tar.gz) -referenced in the "Pretrained EfficientDet Checkpoints" section of the model repository:
+referenced in the "Pre-trained EfficientDet Checkpoints" section of the model repository:
```sh wget https://storage.googleapis.com/cloud-tpu-checkpoints/efficientdet/coco2/efficientdet-d4.tar.gz tar zxvf efficientdet-d4.tar.gz @@ -46,9 +47,9 @@ As a result the frozen model file `savedmodeldir/efficientdet-d4_frozen.pb` will To generate the IR of the EfficientDet TensorFlow model, run:
```sh -python3 $MO_ROOT/mo.py \ +python3 $INTEL_OPENVINO_DIR/deployment_tools/model_optimizer/mo.py \ --input_model savedmodeldir/efficientdet-d4_frozen.pb \ ---transformations_config $MO_ROOT/extensions/front/tf/automl_efficientdet.json \ +--transformations_config $INTEL_OPENVINO_DIR/deployment_tools/model_optimizer/extensions/front/tf/automl_efficientdet.json \ --input_shape [1,$IMAGE_SIZE,$IMAGE_SIZE,3] \ --reverse_input_channels ``` diff --git a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_GNMT_From_Tensorflow.md b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_GNMT_From_Tensorflow.md index 587e2f53db344d..72fe1db6aa8de5 100644 --- a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_GNMT_From_Tensorflow.md +++ b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_GNMT_From_Tensorflow.md @@ -137,7 +137,7 @@ index f5823d8..a733748 100644 ``` 3. Save and close the file. -## Convert GNMT Model to the IR +## Convert GNMT Model to IR > **NOTE**: Please, use TensorFlow version 1.13 or lower. @@ -155,7 +155,7 @@ git checkout b278487980832417ad8ac701c672b5c3dc7fa553 **Step 2**. Get a trained model. You have two options: * Train the model with the GNMT `wmt16_gnmt_4_layer.json` or `wmt16_gnmt_8_layer.json` configuration file using the NMT framework. -* Use the pretrained checkpoints provided in the NMT repository. Refer to the [Benchmarks](https://github.com/tensorflow/nmt#benchmarks) section for more information (*checkpoints in this section are outdated and can be incompatible with the current repository version. To avoid confusion, train a model by yourself*). +* *Do not use the pre-trained checkpoints provided in the NMT repository, as they are outdated and can be incompatible with the current repository version.* This tutorial assumes the use of the trained GNMT model from `wmt16_gnmt_4_layer.json` config, German to English translation. diff --git a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_NCF_From_Tensorflow.md b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_NCF_From_Tensorflow.md index 6e03abe921d71a..c9526ef9da0787 100644 --- a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_NCF_From_Tensorflow.md +++ b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_NCF_From_Tensorflow.md @@ -2,8 +2,7 @@ This tutorial explains how to convert Neural Collaborative Filtering (NCF) model to Intermediate Representation (IR). -[Public TensorFlow NCF model](https://github.com/tensorflow/models/tree/master/official/recommendation) does not contain - pretrained weights. To convert this model to the IR: +[Public TensorFlow NCF model](https://github.com/tensorflow/models/tree/master/official/recommendation) does not contain pre-trained weights. To convert this model to the IR: 1. Use [the instructions](https://github.com/tensorflow/models/tree/master/official/recommendation#train-and-evaluate-model) from this repository to train the model. 2. Freeze the inference graph you get on previous step in `model_dir` following the instructions from the Freezing Custom Models in Python* section of @@ -24,25 +23,27 @@ graph_io.write_graph(frozen, './', 'inference_graph.pb', as_text=False) where `rating/BiasAdd` is an output node. 3. Convert the model to the IR.If you look at your frozen model, you can see that -it has one input that is split to four `ResourceGather` layers. +it has one input that is split into four `ResourceGather` layers. (Click image to zoom in.) 
![NCF model beginning](../../../img/NCF_start.png) But as the Model Optimizer does not support such data feeding, you should skip it. Cut the edges incoming in `ResourceGather`s port 1: ```sh -python3 mo_tf.py --input_model inference_graph.pb \ ---input 1:embedding/embedding_lookup,1:embedding_1/embedding_lookup,\ -1:embedding_2/embedding_lookup,1:embedding_3/embedding_lookup \ ---input_shape [256],[256],[256],[256] +python3 mo_tf.py --input_model inference_graph.pb \ +--input 1:embedding/embedding_lookup,1:embedding_1/embedding_lookup,\ +1:embedding_2/embedding_lookup,1:embedding_3/embedding_lookup \ +--input_shape [256],[256],[256],[256] \ +--output_dir ``` -Where 256 is a `batch_size` you choose for your model. +In the `input_shape` parameter, 256 specifies the `batch_size` for your model. Alternatively, you can do steps 2 and 3 in one command line: ```sh -python3 mo_tf.py --input_meta_graph /path/to/model/model.meta \ ---input 1:embedding/embedding_lookup,1:embedding_1/embedding_lookup,\ -1:embedding_2/embedding_lookup,1:embedding_3/embedding_lookup \ ---input_shape [256],[256],[256],[256] --output rating/BiasAdd +python3 mo_tf.py --input_meta_graph /path/to/model/model.meta \ +--input 1:embedding/embedding_lookup,1:embedding_1/embedding_lookup,\ +1:embedding_2/embedding_lookup,1:embedding_3/embedding_lookup \ +--input_shape [256],[256],[256],[256] --output rating/BiasAdd \ +--output_dir ``` diff --git a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_Object_Detection_API_Models.md b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_Object_Detection_API_Models.md index 6683d6b9b8a887..6feec5f627a82e 100644 --- a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_Object_Detection_API_Models.md +++ b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_Object_Detection_API_Models.md @@ -8,8 +8,7 @@ With 2018 R3 release, the Model Optimizer introduces a new approach to convert models created using the TensorFlow\* Object Detection API. Compared with the previous approach, the new process produces inference results with higher accuracy and does not require modifying any configuration files and providing intricate command line parameters. -You can download TensorFlow\* Object Detection API models from the TensorFlow 1 Detection Model Zoo -or TensorFlow 2 Detection Model Zoo. +You can download TensorFlow\* Object Detection API models from the TensorFlow 1 Detection Model Zoo or TensorFlow 2 Detection Model Zoo. NOTE: Before converting, make sure you have configured the Model Optimizer. For configuration steps, refer to [Configuring the Model Optimizer](../../Config_Model_Optimizer.md). @@ -56,7 +55,7 @@ For example, if you downloaded the [pre-trained SSD InceptionV2 topology](http:/ ``` ## Custom Input Shape -Model Optimizer handles command line parameter `--input_shape` for TensorFlow\* Object Detection API models in a special way depending on the image resizer type defined in the `pipeline.config` file. TensorFlow\* Object Detection API generates different `Preprocessor` sub-graph based on the image resizer type. Model Optimizer supports two types of image resizer: +Model Optimizer handles the command line parameter `--input_shape` for TensorFlow\* Object Detection API models in a special way depending on the image resizer type defined in the `pipeline.config` file. TensorFlow\* Object Detection API generates different `Preprocessor` sub-graph based on the image resizer type.
Model Optimizer supports two types of image resizer: * `fixed_shape_resizer` --- *Stretches* input image to the specific height and width. The `pipeline.config` snippet below shows a `fixed_shape_resizer` sample definition: ``` image_resizer { diff --git a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_WideAndDeep_Family_Models.md b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_WideAndDeep_Family_Models.md index 84821d6b41c87c..ba781f17880602 100644 --- a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_WideAndDeep_Family_Models.md +++ b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_WideAndDeep_Family_Models.md @@ -19,6 +19,9 @@ git clone https://github.com/tensorflow/models.git --branch r2.2.0; cd official/r1/wide_deep ``` +The Wide and Deep model is no longer in the master branch of the repository but is still available in the r2.2.0 branch. + + **Step 2**. Train the model As the OpenVINO™ toolkit does not support the categorical with hash and crossed features, such feature types must be switched off in the model diff --git a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_XLNet_From_Tensorflow.md b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_XLNet_From_Tensorflow.md index 493f05ba8546ac..cc121ab19e1ad9 100644 --- a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_XLNet_From_Tensorflow.md +++ b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_XLNet_From_Tensorflow.md @@ -35,7 +35,7 @@ To get pb-file from the archive contents, you need to do the following. -2. Save and run the following script: +2. Save and run the following Python script in `~/XLNet-Base/xlnet`: ```python from collections import namedtuple @@ -92,7 +92,6 @@ with tf.Session() as sess: writer.flush() ``` -The script should save into `~/XLNet-Base/xlnet`. ## Download the Pre-Trained Large XLNet Model @@ -120,7 +119,7 @@ To get pb-file from the archive contents, you need to do the following. -2. Save and run the following script: +2. Save and run the following Python script in `~/XLNet-Large/xlnet`: ```python from collections import namedtuple @@ -185,6 +184,6 @@ The script should save into `~/XLNet-Large/xlnet`. To generate the XLNet Intermediate Representation (IR) of the model, run the Model Optimizer with the following parameters: ```sh -python3 mo.py --input_model path-to-model/model_frozen.pb --input "input_mask[50 1],input_ids[50 1],seg_ids[50 1]" --log_level DEBUG --disable_nhwc_to_nchw +python3 mo.py --input_model path-to-model/model_frozen.pb --input "input_mask[50 1],input_ids[50 1],seg_ids[50 1]" --log_level DEBUG --disable_nhwc_to_nchw --output_dir ``` diff --git a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_YOLO_From_Tensorflow.md b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_YOLO_From_Tensorflow.md index 109714dcea6b64..653165576ce125 100644 --- a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_YOLO_From_Tensorflow.md +++ b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_YOLO_From_Tensorflow.md @@ -35,7 +35,7 @@ cd tensorflow-yolo-v3 git checkout ed60b90 ``` 3. Download [coco.names](https://raw.githubusercontent.com/pjreddie/darknet/master/data/coco.names) file from the DarkNet website **OR** use labels that fit your task. -4. 
Download the [yolov3.weights](https://pjreddie.com/media/files/yolov3.weights) (for the YOLOv3 model) or [yolov3-tiny.weights](https://pjreddie.com/media/files/yolov3-tiny.weights) (for the YOLOv3-tiny model) file **OR** use your pretrained weights with the same structure +4. Download the [yolov3.weights](https://pjreddie.com/media/files/yolov3.weights) (for the YOLOv3 model) or [yolov3-tiny.weights](https://pjreddie.com/media/files/yolov3-tiny.weights) (for the YOLOv3-tiny model) file **OR** use your pre-trained weights with the same structure 5. Run a converter: - for YOLO-v3: ```sh @@ -89,18 +89,20 @@ where: To generate the IR of the YOLOv3 TensorFlow model, run:
```sh -python3 mo_tf.py ---input_model /path/to/yolo_v3.pb ---transformations_config $MO_ROOT/extensions/front/tf/yolo_v3.json ---batch 1 +python3 mo_tf.py \ +--input_model /path/to/yolo_v3.pb \ +--transformations_config $MO_ROOT/extensions/front/tf/yolo_v3.json \ +--batch 1 \ +--output_dir ``` To generate the IR of the YOLOv3-tiny TensorFlow model, run:
```sh -python3 mo_tf.py ---input_model /path/to/yolo_v3_tiny.pb ---transformations_config $MO_ROOT/extensions/front/tf/yolo_v3_tiny.json ---batch 1 +python3 mo_tf.py \ +--input_model /path/to/yolo_v3_tiny.pb \ +--transformations_config $MO_ROOT/extensions/front/tf/yolo_v3_tiny.json \ +--batch 1 \ +--output_dir ``` where: diff --git a/docs/MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md b/docs/MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md index 99b4cd703c1977..cda8458e4dd72f 100644 --- a/docs/MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md +++ b/docs/MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md @@ -32,8 +32,7 @@ - Generic Back Phase Transformations - See Also -Model Optimizer extensibility mechanism enables support of new operations and custom transformations to generate the -optimized intermediate representation (IR) as described in the +Model Optimizer extensibility mechanism enables support of new operations and custom transformations to generate the optimized intermediate representation (IR) as described in the [Deep Learning Network Intermediate Representation and Operation Sets in OpenVINO™](../../IR_and_opsets.md). This mechanism is a core part of the Model Optimizer. The Model Optimizer itself uses it under the hood, being a huge set of examples on how to add custom logic to support your model. @@ -42,9 +41,8 @@ There are several cases when the customization is needed: * A model contains operation(s) not known for the Model Optimizer, but these operation(s) could be expressed as a combination of supported operations. In this case, a custom transformation should be implemented to replace unsupported operation(s) with supported ones. -* A model contains sub-graph of operations that can be replaced with a smaller number of operations to get the better -performance. This example corresponds to so called fusing transformations. For example, replace a sub-graph performing -the following calculation \f$x / (1.0 + e^{-(beta * x)})\f$ with a single operation of type +* A model contains a sub-graph of operations that can be replaced with a smaller number of operations to get better +performance. This example corresponds to so-called *fusing transformations*, for example, replacing a sub-graph performing the calculation \f$x / (1.0 + e^{-(beta * x)})\f$ with a single operation of type [Swish](../../../ops/activation/Swish_4.md). * A model contains a custom framework operation (the operation that is not a part of an official operation set of the framework) that was developed using the framework extensibility mechanism. 
In this case, the Model Optimizer should know diff --git a/docs/MO_DG/prepare_model/customize_model_optimizer/Extending_MXNet_Model_Optimizer_with_New_Primitives.md b/docs/MO_DG/prepare_model/customize_model_optimizer/Extending_MXNet_Model_Optimizer_with_New_Primitives.md index aa3b5697242657..e20a44969cfc83 100644 --- a/docs/MO_DG/prepare_model/customize_model_optimizer/Extending_MXNet_Model_Optimizer_with_New_Primitives.md +++ b/docs/MO_DG/prepare_model/customize_model_optimizer/Extending_MXNet_Model_Optimizer_with_New_Primitives.md @@ -1,12 +1,11 @@ # Extending Model Optimizer for Custom MXNet* Operations {#openvino_docs_MO_DG_prepare_model_customize_model_optimizer_Extending_MXNet_Model_Optimizer_with_New_Primitives} -This section provides instruction on how to support a custom MXNet operation (or as it called in the MXNet documentation -"operator" or "layer") which is not a part of the MXNet operation set. For example, if the operator is implemented using -the following [guide](https://mxnet.apache.org/versions/1.7.0/api/faq/new_op.html). +This section provides instruction on how to support a custom MXNet operation (in the MXNet documentation, called an *operator* or *layer*) that is not part of the MXNet operation set. Creating custom operations is described in +[this guide](https://mxnet.apache.org/versions/1.7.0/api/faq/new_op.html). This section describes a procedure on how to extract operator attributes in the Model Optimizer. The rest of the -operation enabling pipeline and documentation on how to support MXNet operations from standard MXNet operation set is -described in the main document [Customize_Model_Optimizer](Customize_Model_Optimizer.md). +operation-enabling pipeline and documentation on how to support MXNet operations from standard MXNet operation set is +described in the main [Customize_Model_Optimizer](Customize_Model_Optimizer.md) document. ## Writing Extractor for Custom MXNet Operation Custom MXNet operations have an attribute `op` (defining the type of the operation) equal to `Custom` and attribute diff --git a/docs/MO_DG/prepare_model/customize_model_optimizer/Extending_Model_Optimizer_with_Caffe_Python_Layers.md b/docs/MO_DG/prepare_model/customize_model_optimizer/Extending_Model_Optimizer_with_Caffe_Python_Layers.md index c79da3ef0efaa0..e4a71a8fdc9298 100644 --- a/docs/MO_DG/prepare_model/customize_model_optimizer/Extending_Model_Optimizer_with_Caffe_Python_Layers.md +++ b/docs/MO_DG/prepare_model/customize_model_optimizer/Extending_Model_Optimizer_with_Caffe_Python_Layers.md @@ -1,10 +1,9 @@ # Extending Model Optimizer with Caffe* Python Layers {#openvino_docs_MO_DG_prepare_model_customize_model_optimizer_Extending_Model_Optimizer_With_Caffe_Python_Layers} This section provides instruction on how to support a custom Caffe operation written only in Python. For example, the -[Faster-R-CNN model]((http://dl.dropboxusercontent.com/s/o6ii098bu51d139/faster_rcnn_models.tgz?dl=0)) implemented in -Caffe contains a custom layer Proposal written in Python. The layer is described in the -[Faster-R-CNN protoxt](https://raw.githubusercontent.com/rbgirshick/py-faster-rcnn/master/models/pascal_voc/VGG16/faster_rcnn_end2end/test.prototxt) -the following way: +[Faster-R-CNN model](http://dl.dropboxusercontent.com/s/o6ii098bu51d139/faster_rcnn_models.tgz?dl=0) implemented in +Caffe contains a custom proposal layer written in Python. 
The layer is described in the +[Faster-R-CNN prototxt](https://raw.githubusercontent.com/rbgirshick/py-faster-rcnn/master/models/pascal_voc/VGG16/faster_rcnn_end2end/test.prototxt) in the following way: ```sh layer { name: 'proposal' diff --git a/docs/MO_DG/prepare_model/customize_model_optimizer/Extending_Model_Optimizer_with_New_Primitives.md b/docs/MO_DG/prepare_model/customize_model_optimizer/Extending_Model_Optimizer_with_New_Primitives.md index 9fb0e9b26f2db7..bb7ef070f38633 100644 --- a/docs/MO_DG/prepare_model/customize_model_optimizer/Extending_Model_Optimizer_with_New_Primitives.md +++ b/docs/MO_DG/prepare_model/customize_model_optimizer/Extending_Model_Optimizer_with_New_Primitives.md @@ -1,3 +1,3 @@ -# Extending Model Optimizer with New Primitives {#openvino_docs_MO_DG_prepare_model_customize_model_optimizer_Extending_Model_Optimizer_with_New_Primitives} +# [DEPRECATED] Extending Model Optimizer with New Primitives {#openvino_docs_MO_DG_prepare_model_customize_model_optimizer_Extending_Model_Optimizer_with_New_Primitives} -This page is deprecated. Please, refer to [Model Optimizer Extensibility](Customize_Model_Optimizer.md) page for more information. +This page is deprecated. Please refer to [Model Optimizer Extensibility](Customize_Model_Optimizer.md) page for more information. diff --git a/docs/MO_DG/prepare_model/customize_model_optimizer/Legacy_Mode_for_Caffe_Custom_Layers.md b/docs/MO_DG/prepare_model/customize_model_optimizer/Legacy_Mode_for_Caffe_Custom_Layers.md index c106d489ea8af7..6b04781946193f 100644 --- a/docs/MO_DG/prepare_model/customize_model_optimizer/Legacy_Mode_for_Caffe_Custom_Layers.md +++ b/docs/MO_DG/prepare_model/customize_model_optimizer/Legacy_Mode_for_Caffe_Custom_Layers.md @@ -1,6 +1,6 @@ -# Legacy Mode for Caffe* Custom Layers {#openvino_docs_MO_DG_prepare_model_customize_model_optimizer_Legacy_Mode_for_Caffe_Custom_Layers} +# [DEPRECATED] Legacy Mode for Caffe* Custom Layers {#openvino_docs_MO_DG_prepare_model_customize_model_optimizer_Legacy_Mode_for_Caffe_Custom_Layers} -> **NOTE**: This functionality is deprecated and will be removed in the future releases. +> **NOTE: This functionality is deprecated and will be removed in future releases.** Model Optimizer can register custom layers in a way that the output shape is calculated by the Caffe\* framework installed on your system. This approach has several limitations: diff --git a/docs/MO_DG/prepare_model/customize_model_optimizer/Subgraph_Replacement_Model_Optimizer.md b/docs/MO_DG/prepare_model/customize_model_optimizer/Subgraph_Replacement_Model_Optimizer.md index 70bec8bdb4f91c..4883d2f3e09577 100644 --- a/docs/MO_DG/prepare_model/customize_model_optimizer/Subgraph_Replacement_Model_Optimizer.md +++ b/docs/MO_DG/prepare_model/customize_model_optimizer/Subgraph_Replacement_Model_Optimizer.md @@ -1,4 +1,4 @@ -# Sub-Graph Replacement in the Model Optimizer {#openvino_docs_MO_DG_prepare_model_customize_model_optimizer_Subgraph_Replacement_Model_Optimizer} +# [DEPRECATED] Sub-Graph Replacement in the Model Optimizer {#openvino_docs_MO_DG_prepare_model_customize_model_optimizer_Subgraph_Replacement_Model_Optimizer} The document has been deprecated. Refer to the [Model Optimizer Extensibility](Customize_Model_Optimizer.md) for the up-to-date documentation. 
diff --git a/docs/get_started/get_started_dl_workbench.md b/docs/get_started/get_started_dl_workbench.md index 795767f3c73fa4..701f23f66d60e3 100644 --- a/docs/get_started/get_started_dl_workbench.md +++ b/docs/get_started/get_started_dl_workbench.md @@ -10,7 +10,7 @@ In this guide, you will: [DL Workbench](@ref workbench_docs_Workbench_DG_Introduction) is a web-based graphical environment that enables you to easily use various sophisticated OpenVINO™ toolkit components: * [Model Downloader](@ref omz_tools_downloader) to download models from the [Intel® Open Model Zoo](@ref omz_models_group_intel) -with pretrained models for a range of different tasks +with pre-trained models for a range of different tasks * [Model Optimizer](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md) to transform models into the Intermediate Representation (IR) format * [Post-Training Optimization toolkit](@ref pot_README) to calibrate a model and then execute it in the @@ -70,7 +70,7 @@ The simplified OpenVINO™ DL Workbench workflow is: ## Run Baseline Inference -This section illustrates a sample use case of how to infer a pretrained model from the [Intel® Open Model Zoo](@ref omz_models_group_intel) with an autogenerated noise dataset on a CPU device. +This section illustrates a sample use case of how to infer a pre-trained model from the [Intel® Open Model Zoo](@ref omz_models_group_intel) with an autogenerated noise dataset on a CPU device. \htmlonly \endhtmlonly @@ -82,7 +82,7 @@ Once you log in to the DL Workbench, create a project, which is a combination of On the the **Active Projects** page, click **Create** to open the **Create Project** page: ![](./dl_workbench_img/create_configuration.png) -### Step 2. Choose a Pretrained Model +### Step 2. Choose a Pre-trained Model Click **Import** next to the **Model** table on the **Create Project** page. The **Import Model** page opens. Select the squeezenet1.1 model from the Open Model Zoo and click **Import**. ![](./dl_workbench_img/import_model_02.png) diff --git a/docs/get_started/get_started_linux.md b/docs/get_started/get_started_linux.md index 3aa945a05a1d32..d64d63ed2fccf9 100644 --- a/docs/get_started/get_started_linux.md +++ b/docs/get_started/get_started_linux.md @@ -94,6 +94,13 @@ The script:
Click for an example of running the Image Classification demo script +To preview the image that the script will classify: + +```sh +cd ${INTEL_OPENVINO_DIR}/deployment_tools/demo +eog car.png +``` + To run the script to perform inference on a CPU: ```sh @@ -173,11 +180,12 @@ The script:
Click for an example of running the Benchmark demo script -To run the script that performs inference on Intel® Vision Accelerator Design with Intel® Movidius™ VPUs: +To run the script that performs inference (runs on CPU by default): ```sh -./demo_squeezenet_download_convert_run.sh -d HDDL +./demo_benchmark_app.sh ``` + When the verification script completes, you see the performance counters, resulting latency, and throughput values displayed on the screen.
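These verification scripts also accept a `-d` flag to target another supported device, as the earlier `-d HDDL` example shows. A sketch, assuming the GPU plugin is installed and configured on your system:
```sh
# Run the same benchmark demo on the integrated GPU instead of the default CPU target.
./demo_benchmark_app.sh -d GPU
```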
@@ -514,6 +522,24 @@ source /opt/intel/openvino_2021/bin/setupvars.sh ## Typical Code Sample and Demo Application Syntax Examples +This section explains how to build and use the sample and demo applications provided with the toolkit. You will need CMake 3.10 or later installed. Build details are on the [Inference Engine Samples](../IE_DG/Samples_Overview.md) and [Demo Applications](@ref omz_demos_README) pages. + +To build all the demos and samples: + +```sh +cd $INTEL_OPENVINO_DIR/inference_engine_samples/cpp +# to compile C samples, go here also: cd /inference_engine/samples/c +build_samples.sh +cd $INTEL_OPENVINO_DIR/deployment_tools/open_model_zoo/demos +build_demos.sh +``` + +Depending on what you compiled, executables are in the directories below: + +* `~/inference_engine_samples_build/intel64/Release` +* `~/inference_engine_cpp_samples_build/intel64/Release` +* `~/inference_engine_demos_build/intel64/Release` + Template to call sample code or a demo application: ```sh diff --git a/docs/get_started/get_started_macos.md b/docs/get_started/get_started_macos.md index 980b02d0be24ae..a15240a1c9b9c4 100644 --- a/docs/get_started/get_started_macos.md +++ b/docs/get_started/get_started_macos.md @@ -95,9 +95,10 @@ The script:
Click for an example of running the Image Classification demo script -To run the script to perform inference on a CPU: +To run the script to view the sample image and perform inference on the CPU: ```sh +open car.png ./demo_squeezenet_download_convert_run.sh ``` @@ -171,7 +172,7 @@ The script: To run the script that performs inference on a CPU: ```sh -./demo_squeezenet_download_convert_run.sh +./demo_benchmark_app.sh ``` When the verification script completes, you see the performance counters, resulting latency, and throughput values displayed on the screen.
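On macOS the demo scripts take the same optional `-d` flag. A sketch, assuming a default `/opt/intel/openvino_2021` install and an Intel® Neural Compute Stick 2 that is plugged in and configured (omit `-d` to stay on the CPU):

```sh
# View the test image, then run the classification demo on the MYRIAD device.
cd /opt/intel/openvino_2021/deployment_tools/demo
open car.png
./demo_squeezenet_download_convert_run.sh -d MYRIAD
```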
@@ -210,7 +211,7 @@ You must have a model that is specific for you inference task. Example model typ - Custom (Often based on SSD) Options to find a model suitable for the OpenVINO™ toolkit are: -- Download public and Intel's pre-trained models from the [Open Model Zoo](https://github.com/opencv/open_model_zoo) using [Model Downloader tool](@ref omz_tools_downloader). +- Download public and Intel's pre-trained models from the [Open Model Zoo](https://github.com/opencv/open_model_zoo) using the [Model Downloader tool](@ref omz_tools_downloader). - Download from GitHub*, Caffe* Zoo, TensorFlow* Zoo, and other resources. - Train your own model. @@ -312,6 +313,8 @@ Models in the Intermediate Representation format always include a pair of `.xml` - **REQUIRED:** `model_name.xml` - **REQUIRED:** `model_name.bin` +The conversion may also create a `model_name.mapping` file, but it is not needed for running inference. + This guide uses the public SqueezeNet 1.1 Caffe\* model to run the Image Classification Sample. See the example to download a model in the Download Models section to learn how to download this model. The `squeezenet1.1` model is downloaded in the Caffe* format. You must use the Model Optimizer to convert the model to the IR. @@ -376,7 +379,7 @@ To run the **Image Classification** code sample with an input image on the IR: ``` 3. Run the code sample executable, specifying the input media file, the IR of your model, and a target device on which you want to perform inference: ```sh - classification_sample_async -i -m -d + ./classification_sample_async -i -m -d ```
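To make the download–convert–run flow above concrete, here is a hedged end-to-end sketch for the squeezenet1.1 example. The tool locations and output directories are assumptions for a default `/opt/intel/openvino_2021` install, so adjust them to your setup:

```sh
# Download the public squeezenet1.1 model from the Open Model Zoo (paths are assumptions).
cd /opt/intel/openvino_2021/deployment_tools/open_model_zoo/tools/downloader
python3 downloader.py --name squeezenet1.1 -o ~/models

# Convert the Caffe model to IR (produces squeezenet1.1.xml and squeezenet1.1.bin).
cd /opt/intel/openvino_2021/deployment_tools/model_optimizer
python3 mo.py --input_model ~/models/public/squeezenet1.1/squeezenet1.1.caffemodel --output_dir ~/ir

# Classify the demo image on the CPU with the built sample.
cd ~/inference_engine_cpp_samples_build/intel64/Release
./classification_sample_async -i /opt/intel/openvino_2021/deployment_tools/demo/car.png -m ~/ir/squeezenet1.1.xml -d CPU
```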
Click for examples of running the Image Classification code sample on different devices @@ -473,6 +476,24 @@ source /opt/intel/openvino_2021/bin/setupvars.sh ## Typical Code Sample and Demo Application Syntax Examples +This section explains how to build and use the sample and demo applications provided with the toolkit. You will need CMake 3.13 or later installed. Build details are on the [Inference Engine Samples](../IE_DG/Samples_Overview.md) and [Demo Applications](@ref omz_demos_README) pages. + +To build all the demos and samples: + +```sh +cd $INTEL_OPENVINO_DIR/inference_engine_samples/cpp +# to compile C samples, go here also: cd /inference_engine/samples/c +build_samples.sh +cd $INTEL_OPENVINO_DIR/deployment_tools/open_model_zoo/demos +build_demos.sh +``` + +Depending on what you compiled, executables are in the directories below: + +* `~/inference_engine_samples_build/intel64/Release` +* `~/inference_engine_cpp_samples_build/intel64/Release` +* `~/inference_engine_demos_build/intel64/Release` + Template to call sample code or a demo application: ```sh @@ -482,8 +503,8 @@ Template to call sample code or a demo application: With the sample information specified, the command might look like this: ```sh -./object_detection_demo_ssd_async -i ~/Videos/catshow.mp4 \ --m ~/ir/fp32/mobilenet-ssd.xml -d CPU +cd $INTEL_OPENVINO_DIR/deployment_tools/open_model_zoo/demos/object_detection_demo +./object_detection_demo -i ~/Videos/catshow.mp4 -m ~/ir/fp32/mobilenet-ssd.xml -d CPU ``` ## Advanced Demo Use diff --git a/docs/get_started/get_started_windows.md b/docs/get_started/get_started_windows.md index c8c7ee23d1f551..253af476efb186 100644 --- a/docs/get_started/get_started_windows.md +++ b/docs/get_started/get_started_windows.md @@ -96,6 +96,8 @@ The script: To run the script to perform inference on a CPU: +1. Open the `car.png` file in any image viewer to see what the demo will be classifying. +2. Run the following script: ```bat .\demo_squeezenet_download_convert_run.bat ``` @@ -167,10 +169,10 @@ The script:
Click for an example of running the Benchmark demo script -To run the script that performs inference on Intel® Vision Accelerator Design with Intel® Movidius™ VPUs: +To run the script that performs inference (runs on CPU by default): ```bat -.\demo_squeezenet_download_convert_run.bat -d HDDL +.\demo_benchmark_app.bat ``` When the verification script completes, you see the performance counters, resulting latency, and throughput values displayed on the screen.
@@ -482,6 +484,24 @@ Below you can find basic guidelines for executing the OpenVINO™ workflow using ## Typical Code Sample and Demo Application Syntax Examples +This section explains how to build and use the sample and demo applications provided with the toolkit. You will need CMake 3.10 or later and Microsoft Visual Studio 2017 or 2019 installed. Build details are on the [Inference Engine Samples](../IE_DG/Samples_Overview.md) and [Demo Applications](@ref omz_demos_README) pages. + +To build all the demos and samples: + +```sh +cd $INTEL_OPENVINO_DIR\inference_engine_samples\cpp +# to compile C samples, go here also: cd \inference_engine\samples\c +build_samples_msvc.bat +cd $INTEL_OPENVINO_DIR\deployment_tools\open_model_zoo\demos +build_demos_msvc.bat +``` + +Depending on what you compiled, executables are in the directories below: + +* `C:\Users\\Documents\Intel\OpenVINO\inference_engine_c_samples_build\intel64\Release` +* `C:\Users\\Documents\Intel\OpenVINO\inference_engine_cpp_samples_build\intel64\Release` +* `C:\Users\\Documents\Intel\OpenVINO\omz_demos_build\intel64\Release` + Template to call sample code or a demo application: ```bat diff --git a/docs/index.md b/docs/index.md index ee0739a1e1ecd4..15d0c2a3a2b31e 100644 --- a/docs/index.md +++ b/docs/index.md @@ -18,8 +18,8 @@ The following diagram illustrates the typical OpenVINO™ workflow (click to see ### Model Preparation, Conversion and Optimization -You can use your framework of choice to prepare and train a Deep Learning model or just download a pretrained model from the Open Model Zoo. The Open Model Zoo includes Deep Learning solutions to a variety of vision problems, including object recognition, face recognition, pose estimation, text detection, and action recognition, at a range of measured complexities. -Several of these pretrained models are used also in the [code samples](IE_DG/Samples_Overview.md) and [application demos](@ref omz_demos). To download models from the Open Model Zoo, the [Model Downloader](@ref omz_tools_downloader) tool is used. +You can use your framework of choice to prepare and train a deep learning model or just download a pre-trained model from the Open Model Zoo. The Open Model Zoo includes deep learning solutions to a variety of vision problems, including object recognition, face recognition, pose estimation, text detection, and action recognition, at a range of measured complexities. +Several of these pre-trained models are used also in the [code samples](IE_DG/Samples_Overview.md) and [application demos](@ref omz_demos_README). To download models from the Open Model Zoo, the [Model Downloader](@ref omz_tools_downloader_README) tool is used. One of the core component of the OpenVINO™ toolkit is the [Model Optimizer](MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md) a cross-platform command-line tool that converts a trained neural network from its source framework to an open-source, nGraph-compatible [Intermediate Representation (IR)](MO_DG/IR_and_opsets.md) for use in inference operations. The Model Optimizer imports models trained in popular frameworks such as Caffe*, TensorFlow*, MXNet*, Kaldi*, and ONNX* and performs a few optimizations to remove excess layers and group operations when possible into simpler, faster graphs. 
@@ -49,7 +49,7 @@ For a full browser-based studio integrating these other key tuning utilities, tr OpenVINO™ toolkit includes a set of [inference code samples](IE_DG/Samples_Overview.md) and [application demos](@ref omz_demos) showing how inference is run and output processed for use in retail environments, classrooms, smart camera applications, and other solutions. -OpenVINO also makes use of open-Source and Intel™ tools for traditional graphics processing and performance management. Intel® Media SDK supports accelerated rich-media processing, including transcoding. OpenVINO™ optimizes calls to the rich OpenCV and OpenVX libraries for processing computer vision workloads. And the new DL Streamer integration further accelerates video pipelining and performance. +OpenVINO also makes use of open-source and Intel™ tools for traditional graphics processing and performance management. Intel® Media SDK supports accelerated rich-media processing, including transcoding. OpenVINO™ optimizes calls to the rich OpenCV and OpenVX libraries for processing computer vision workloads. And the new DL Streamer integration further accelerates video pipelining and performance. Useful documents for inference tuning: * [Inference Engine Developer Guide](IE_DG/Deep_Learning_Inference_Engine_DevGuide.md) @@ -82,22 +82,22 @@ The Inference Engine's plug-in architecture can be extended to meet other specia Intel® Distribution of OpenVINO™ toolkit includes the following components: -- [Deep Learning Model Optimizer](MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md) - A cross-platform command-line tool for importing models and preparing them for optimal execution with the Inference Engine. The Model Optimizer imports, converts, and optimizes models, which were trained in popular frameworks, such as Caffe*, TensorFlow*, MXNet*, Kaldi*, and ONNX*. -- [Deep Learning Inference Engine](IE_DG/Deep_Learning_Inference_Engine_DevGuide.md) - A unified API to allow high performance inference on many hardware types including Intel® CPU, Intel® Integrated Graphics, Intel® Neural Compute Stick 2, Intel® Vision Accelerator Design with Intel® Movidius™ vision processing unit (VPU). -- [Inference Engine Samples](IE_DG/Samples_Overview.md) - A set of simple console applications demonstrating how to use the Inference Engine in your applications. -- [Deep Learning Workbench](@ref workbench_docs_Workbench_DG_Introduction) - A web-based graphical environment that allows you to easily use various sophisticated OpenVINO™ toolkit components. -- [Post-Training Optimization tool](@ref pot_README) - A tool to calibrate a model and then execute it in the INT8 precision. -- Additional Tools - A set of tools to work with your models including [Benchmark App](../inference-engine/tools/benchmark_tool/README.md), [Cross Check Tool](../inference-engine/tools/cross_check_tool/README.md), [Compile tool](../inference-engine/tools/compile_tool/README.md). +- [Deep Learning Model Optimizer](MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md): A cross-platform command-line tool for importing models and preparing them for optimal execution with the Inference Engine. The Model Optimizer imports, converts, and optimizes models, which were trained in popular frameworks, such as Caffe*, TensorFlow*, MXNet*, Kaldi*, and ONNX*. 
+- [Deep Learning Inference Engine](IE_DG/Deep_Learning_Inference_Engine_DevGuide.md): A unified API to allow high performance inference on many hardware types including Intel® CPU, Intel® Integrated Graphics, Intel® Neural Compute Stick 2, Intel® Vision Accelerator Design with Intel® Movidius™ vision processing unit (VPU). +- [Inference Engine Samples](IE_DG/Samples_Overview.md): A set of simple console applications demonstrating how to use the Inference Engine in your applications. +- [Deep Learning Workbench](@ref workbench_docs_Workbench_DG_Introduction): A web-based graphical environment that allows you to easily use various sophisticated OpenVINO™ toolkit components. +- [Post-Training Optimization tool](@ref pot_README): A tool to calibrate a model and then execute it in the INT8 precision. +- Additional Tools: A set of tools to work with your models including [Benchmark App](../inference-engine/tools/benchmark_tool/README.md), [Cross Check Tool](../inference-engine/tools/cross_check_tool/README.md), [Compile tool](../inference-engine/tools/compile_tool/README.md). - [Open Model Zoo](@ref omz_models_group_intel) - - [Demos](@ref omz_demos) - Console applications that provide robust application templates to help you implement specific deep learning scenarios. - - Additional Tools - A set of tools to work with your models including [Accuracy Checker Utility](@ref omz_tools_accuracy_checker) and [Model Downloader](@ref omz_tools_downloader). - - [Documentation for Pretrained Models](@ref omz_models_group_intel) - Documentation for pretrained models that are available in the [Open Model Zoo repository](https://github.com/opencv/open_model_zoo). -- Deep Learning Streamer (DL Streamer) – Streaming analytics framework, based on GStreamer, for constructing graphs of media analytics components. DL Streamer can be installed by the Intel® Distribution of OpenVINO™ toolkit installer. Its open source version is available on [GitHub](https://github.com/opencv/gst-video-analytics). For the DL Streamer documentation, see: + - [Demos](@ref omz_demos): Console applications that provide robust application templates to help you implement specific deep learning scenarios. + - Additional Tools: A set of tools to work with your models including [Accuracy Checker Utility](@ref omz_tools_accuracy_checker) and [Model Downloader](@ref omz_tools_downloader). + - [Documentation for Pretrained Models](@ref omz_models_group_intel): Documentation for pre-trained models that are available in the [Open Model Zoo repository](https://github.com/opencv/open_model_zoo). +- Deep Learning Streamer (DL Streamer): Streaming analytics framework, based on GStreamer, for constructing graphs of media analytics components. DL Streamer can be installed by the Intel® Distribution of OpenVINO™ toolkit installer. Its open-source version is available on [GitHub](https://github.com/opencv/gst-video-analytics). 
For the DL Streamer documentation, see: - [DL Streamer Samples](@ref gst_samples_README) - [API Reference](https://openvinotoolkit.github.io/dlstreamer_gst/) - [Elements](https://github.com/opencv/gst-video-analytics/wiki/Elements) - [Tutorial](https://github.com/opencv/gst-video-analytics/wiki/DL%20Streamer%20Tutorial) -- [OpenCV](https://docs.opencv.org/master/) - OpenCV* community version compiled for Intel® hardware +- [OpenCV](https://docs.opencv.org/master/): OpenCV* community version compiled for Intel® hardware - [Intel® Media SDK](https://software.intel.com/en-us/media-sdk) (in Intel® Distribution of OpenVINO™ toolkit for Linux only) OpenVINO™ Toolkit open-source version is available on [GitHub](https://github.com/openvinotoolkit/openvino). For building the Inference Engine from the source code, see the build instructions. \ No newline at end of file diff --git a/docs/install_guides/PAC_Configure_2019RX.md b/docs/install_guides/PAC_Configure_2019RX.md index 5e43876ec20e00..150ca475d65a8e 100644 --- a/docs/install_guides/PAC_Configure_2019RX.md +++ b/docs/install_guides/PAC_Configure_2019RX.md @@ -45,7 +45,7 @@ cd a10_gx_pac_ias_1_2_pv_rte_installer 4. Select **Y** to install OPAE and accept license and when asked, specify `/home//tools/intelrtestack` as the absolute install path. During the installation there should be a message stating the directory already exists as it was created in the first command above. Select **Y** to install to this directory. If this message is not seen, it suggests that there was a typo when entering the install location. 5. Tools are installed to the following directories: - * OpenCL™ Run-time Environment: `~/tools/intelrtestack/opencl_rte/aclrte-linux64` + * OpenCL™ Runtime Environment: `~/tools/intelrtestack/opencl_rte/aclrte-linux64` * Intel® Acceleration Stack for FPGAs: `~/tools/intelrtestack/a10_gx_pac_ias_1_2_pv` 7. Check the version of the FPGA Interface Manager firmware on the PAC board. diff --git a/docs/install_guides/deployment-manager-tool.md b/docs/install_guides/deployment-manager-tool.md index 837ce3263e2e99..0989a3d5929c57 100644 --- a/docs/install_guides/deployment-manager-tool.md +++ b/docs/install_guides/deployment-manager-tool.md @@ -22,8 +22,7 @@ The Deployment Manager is a Python\* command-line tool that is delivered within ## Create Deployment Package Using Deployment Manager -There are two ways to create a deployment package that includes inference-related components of the OpenVINO™ toolkit: 
-You can run the Deployment Manager tool in either Interactive or Standard CLI mode. +There are two ways to create a deployment package that includes inference-related components of the OpenVINO™ toolkit: you can run the Deployment Manager tool in either interactive or standard CLI mode. ### Run Interactive Mode
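As a sketch of both modes — the script name and location (`deployment_tools/tools/deployment_manager/deployment_manager.py`) and the `--targets` option are assumptions about the toolkit layout, while `--output_dir` and `--archive_name` are the options listed in the hunk that follows:

```sh
# Interactive mode: run with no target arguments and answer the prompts.
cd /opt/intel/openvino_2021/deployment_tools/tools/deployment_manager
python3 deployment_manager.py

# Standard CLI mode: name the targets and the archive explicitly.
python3 deployment_manager.py --targets cpu --output_dir ~/deploy --archive_name openvino_deployment_package
```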
@@ -71,7 +70,7 @@ The following options are available: ``` * `[--output_dir]` — (Optional) Path to the output directory. By default, it set to your home directory. -* `[--archive_name]` — (Optional) Deployment archive name without extension. By default, it set to `openvino_deployment_package`. +* `[--archive_name]` — (Optional) Deployment archive name without extension. By default, it is set to `openvino_deployment_package`. * `[--user_data]` — (Optional) Path to a directory with user data (IRs, models, datasets, etc.) required for inference. By default, it's set to `None`, which means that the user data are already present on the target host machine. diff --git a/docs/install_guides/installing-openvino-linux.md b/docs/install_guides/installing-openvino-linux.md index 955a50a0bae8fb..a78fa8fc43d7a1 100644 --- a/docs/install_guides/installing-openvino-linux.md +++ b/docs/install_guides/installing-openvino-linux.md @@ -284,13 +284,10 @@ The steps in this section are required only if you want to enable the toolkit co ```sh cd /opt/intel/openvino_2021/install_dependencies/ ``` -2. Enter the super user mode: -```sh -sudo -E su -``` -3. Install the **Intel® Graphics Compute Runtime for OpenCL™** driver components required to use the GPU plugin and write custom layers for Intel® Integrated Graphics. The drivers are not included in the package, to install it, make sure you have the internet connection and run the installation script: + +2. Install the **Intel® Graphics Compute Runtime for OpenCL™** driver components required to use the GPU plugin and write custom layers for Intel® Integrated Graphics. The drivers are not included in the package, to install it, make sure you have the internet connection and run the installation script: ```sh -./install_NEO_OCL_driver.sh +sudo -E ./install_NEO_OCL_driver.sh ``` The script compares the driver version on the system to the current version. If the driver version on the system is higher or equal to the current version, the script does not install a new driver. If the version of the driver is lower than the current version, the script uninstalls the lower and installs the current version with your permission: diff --git a/docs/install_guides/installing-openvino-macos.md b/docs/install_guides/installing-openvino-macos.md index 0797d625ca8a16..d878eac5c3a84a 100644 --- a/docs/install_guides/installing-openvino-macos.md +++ b/docs/install_guides/installing-openvino-macos.md @@ -24,7 +24,7 @@ The following components are installed by default: | Component | Description | | :-------------------------------------------------------------------------------------------------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | [Model Optimizer](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md) | This tool imports, converts, and optimizes models, which were trained in popular frameworks, to a format usable by Intel tools, especially the Inference Engine.
Popular frameworks include Caffe*, TensorFlow*, MXNet\*, and ONNX\*. | -| [Inference Engine](../IE_DG/Deep_Learning_Inference_Engine_DevGuide.md) | This is the engine that runs a deep learning model. It includes a set of libraries for an easy inference integration into your applications. | +| [Inference Engine](../IE_DG/Deep_Learning_Inference_Engine_DevGuide.md) | This is the engine that runs a deep learning model. It includes a set of libraries for an easy inference integration into your applications | | [OpenCV\*](https://docs.opencv.org/master/) | OpenCV\* community version compiled for Intel® hardware | | [Sample Applications](../IE_DG/Samples_Overview.md) | A set of simple console applications demonstrating how to use the Inference Engine in your applications. | | [Demos](@ref omz_demos) | A set of console applications that demonstrate how you can use the Inference Engine in your applications to solve specific use-cases | @@ -59,10 +59,15 @@ The development and target platforms have the same requirements, but you can sel **Software Requirements** -- CMake 3.10 or higher -- Python 3.6 - 3.7 -- Apple Xcode\* Command Line Tools -- (Optional) Apple Xcode\* IDE (not required for OpenVINO, but useful for development) +* CMake 3.10 or higher + + [Install](https://cmake.org/download/) (choose "macOS 10.13 or later") + + Add `/Applications/CMake.app/Contents/bin` to path (for default install) +* Python 3.6 - 3.7 + + [Install](https://www.python.org/downloads/mac-osx/) (choose 3.6.x or 3.7.x, not latest) + + Add to path +* Apple Xcode\* Command Line Tools + + In the terminal, run `xcode-select --install` from any directory +* (Optional) Apple Xcode\* IDE (not required for OpenVINO, but useful for development) **Operating Systems** @@ -74,13 +79,13 @@ This guide provides step-by-step instructions on how to install the Intel® Dist The following steps will be covered: -1. Install the Intel® Distribution of OpenVINO™ Toolkit . +1. Install the Intel® Distribution of OpenVINO™ Toolkit. 2. Set the OpenVINO environment variables and (optional) Update to .bash_profile. 3. Configure the Model Optimizer. 4. Get Started with Code Samples and Demo Applications. 5. Uninstall the Intel® Distribution of OpenVINO™ Toolkit. -## Install the Intel® Distribution of OpenVINO™ toolkit Core Components +## Install the Intel® Distribution of OpenVINO™ Toolkit Core Components If you have a previous version of the Intel® Distribution of OpenVINO™ toolkit installed, rename or delete these two directories: @@ -125,15 +130,15 @@ The disk image is mounted to `/Volumes/m_openvino_toolkit_p_` and autom 9. If needed, click **Customize** to change the installation directory or the components you want to install: ![](../img/openvino-install-macos-04.png) > **NOTE**: If there is an OpenVINO™ toolkit version previously installed on your system, the installer will use the same destination directory for next installations. If you want to install a newer version to a different directory, you need to uninstall the previously installed versions. - Click **Next** to save the installation options and show the Installation summary screen. +10. Click **Next** to save the installation options and show the Installation summary screen. -10. On the **Installation summary** screen, press **Install** to begin the installation. +11. On the **Installation summary** screen, click **Install** to begin the installation. -11. When the first part of installation is complete, the final screen informs you that the core components have been installed +12. 
When the first part of installation is complete, the final screen informs you that the core components have been installed and additional steps still required: ![](../img/openvino-install-macos-05.png) -12. Click **Finish** to close the installation wizard. A new browser window opens to the next section of the Installation Guide to set the environment variables. If the installation did not indicate you must install dependencies, you can move ahead to [Set the Environment Variables](#set-the-environment-variables). If you received a message that you were missing external software dependencies, listed under **Software Requirements** at the top of this guide, you need to install them now before continuing on to the next section. +13. Click **Finish** to close the installation wizard. A new browser window opens to the next section of the Installation Guide to set the environment variables. If the installation did not indicate you must install dependencies, you can move ahead to [Set the Environment Variables](#set-the-environment-variables). If you received a message that you were missing external software dependencies, listed under **Software Requirements** at the top of this guide, you need to install them now before continuing on to the next section. ## Set the Environment Variables @@ -143,22 +148,26 @@ You need to update several environment variables before you can compile and run source /opt/intel/openvino_2021/bin/setupvars.sh ``` +If you didn't choose the default installation option, replace `/opt/intel/openvino_2021` with your directory. + Optional: The OpenVINO environment variables are removed when you close the shell. You can permanently set the environment variables as follows: 1. Open the `.bash_profile` file in the current user home directory: ```sh vi ~/.bash_profile ``` -2. Press the **i** key to switch to the insert mode. +2. Press the **i** key to switch to insert mode. 3. Add this line to the end of the file: ```sh source /opt/intel/openvino_2021/bin/setupvars.sh ``` -3. Save and close the file: press the **Esc** key, type `:wq` and press the **Enter** key. +If you didn't choose the default installation option, replace `/opt/intel/openvino_2021` with your directory. + +4. Save and close the file: press the **Esc** key, type `:wq` and press the **Enter** key. -4. To verify your change, open a new terminal. You will see `[setupvars.sh] OpenVINO environment initialized`. +5. To verify your change, open a new terminal. You will see `[setupvars.sh] OpenVINO environment initialized`. The environment variables are set. Continue to the next section to configure the Model Optimizer. @@ -264,13 +273,13 @@ Proceed to the Get Started to get started with runnin Now you are ready to get started. To continue, see the following pages: * [OpenVINO™ Toolkit Overview](../index.md) -* [Get Started Guide for Windows](../get_started/get_started_macos.md) to learn the basic OpenVINO™ toolkit workflow and run code samples and demo applications with pre-trained models on different inference devices. +* [Get Started Guide for macOS](../get_started/get_started_macos.md) to learn the basic OpenVINO™ toolkit workflow and run code samples and demo applications with pre-trained models on different inference devices. ## Uninstall the Intel® Distribution of OpenVINO™ Toolkit Follow the steps below to uninstall the Intel® Distribution of OpenVINO™ Toolkit from your system: -1. From the ``, locate and open `openvino_toolkit_uninstaller.app`. +1. 
From the installation directory (by default, `/opt/intel/openvino_2021`), locate and open `openvino_toolkit_uninstaller.app`. 2. Follow the uninstallation wizard instructions. 3. When uninstallation is complete, click **Finish**. diff --git a/docs/install_guides/installing-openvino-raspbian.md b/docs/install_guides/installing-openvino-raspbian.md index 0695ef9e772ca9..14b354532e1ca7 100644 --- a/docs/install_guides/installing-openvino-raspbian.md +++ b/docs/install_guides/installing-openvino-raspbian.md @@ -10,7 +10,11 @@ The OpenVINO™ toolkit quickly deploys applications and solutions that emulate human vision. Based on Convolutional Neural Networks (CNN), the toolkit extends computer vision (CV) workloads across Intel® hardware, maximizing performance. The OpenVINO toolkit includes the Intel® Deep Learning Deployment Toolkit (Intel® DLDT). -The OpenVINO™ toolkit for Raspbian* OS includes the Inference Engine and the MYRIAD plugins. You can use it with the Intel® Neural Compute Stick 2 plugged in one of USB ports. +The OpenVINO™ toolkit for Raspbian* OS includes the Inference Engine and the MYRIAD plugins. You can use it with the Intel® Neural Compute Stick 2 plugged into one of the USB ports. This device is required for using the Intel® Distribution of OpenVINO™ toolkit. + +> **NOTE**: There is also an open-source version of OpenVINO™ that can be compiled for AArch64 (see [build instructions](https://github.com/openvinotoolkit/openvino/wiki/BuildingForRaspbianStretchOS)). + +Because OpenVINO for Raspbian* OS doesn't include Model Optimizer, the ideal scenario is to use another machine to convert your model with Model Optimizer, then do your application development on the Raspberry Pi* for a convenient build/test cycle on the target platform. ### Included in the Installation Package @@ -31,10 +35,9 @@ The OpenVINO toolkit for Raspbian OS is an archive with pre-installed header fil **Hardware** - Raspberry Pi\* board with ARM* ARMv7-A CPU architecture. Check that `uname -m` returns `armv7l`. -- One of Intel® Movidius™ Visual Processing Units (VPU): -- Intel® Neural Compute Stick 2 +- Intel® Neural Compute Stick 2, which is one of the Intel® Movidius™ Visual Processing Units (VPUs) -> **NOTE**: With OpenVINO™ 2020.4 release, Intel® Movidius™ Neural Compute Stick is no longer supported. +> **NOTE**: With OpenVINO™ 2020.4 release, Intel® Movidius™ Neural Compute Stick (1) is no longer supported. **Operating Systems** @@ -62,7 +65,7 @@ This guide provides step-by-step instructions on how to install the OpenVINO™ The guide assumes you downloaded the OpenVINO toolkit for Raspbian* OS. If you do not have a copy of the toolkit package file `l_openvino_toolkit_runtime_raspbian_p_.tgz`, download the latest version from the [OpenVINO™ Toolkit packages storage](https://storage.openvinotoolkit.org/repositories/openvino/packages/) and then return to this guide to proceed with the installation. -> **NOTE**: The OpenVINO toolkit for Raspbian OS is distributed without installer, so you need to perform extra steps comparing to the [Intel® Distribution of OpenVINO™ toolkit for Linux* OS](installing-openvino-linux.md). +> **NOTE**: The OpenVINO toolkit for Raspbian OS is distributed without an installer, so you need to perform some extra steps compared to the [Intel® Distribution of OpenVINO™ toolkit for Linux* OS](installing-openvino-linux.md). 1. Open the Terminal\* or your preferred console application. 2. Go to the directory in which you downloaded the OpenVINO toolkit. 
This document assumes this is your `~/Downloads` directory. If not, replace `~/Downloads` with the directory where the file is located. @@ -107,9 +110,8 @@ To test your change, open a new terminal. You will see the following: [setupvars.sh] OpenVINO environment initialized ``` -Continue to the next section to add USB rules for Intel® Neural Compute Stick 2 devices. - -## Add USB Rules +## Add USB Rules for an Intel® Neural Compute Stick 2 device +This task applies only if you have an Intel® Neural Compute Stick 2 device. 1. Add the current Linux user to the `users` group: ```sh @@ -126,11 +128,11 @@ Continue to the next section to add USB rules for Intel® Neural Compute Stick 2 ``` 4. Plug in your Intel® Neural Compute Stick 2. -You are ready to compile and run the Object Detection sample to verify the Inference Engine installation. +You are now ready to compile and run the Object Detection sample to verify the Inference Engine installation. ## Build and Run Object Detection Sample -Follow the next steps to run pre-trained Face Detection network using Inference Engine samples from the OpenVINO toolkit. +Follow the next steps to use the pre-trained face detection model using Inference Engine samples from the OpenVINO toolkit. 1. Navigate to a directory that you have write access to and create a samples build directory. This example uses a directory named `build`: ```sh @@ -150,7 +152,7 @@ Follow the next steps to run pre-trained Face Detection network using Inference python3 -m pip install -r requirements.in python3 downloader.py --name face-detection-adas-0001 ``` -4. Run the sample with specifying the model and a path to the input image: +4. Run the sample specifying the model, a path to the input image, and the VPU required to run with the Raspbian* OS: ```sh ./armv7l/Release/object_detection_sample_ssd -m face-detection-adas-0001.xml -d MYRIAD -i ``` diff --git a/docs/install_guides/installing-openvino-windows.md b/docs/install_guides/installing-openvino-windows.md index 56e963d1ea40e8..1a1a31a07c61fe 100644 --- a/docs/install_guides/installing-openvino-windows.md +++ b/docs/install_guides/installing-openvino-windows.md @@ -31,7 +31,7 @@ Your installation is complete when these are all completed: - Install the drivers and software for the Intel® Vision Accelerator Design with Intel® Movidius™ VPUs - - Update Windows* environment variables + - Update Windows* environment variables (necessary if you didn't choose the option to add Python to the path when you installed Python) Also, the following steps will be covered in the guide: - Get Started with Code Samples and Demo Applications @@ -246,7 +246,7 @@ Or proceed to the Get Started to get started with run ### Optional: Additional Installation Steps for Intel® Processor Graphics (GPU) -> **NOTE**: These steps are required only if you want to use a GPU. +> **NOTE**: These steps are required only if you want to use an Intel® integrated GPU. If your applications offload computation to **Intel® Integrated Graphics**, you must have the latest version of Intel Graphics Driver for Windows installed for your hardware. [Download and install a higher version](http://downloadcenter.intel.com/product/80939/Graphics-Drivers). @@ -277,7 +277,7 @@ To perform inference on Intel® Vision Accelerator Design with Intel® Movidius 1. Download and install Visual C++ Redistributable for Visual Studio 2017 2. 
Check with a support engineer if your Intel® Vision Accelerator Design with Intel® Movidius™ VPUs card requires SMBUS connection to PCIe slot (most unlikely). Install the SMBUS driver only if confirmed (by default, it's not required): - 1. Go to the `\deployment_tools\inference-engine\external\hddl\SMBusDriver` directory, where `` is the directory in which the Intel Distribution of OpenVINO toolkit is installed. + 1. Go to the `\deployment_tools\inference-engine\external\hddl\drivers\SMBusDriver` directory, where `` is the directory in which the Intel Distribution of OpenVINO toolkit is installed. 2. Right click on the `hddlsmbus.inf` file and choose **Install** from the pop up menu. You are done installing your device driver and are ready to use your Intel® Vision Accelerator Design with Intel® Movidius™ VPUs. @@ -313,7 +313,7 @@ Use these steps to update your Windows `PATH` if a command you execute returns a 7. Click **OK** repeatedly to close each screen. -Your `PATH` environment variable is updated. +Your `PATH` environment variable is updated. If the changes don't take effect immediately, you may need to reboot. ## Get Started diff --git a/docs/nGraph_DG/intro.md b/docs/nGraph_DG/intro.md index 032ecc8f610ef9..f096321ef5db53 100644 --- a/docs/nGraph_DG/intro.md +++ b/docs/nGraph_DG/intro.md @@ -11,17 +11,17 @@ Operations from these operation sets are generated by the Model Optimizer and ar 2. Operation version is attached to each operation rather than to the entire IR file format. IR is still versioned but has a different meaning. For details, see [Deep Learning Network Intermediate Representation and Operation Sets in OpenVINO™](../MO_DG/IR_and_opsets.md). -3. Creating models in run-time without loading IR from an xml/binary file. You can enable it by creating +3. Creating models at runtime without loading IR from an xml/binary file. You can enable it by creating `ngraph::Function` passing it to `CNNNetwork`. -4. Run-time reshape capability and constant folding are implemented through the nGraph code for more operations compared to previous releases. +4. Runtime reshape capability and constant folding are implemented through the nGraph code for more operations compared to previous releases. As a result, more models can be reshaped. For details, see the [dedicated guide about the reshape capability](../IE_DG/ShapeInference.md). 5. Loading [model from ONNX format](../IE_DG/ONNX_Support.md) without converting it to the Inference Engine IR. 6. nGraph representation supports dynamic shapes. You can use `CNNNetwork::reshape()` method in order to specialize input shapes. -The complete picture of existed flow is presented below. +A complete picture of the existing flow is shown below. ![](img/TopLevelNGraphFlow.png) diff --git a/docs/nGraph_DG/nGraphTransformation.md b/docs/nGraph_DG/nGraphTransformation.md index 5e88ccdf12cd51..e46f0dd8a02c6a 100644 --- a/docs/nGraph_DG/nGraphTransformation.md +++ b/docs/nGraph_DG/nGraphTransformation.md @@ -27,7 +27,7 @@ Transformation flow in the transformation library has several layers: 2. Transformations - Perform a particular transformation algorithm on `ngraph::Function`. 3. Low-level functions - Take a set of nodes and perform some transformation action. They are not mandatory and all transformation code can be located inside the transformation. -But if some transformation parts can potentially be reused in other transformations, we suggest keeping them as a separate functions. 
+But if some transformation parts can potentially be reused in other transformations, we suggest keeping them as separate functions. ### Location for Your Transformation Code To decide where to store your transformation code, please follow these rules: diff --git a/docs/nGraph_DG/nGraph_basic_concepts.md b/docs/nGraph_DG/nGraph_basic_concepts.md index 2d6bed7027258f..4648c2613ebc2f 100644 --- a/docs/nGraph_DG/nGraph_basic_concepts.md +++ b/docs/nGraph_DG/nGraph_basic_concepts.md @@ -4,8 +4,8 @@ The nGraph represents neural networks in uniform format. User can create differe ## nGraph Function and Graph Representation -nGraph function is a very simple thing: it stores shared pointers to `ngraph::op::Parameter`, `ngraph::op::Result` and `ngraph::op::Sink` operations that are inputs, outputs and sinks of the graph. -Sinks of the graph have no consumers and not included into results vector. All other operations hold each other via shared pointers: child operation holds its parent (hard link). If operation has no consumers and it's not Result or Sink operation +nGraph function is a very simple thing: it stores shared pointers to `ngraph::op::Parameter`, `ngraph::op::Result` and `ngraph::op::Sink` operations that are inputs, outputs and sinks of the graph. +Sinks of the graph have no consumers and are not included in the results vector. All other operations hold each other via shared pointers: child operation holds its parent (hard link). If operation has no consumers and it's not Result or Sink operation (shared pointer counter is zero) then it will be destructed and won't be accessible anymore. Each operation in `ngraph::Function` has a `std::shared_ptr` type. For details on how to build an nGraph Function, see the [Build nGraph Function](./build_function.md) page. diff --git a/docs/optimization_guide/dldt_optimization_guide.md b/docs/optimization_guide/dldt_optimization_guide.md index 87fb3d26b4d437..58c4ba57064c19 100644 --- a/docs/optimization_guide/dldt_optimization_guide.md +++ b/docs/optimization_guide/dldt_optimization_guide.md @@ -275,7 +275,7 @@ The following tips are provided to give general guidance on optimizing execution - The general affinity “rule of thumb” is to keep computationally-intensive kernels on the accelerator, and "glue" (or helper) kernels on the CPU. Notice that this includes the granularity considerations. For example, running some (custom) activation on the CPU would result in too many conversions. -- It is advised to do performance analysis to determine “hotspot” kernels, which should be the first candidates for offloading. At the same time, it is often more efficient to offload some reasonably sized sequence of kernels, rather than individual kernels, to minimize scheduling and other run-time overheads. +- It is advised to do performance analysis to determine “hotspot” kernels, which should be the first candidates for offloading. At the same time, it is often more efficient to offload some reasonably sized sequence of kernels, rather than individual kernels, to minimize scheduling and other runtime overhead. - Notice that GPU can be busy with other tasks (like rendering). Similarly, the CPU can be in charge for the general OS routines and other application threads (see Note on the App-Level Threading). Also, a high interrupt rate due to many subgraphs can raise the frequency of the one device and drag the frequency of another down. 
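One practical way to find those "hotspot" kernels is the Benchmark App listed among the additional tools. The sketch below is an assumption-level example — the sample build path, the model path, and the availability of a GPU plugin all depend on your system:

```sh
# Print per-layer performance counters (-pc) while running on a heterogeneous target;
# layers unsupported by the GPU plugin fall back to the CPU.
cd ~/inference_engine_cpp_samples_build/intel64/Release
./benchmark_app -m ~/ir/fp32/mobilenet-ssd.xml -d HETERO:GPU,CPU -pc -niter 100
```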
diff --git a/docs/ovsa/ovsa_get_started.md b/docs/ovsa/ovsa_get_started.md index 19678297eb74d6..9d19ee63eb1253 100644 --- a/docs/ovsa/ovsa_get_started.md +++ b/docs/ovsa/ovsa_get_started.md @@ -152,7 +152,7 @@ You're ready to configure the Host Machine for networking. This step is for the combined Model Developer and Independent Software Vendor roles. If Model User VM is running on different physical host, repeat the following steps for that host also. In this step you prepare two network bridges: -* A global IP address that a KVM can access across the Internet. This is the address that the OpenVINO™ Security Add-on Run-time software on a user's machine uses to verify they have a valid license. +* A global IP address that a KVM can access across the Internet. This is the address that the OpenVINO™ Security Add-on runtime software on a user's machine uses to verify they have a valid license. * A host-only local address to provide communication between the Guest VM and the QEMU host operating system. This example in this step uses the following names. Your configuration might use different names: diff --git a/inference-engine/ie_bridges/c/samples/object_detection_sample_ssd/README.md b/inference-engine/ie_bridges/c/samples/object_detection_sample_ssd/README.md index 7370b6ab61f006..727d39ab2702f6 100644 --- a/inference-engine/ie_bridges/c/samples/object_detection_sample_ssd/README.md +++ b/inference-engine/ie_bridges/c/samples/object_detection_sample_ssd/README.md @@ -76,7 +76,7 @@ Options: > > - The sample accepts models in ONNX format (.onnx) that do not require preprocessing. -For example, to do inference on a CPU with the OpenVINO™ toolkit person detection SSD models, run one of the following commands: +For example, to perform inference on a CPU with the OpenVINO™ toolkit person detection SSD models, run one of the following commands: - with one image and [person-detection-retail-0013](https://docs.openvinotoolkit.org/latest/omz_models_intel_person_detection_retail_0013_description_person_detection_retail_0013.html) model diff --git a/inference-engine/ie_bridges/python/docs/api_overview.md b/inference-engine/ie_bridges/python/docs/api_overview.md index a2fbea2ea58ad0..577edcc080c181 100644 --- a/inference-engine/ie_bridges/python/docs/api_overview.md +++ b/inference-engine/ie_bridges/python/docs/api_overview.md @@ -8,9 +8,10 @@ This API provides a simplified interface for Inference Engine functionality that ## Supported OSes -Inference Engine Python\* API is supported on Ubuntu\* 18.04 and 20.04, CentOS\* 7.3 OSes, Raspbian\* 9, Windows\* 10 -and macOS\* 10.x. -Supported Python* versions: +Inference Engine Python\* API is supported on Ubuntu\* 18.04 and 20.04, CentOS\* 7.3 OSes, Raspbian\* 9, Windows\* 10 +and macOS\* 10.x. + +Supported Python* versions: | Operating System | Supported Python\* versions: | |:----- | :----- | @@ -18,8 +19,8 @@ Supported Python* versions: | Ubuntu\* 20.04 | 3.6, 3.7, 3.8 | | Windows\* 10 | 3.6, 3.7, 3.8 | | CentOS\* 7.3 | 3.6, 3.7 | -| macOS\* 10.x | 3.6, 3.7 | -| Raspbian\* 9 | 3.6, 3.7 | +| macOS\* 10.x | 3.6, 3.7 | +| Raspbian\* 9 | 3.6, 3.7 | ## Set Up the Environment @@ -31,7 +32,7 @@ To configure the environment for the Inference Engine Python\* API, run: * On Raspbian\* 9,: `source /bin/setupvars.sh .` * On Windows\* 10: `call \bin\setupvars.bat` -The script automatically detects latest installed Python\* version and configures required environment if the version is supported. 
The script automatically detects the latest installed Python\* version and configures the required environment if the version is supported. If you want to use a certain version of Python\*, set the environment variable `PYTHONPATH=/python/` after running the environment configuration script. diff --git a/inference-engine/ie_bridges/python/sample/hello_classification/README.md b/inference-engine/ie_bridges/python/sample/hello_classification/README.md index 4003a81ee16200..d662a94a2635ca 100644 --- a/inference-engine/ie_bridges/python/sample/hello_classification/README.md +++ b/inference-engine/ie_bridges/python/sample/hello_classification/README.md @@ -27,7 +27,7 @@ each sample step at [Integration Steps](../../../../../docs/IE_DG/Integrate_with ## Running -Run the application with the -h option to see the usage message: +Run the application with the `-h` option to see the usage message: ```sh python hello_classification.py -h ``` @@ -68,7 +68,7 @@ To run the sample, you need specify a model and image: > > - The sample accepts models in ONNX format (.onnx) that do not require preprocessing. -You can do inference of an image using a pre-trained model on a GPU using the following command: +For example, to perform inference of an image using a pre-trained model on a GPU, run the following command: ```sh python hello_classification.py -m /alexnet.xml -i /cat.bmp -d GPU diff --git a/inference-engine/ie_bridges/python/sample/hello_query_device/README.md b/inference-engine/ie_bridges/python/sample/hello_query_device/README.md index 35e84bc23ed986..af4784ebc384f0 100644 --- a/inference-engine/ie_bridges/python/sample/hello_query_device/README.md +++ b/inference-engine/ie_bridges/python/sample/hello_query_device/README.md @@ -1,6 +1,6 @@ # Hello Query Device Python* Sample {#openvino_inference_engine_ie_bridges_python_sample_hello_query_device_README} -This sample demonstrates how to show Inference Engine devices and prints their metrics and default configuration values, using [Query Device API feature](../../../../../docs/IE_DG/InferenceEngine_QueryAPI.md). +This sample demonstrates how to show Inference Engine devices and print their metrics and default configuration values using the [Query Device API feature](../../../../../docs/IE_DG/InferenceEngine_QueryAPI.md). The following Inference Engine Python API is used in the application: @@ -28,7 +28,7 @@ python hello_query_device.py ## Sample Output -For example: +The application prints all available devices with their supported metrics and default values for configuration parameters. (Some lines are not shown due to length.) For example: ```sh [ INFO ] Creating Inference Engine @@ -101,7 +101,6 @@ For example: [ INFO ] TUNING_MODE: TUNING_DISABLED [ INFO ] ``` - ## See Also - [Using Inference Engine Samples](../../../../../docs/IE_DG/Samples_Overview.md) diff --git a/inference-engine/ie_bridges/python/sample/object_detection_sample_ssd/README.md b/inference-engine/ie_bridges/python/sample/object_detection_sample_ssd/README.md index 17a5640ccf3505..b2638d78dac571 100644 --- a/inference-engine/ie_bridges/python/sample/object_detection_sample_ssd/README.md +++ b/inference-engine/ie_bridges/python/sample/object_detection_sample_ssd/README.md @@ -21,7 +21,7 @@ Basic Inference Engine API is covered by [Hello Classification Python* Sample](. ## How It Works -At startup, the sample application reads command-line parameters, prepares input data, loads a specified model and image to the Inference Engine plugin, performs synchronous inference, and processes output data. 
+On startup, the sample application reads command-line parameters, prepares input data, loads a specified model and image to the Inference Engine plugin, performs synchronous inference, and processes output data. As a result, the program creates an output image, logging each step in a standard output stream. You can see the explicit description of diff --git a/inference-engine/samples/speech_libs_and_demos/Offline_speech_recognition_demo.md b/inference-engine/samples/speech_libs_and_demos/Offline_speech_recognition_demo.md index 71e1d693e1fa6c..10594e2c321e1c 100644 --- a/inference-engine/samples/speech_libs_and_demos/Offline_speech_recognition_demo.md +++ b/inference-engine/samples/speech_libs_and_demos/Offline_speech_recognition_demo.md @@ -1,9 +1,9 @@ # Offline Speech Recognition Demo {#openvino_inference_engine_samples_speech_libs_and_demos_Offline_speech_recognition_demo} -This demo provides a command-line interface for automatic speech recognition using OpenVINO™. +This demo provides a command-line interface for automatic speech recognition using OpenVINO™. Components used by this executable: -* `lspeech_s5_ext` model - Example pretrained LibriSpeech DNN +* `lspeech_s5_ext` model - Example pre-trained LibriSpeech DNN * `speech_library.dll` (`.so`) - Open source speech recognition library that uses OpenVINO™ Inference Engine, Intel® Speech Feature Extraction and Intel® Speech Decoder libraries ## How It Works @@ -87,4 +87,4 @@ The resulting transcription for the sample audio file: [ INFO ] Model loading time: 61.01 ms Recognition result: HOW ARE YOU DOING -``` \ No newline at end of file +``` diff --git a/inference-engine/samples/speech_libs_and_demos/Speech_libs_and_demos.md b/inference-engine/samples/speech_libs_and_demos/Speech_libs_and_demos.md index 212ffb26f19fba..5bd8b99d82a6e6 100644 --- a/inference-engine/samples/speech_libs_and_demos/Speech_libs_and_demos.md +++ b/inference-engine/samples/speech_libs_and_demos/Speech_libs_and_demos.md @@ -34,9 +34,9 @@ The package contains the following components: Additionally, new acoustic and language models are available in the OpenVINO™ [storage](https://storage.openvinotoolkit.org/models_contrib/speech/2021.2/librispeech_s5/). -## Run Speech Recognition Demos with Pretrained Models +## Run Speech Recognition Demos with Pre-trained Models -To download pretrained models and build all dependencies: +To download pre-trained models and build all dependencies: * On Linux* OS, use the shell script `/deployment_tools/demo/demo_speech_recognition.sh` @@ -67,9 +67,9 @@ set https_proxy=https://{proxyHost}:{proxyPort} ## Hardware Support -The provided acoustic models have been tested on a CPU, graphics processing unit (GPU), and Intel® Gaussian & Neural Accelerator (Intel® GNA), and you can switch between these targets in offline and live speech recognition demos. +The provided acoustic models have been tested on a CPU, graphics processing unit (GPU), and Intel® Gaussian & Neural Accelerator (Intel® GNA), and you can switch between these targets in offline and live speech recognition demos. -> **NOTE**: Intel® GNA is a specific low-power coprocessor, which offloads some workloads, thus saving power and CPU resources. If you use a processor supporting the GNA, such as Intel® Core™ i3-8121U and Intel® Core™ i7-1065G7, you can notice that CPU load is much lower when GNA is selected. 
If you selected GNA as a device for inference, and your processor does not support GNA, then execution is performed in the emulation mode (on CPU) because `GNA_AUTO` configuration option is used. +> **NOTE**: Intel® GNA is a specific low-power coprocessor, which offloads some workloads, thus saving power and CPU resources. If you use a processor supporting the GNA, such as Intel® Core™ i3-8121U and Intel® Core™ i7-1065G7, you can notice that CPU load is much lower when GNA is selected. If you selected GNA as a device for inference, and your processor does not support GNA, then execution is performed in the emulation mode (on CPU) because `GNA_AUTO` configuration option is used. > See [the GNA plugin documentation](https://docs.openvinotoolkit.org/latest/_docs_IE_DG_supported_plugins_GNA.html) for more information. Speech Library provides a highly optimized implementation of preprocessing and postprocessing (feature extraction and decoding) on CPU only. @@ -78,7 +78,7 @@ Speech Library provides a highly optimized implementation of preprocessing and p Before running demonstration applications with custom models, follow the steps below: -1. Build the Speech Library and demonstration application using the `demo_speech_recognition.sh/.bat` file mentioned in Run Speech Recognition Demos with Pretrained Models +1. Build the Speech Library and demonstration application using the `demo_speech_recognition.sh/.bat` file mentioned in Run Speech Recognition Demos with Pre-trained Models 2. Train acoustic and statistical language models using the Kaldi framework (if required) 3. [Convert the acoustic model](../../../docs/MO_DG/prepare_model/convert_model/Convert_Model_From_Kaldi.md) using Model Optimizer for Kaldi 4. [Convert the language model](Kaldi_SLM_conversion_tool.md) using the Kaldi toolkit and provided converter diff --git a/inference-engine/samples/speech_sample/README.md b/inference-engine/samples/speech_sample/README.md index 5d10f81c6e5b71..91365bd6c60dde 100644 --- a/inference-engine/samples/speech_sample/README.md +++ b/inference-engine/samples/speech_sample/README.md @@ -24,7 +24,7 @@ Basic Inference Engine API is covered by [Hello Classification C++ sample](../he |:--- |:--- | Validated Models | Acoustic model based on Kaldi\* neural networks (see [Model Preparation](#model-preparation) section) | Model Format | Inference Engine Intermediate Representation (\*.xml + \*.bin), ONNX (\*.onnx) -| Supported devices | See [Execution Modes section](#execution-modes) below and [List Supported Devices](../../../docs/IE_DG/supported_plugins/Supported_Devices.md) | +| Supported devices | See [Execution Modes](#execution-modes) section below and [List Supported Devices](../../../docs/IE_DG/supported_plugins/Supported_Devices.md) | ## How It Works @@ -61,14 +61,14 @@ will be removed in GNA hardware version 3 and higher. Several execution modes are supported via the `-d` flag: -- `CPU` - all calculation will be performed on CPU device using CPU Plugin. -- `GPU` - all calculation will be performed on GPU device using GPU Plugin. -- `MYRIAD` - all calculation will be performed on Intel® Neural Compute Stick 2 device using VPU MYRIAD Plugin. -- `GNA_AUTO` - the GNA hardware is used if available and the driver is installed. Otherwise, the GNA device is emulated in fast-but-not-bit-exact mode. -- `GNA_HW` - the GNA hardware is used if available and the driver is installed. Otherwise, an error will occur. -- `GNA_SW` - deprecated. The GNA device is emulated in fast-but-not-bit-exact mode. 
-- `GNA_SW_FP32` - substitutes parameters and calculations from low precision to floating point (FP32). -- `GNA_SW_EXACT` - the GNA device is emulated in bit-exact mode. +- `CPU` - All calculations are performed on the CPU device using the CPU Plugin. +- `GPU` - All calculations are performed on the GPU device using the GPU Plugin. +- `MYRIAD` - All calculations are performed on the Intel® Neural Compute Stick 2 device using the VPU MYRIAD Plugin. +- `GNA_AUTO` - GNA hardware is used if available and the driver is installed. Otherwise, the GNA device is emulated in fast-but-not-bit-exact mode. +- `GNA_HW` - GNA hardware is used if available and the driver is installed. Otherwise, an error will occur. +- `GNA_SW` - Deprecated. The GNA device is emulated in fast-but-not-bit-exact mode. +- `GNA_SW_FP32` - Substitutes parameters and calculations from low precision to floating point (FP32). +- `GNA_SW_EXACT` - GNA device is emulated in bit-exact mode. #### Loading and Saving Models @@ -137,7 +137,7 @@ Running the application with the empty list of options yields the usage message You can use the following model optimizer command to convert a Kaldi nnet1 or nnet2 neural network to Inference Engine Intermediate Representation format: ```sh -python mo.py --framework kaldi --input_model wsj_dnn5b.nnet --counts wsj_dnn5b.counts --remove_output_softmax +python mo.py --framework kaldi --input_model wsj_dnn5b.nnet --counts wsj_dnn5b.counts --remove_output_softmax --output_dir ``` Assuming that the model optimizer (`mo.py`), Kaldi-trained neural network, `wsj_dnn5b.nnet`, and Kaldi class counts file, `wsj_dnn5b.counts`, are in the working directory, this produces @@ -153,14 +153,16 @@ All of them can be downloaded from [https://storage.openvinotoolkit.org/models_c ### Speech Inference -Once the IR is created, you can use the following command to do inference on Intel^® Processors with the GNA co-processor (or emulation library): +Once the IR is created, you can use the following command to do inference on Intel® Processors with the GNA co-processor (or emulation library): ```sh ./speech_sample -d GNA_AUTO -bs 2 -i dev93_10.ark -m wsj_dnn5b.xml -o scores.ark -r dev93_scores_10.ark ``` Here, the floating point Kaldi-generated reference neural network scores (`dev93_scores_10.ark`) corresponding to the input feature file (`dev93_10.ark`) are assumed to be available -for comparison. All of them can be downloaded from [https://storage.openvinotoolkit.org/models_contrib/speech/2021.2/wsj_dnn5b_smbr](https://storage.openvinotoolkit.org/models_contrib/speech/2021.2/wsj_dnn5b_smbr). Inference Engine Intermediate Representation `wsj_dnn5b.xml` file was generated in the [previous Model preparation section](#model-preparation). +for comparison. + +All of them can be downloaded from [https://storage.openvinotoolkit.org/models_contrib/speech/2021.2/wsj_dnn5b_smbr](https://storage.openvinotoolkit.org/models_contrib/speech/2021.2/wsj_dnn5b_smbr). Inference Engine Intermediate Representation `wsj_dnn5b.xml` file was generated in the previous [Model Preparation](#model-preparation) section. > **NOTES**: > @@ -230,8 +232,7 @@ nnet-forward --use-gpu=no final.feature_transform "ark,s,cs:copy-feats scp:feats ```sh ./speech_sample -d GNA_AUTO -bs 8 -i feat.ark -m wsj_dnn5b.xml -o scores.ark ``` - -Inference Engine Intermediate Representation `wsj_dnn5b.xml` file was generated in the [previous Model preparation section](#model-preparation). 
+The Inference Engine Intermediate Representation file `wsj_dnn5b.xml` was generated in the previous [Model Preparation](#model-preparation) section.

 3. Run the Kaldi decoder to produce n-best text hypotheses and select most likely text given the WFST (`HCLG.fst`), vocabulary (`words.txt`), and TID/PID mapping (`final.mdl`):
diff --git a/inference-engine/thirdparty/fluid/modules/gapi/doc/10-hld-overview.md b/inference-engine/thirdparty/fluid/modules/gapi/doc/10-hld-overview.md
index 557bf08b12e458..6de6efa9216ee5 100644
--- a/inference-engine/thirdparty/fluid/modules/gapi/doc/10-hld-overview.md
+++ b/inference-engine/thirdparty/fluid/modules/gapi/doc/10-hld-overview.md
@@ -142,7 +142,7 @@ Graph execution is triggered in two ways:

-Both methods are polimorphic and take a variadic number of
+Both methods are polymorphic and take a variadic number of
 arguments, with validity checks performed in runtime. If a number, shapes, and
-formats of passed data objects differ from expected, a run-time
+formats of passed data objects differ from expected, a runtime
 exception is thrown. G-API also provides _typed_ wrappers to move
 these checks to the compile time -- see `cv::GComputationT<>`.
diff --git a/inference-engine/tools/compile_tool/README.md b/inference-engine/tools/compile_tool/README.md
index 14b72bb6299f45..0b083e15dc1e1a 100644
--- a/inference-engine/tools/compile_tool/README.md
+++ b/inference-engine/tools/compile_tool/README.md
@@ -1,13 +1,13 @@
 # Compile Tool {#openvino_inference_engine_tools_compile_tool_README}

-Compile tool is a C++ application that enables you to compile a network for inference on a specific device and export it to a binary file. 
+Compile tool is a C++ application that enables you to compile a network for inference on a specific device and export it to a binary file.
 With the Compile Tool, you can compile a network using supported Inference Engine plugins on a machine that doesn't have the physical device connected and then transfer a generated file to any machine with the target inference device available.

 The tool compiles networks for the following target devices using corresponding Inference Engine plugins:

 * Intel® Neural Compute Stick 2 (MYRIAD plugin)

-> **NOTE**: Intel® Distribution of OpenVINO™ toolkit no longer supports the Intel® Vision Accelerator Design with an Intel® Arria® 10 FPGA and the Intel® Programmable Acceleration Card with Intel® Arria® 10 GX FPGA. To compile a network for those devices, use the Compile Tool from the Intel® Distribution of OpenVINO™ toolkit [2020.3 LTS release](https://docs.openvinotoolkit.org/2020.3/_inference_engine_tools_compile_tool_README.html). 
+> **NOTE**: Intel® Distribution of OpenVINO™ toolkit no longer supports the Intel® Vision Accelerator Design with an Intel® Arria® 10 FPGA and the Intel® Programmable Acceleration Card with Intel® Arria® 10 GX FPGA. To compile a network for those devices, use the Compile Tool from the Intel® Distribution of OpenVINO™ toolkit [2020.3 LTS release](https://docs.openvinotoolkit.org/2020.3/_inference_engine_tools_compile_tool_README.html).

 The tool is delivered as an executable file that can be run on both Linux* and Windows*.

@@ -15,7 +15,7 @@ The tool is located in the `/deployment_tools/tools/compile_tool` d

 The workflow of the Compile tool is as follows:

-1. Upon the start, the tool application reads command-line parameters and loads a network to the Inference Engine device.
+1. First, the application reads command-line parameters and loads a network to the Inference Engine device.
 2. The application exports a blob with the compiled network and writes it to the output file.

 ## Run the Compile Tool
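A minimal invocation is sketched below for orientation only; the model name `model.xml`, the output name `model.blob`, and the MYRIAD target are illustrative placeholders rather than values taken from this document:

```sh
# Illustrative sketch only: compile an IR model for the MYRIAD plugin and save the blob.
# -m selects the input model, -d the target device, -o the output file; all values are placeholders.
./compile_tool -m model.xml -d MYRIAD -o model.blob
```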