diff --git a/LICENSE b/LICENSE
index 0fbe7ec..aa28235 100644
--- a/LICENSE
+++ b/LICENSE
@@ -1,4 +1,4 @@
-// Copyright 2020 Rockchip Electronics Co.,Ltd.
+// Copyright 2022 Rockchip Electronics Co.,Ltd.
 // All rights reserved.
 //
 // Redistribution and use in source and binary forms, with or without
diff --git a/README.md b/README.md
index 8cf0d55..fb40727 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,5 @@
 # Introduction
-RKNN-Toolkit2 is a software development kit for users to perform model conversion, inference and performance evaluation on PC and Rockchip NPU platforms (RK3566, RK3568, RK3588, RK3588S, RV1103, RV1106).
+RKNN-Toolkit2 is a software development kit for users to perform model conversion, inference and performance evaluation on PC and Rockchip NPU platforms (RK3566, RK3568, RK3588, RK3588S, RV1103, RV1106, RK3562).
 RKNN-Toolkit-Lite2 provides Python programming interfaces for Rockchip NPU platform (RK3566, RK3568, RK3588, RK3588S) to help users deploy RKNN models and accelerate the implementation of AI applications.
@@ -22,15 +22,14 @@ Note:
 # Notes
 - Currently rknn-toolkit2 is not compatible with [rknn-toolkit](https://github.com/rockchip-linux/rknn-toolkit)
-- Currently only support on Ubuntu 18.04 python 3.6 / Ubuntu 20.04 python 3.8
-- Latest version:1.4.0(Release version)
+- Currently only support on Ubuntu 18.04 python 3.6 / Ubuntu 20.04 python 3.8 / Ubuntu 22.04 python 3.10
+- Latest version:1.5.0(Release version)
 # Feedback and Community Support
-Two ways are followed:
-- [Issues](https://github.com/rockchip-linux/rknn-toolkit2/issues)
+- [Redmine](https://redmine.rock-chips.com) (**Feedback recommended, Please consult our sales or FAE for the redmine account**)
 - QQ Group Chat: 1025468710 (full, please join group 2)
 - QQ Group Chat2: 547021958
-
+
\ No newline at end of file
diff --git a/Rockchip_Quick_Start_RKNN_SDK_V1.4.0_CN.pdf b/Rockchip_Quick_Start_RKNN_SDK_V1.5.0_CN.pdf
similarity index 89%
rename from Rockchip_Quick_Start_RKNN_SDK_V1.4.0_CN.pdf
rename to Rockchip_Quick_Start_RKNN_SDK_V1.5.0_CN.pdf
index 67eb9ed..a6dc278 100644
Binary files a/Rockchip_Quick_Start_RKNN_SDK_V1.4.0_CN.pdf and b/Rockchip_Quick_Start_RKNN_SDK_V1.5.0_CN.pdf differ
diff --git a/doc/RKNNToolKit2_API_Difference_With_Toolkit1-1.4.0.md b/doc/RKNNToolKit2_API_Difference_With_Toolkit1-1.5.0.md
similarity index 94%
rename from doc/RKNNToolKit2_API_Difference_With_Toolkit1-1.4.0.md
rename to doc/RKNNToolKit2_API_Difference_With_Toolkit1-1.5.0.md
index 3888504..ff609b6 100644
--- a/doc/RKNNToolKit2_API_Difference_With_Toolkit1-1.4.0.md
+++ b/doc/RKNNToolKit2_API_Difference_With_Toolkit1-1.5.0.md
@@ -56,7 +56,10 @@
 float_dtype='float16', # new
 optimization_level=3,
 custom_string=None, # new
- output_tensor_type=None) # new
+ remove_weight=False, # new
+ compress_weight=False, # new
+ inputs_yuv_fmt=None, # new
+ single_core_mode=False) # new
 - In addition to the above abandoned/new items, there are other differences:
@@ -68,7 +71,7 @@
 toolkit2: normal(default), mmse
 target_platform:
 toolkit1: rk1808, rk3399pro, rv1109, rv1126
- toolkit2: rk3566, rk3568, rk3588
+ toolkit2: rk3566, rk3568, rk3588, rk3588s, rv1103, rv1106, rk3562 and newer
 ## rknn.load_tensorflow
 - Toolkit1:
@@ -257,7 +260,7 @@
 target:
 toolkit1: None(simulator), RK3399Pro, RK1808
- toolkit2: None(simulator), RK3566, RK3568, RK3588
+ toolkit2: None(simulator), RK3566, RK3568, RK3588, RK3562
@@ -280,16 +283,14 @@
 ## rknn.eval_perf
 - Toolkit1:
- eval_perf(inputs=None,
+ eval_perf(inputs=None, # abandoned
 data_type=None, # abandoned
- data_format=None,
+ data_format=None, # abandoned
 is_print=True,
 loop_cnt=1) # abandoned
 - Toolkit2:
- eval_perf(inputs=None,
- data_format=None,
- is_print=True)
+ eval_perf(is_print=True)
 ## rknn.export_rknn_precompile_model
diff --git
a/doc/RKNNToolKit2_OP_Support-1.4.0.md b/doc/RKNNToolKit2_OP_Support-1.4.0.md deleted file mode 100644 index a1704f2..0000000 --- a/doc/RKNNToolKit2_OP_Support-1.4.0.md +++ /dev/null @@ -1,533 +0,0 @@ -# RKNNToolkit2 OPs Support - -## Explanation of terms: - -**Remarks**: - - Operators' specifications must meet the remarks' requirements. - -**Broadcast rule**: - -- per-layer: - - shape(A) = (2, 3, 4, 5), shape(B) = (,), i.e. B is a scalar ==> shape(result) = (2, 3, 4, 5) - -- per-channel: - - shape(A) = (2, 3, 4, 5), shape(B) = (3,), ==> shape(result) = (2, 3, 4, 5) - - shape(A) = (2, 3, 4, 5), shape(B) = (1,3,1,1), ==> shape(result) = (2, 3, 4, 5) - -- per-element: - - shape(A) = (2, 3, 4, 5), shape(B) = (2,3,4,5) ==> shape(result) = (2, 3, 4, 5) - -- other: - - shape(A) = (2, 3, 4, 5), shape(B) = (5,) ==> shape(result) = (2, 3, 4, 5) - -**Input Size Restrictions Description** - - -Assuming that input size is [N,H,W,C] (layout is NHWC) - -- Case 1: the first layer is **Convolution**, whose kernel size is [kernel_height, kernel_width] - - **W * kernel_height < 7168** - - **kernel_height * kernel_width < 128** - - -- Case 2: first layer is not Convolution, and C == 1 or C == 3 or C == 4 - - **W < 7168** - -- others: - - **No Restrictions** - - - - - - -## ONNX OPs supported by RKNN Toolkit2 - -According to [ONNX official instructions](https://github.com/microsoft/onnxruntime/blob/master/docs/Versioning.md 'ONNX Version Description'), the corresponding ONNX opset version is 12. -The list of ONNX OPs supported by RKNN Toolkit2 is as follows: -
(For more restrictions, please refer to [RKNN_Compiler_Support_Operator_List.pdf](https://github.com/rockchip-linux/rknpu2/tree/master/doc)) - -| **Operators** | **Remarks** | -| --------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Abs | Not Supported | -| Acos | Not Supported | -| Acosh | Not Supported | -| Add | | -| And | Not Supported | -| ArgMax | | -| ArgMin | | -| Asin | Not Supported | -| Asinh | Not Supported | -| Atan | Not Supported | -| Atanh | Not Supported | -| AveragePool | **NPU Limit:**
channel: [1, 8192]
stride height/width: [1, 8]
pad left/right/top/bottom: [0, 7]
auto_pad: NOTSET
count_include_pad: 1
ceil_mode: 0 | -| BatchNormalization | **NPU Limit:**
channel: [1, 8192]
height: [1, 8192]
width: [1, 8176] | -| BitShift | Not Supported | -| Cast | only support bool/int8/float | -| Ceil | Not Supported | -| Celu | Not Supported | -| Clip | | -| Compress | Not Supported | -| Concat | | -| ConcatFromSequence | Not Supported | -| Constant | | -| ConstantOfShape | | -| Conv | **NPU Limit:**
kernel height/width: [1, 31]
stride height/width: [1, 7]
pad left/right/top/bottom: [0, 15] | -| ConvInteger | Not Supported | -| ConvTranspose | **NPU Limit:**
kernel height/width: [1, 31]
stride height/width: 2, 4, 8
pad left/right/top/bottom: [0, 15] | -| Cos | Not Supported | -| Cosh | Not Supported | -| CumSum | Not Supported | -| DepthToSpace | | -| DequantizeLinear | | -| Det | | -| Div | **NPU Limit:**
support broadcast rule: per-element/other | -| Dropout | | -| Einsum | Not Supported | -| Elu | channel: [1, 8192]
height: [1, 8192]
width: [1, 8176]
| -| Equal | | -| Erf | Not Supported | -| Exp | | -| Expand | Not Supported | -| EyeLike | only support constant input | -| Flatten | | -| Floor | Not Supported | -| GRU | batchsize: 1 | -| Gather | | -| GatherElements | Not Supported | -| GatherND | Not Supported | -| Gemm | | -| GlobalAveragePool | channel: [1, 8192]
kernel height/width: [1, 343]
| -| GlobalLpPool | Not Supported | -| GlobalMaxPool | channel: [1, 8192]
kernel height/width: [1, 343]
| -| Greater | **NPU Limit:**
support broadcast rule: per-element/other | -| GreaterOrEqual | | -| HardSigmoid | | -| HardSwish | | -| Hardmax | Not Supported | -| Identity | | -| If | only support constant input | -| InstanceNormalization | | -| IsInf | Not Supported | -| IsNaN | Not Supported | -| LRN | | -| LSTM | batchsize: 1
input_forget: 0 | -| LeakyRelu | | -| Less | **NPU Limit:**
support broadcast rule: per-element/other | -| LessOrEqual | | -| Log | Not Supported | -| LogSoftmax | batchsize: 1 | -| Loop | Not Supported | -| LpNormalization | | -| LpPool | Not Supported | -| MatMul | | -| MatMulInteger | Not Supported | -| Max | **NPU Limit:**
channel: [1, 8192]
height: [1, 8192]
width: [1, 8176] | -| MaxPool | **NPU Limit:**
channel: [1, 8192]
stride height/width: [1, 8]
pad left/right/top/bottom: [0, 7]
auto_pad: NOTSET
ceil_mode: 0
dilations: 1
storage_order: 0 | -| MaxRoiPool | | -| MaxUnpool | | -| Mean | Not Supported | -| Min | **NPU Limit:**
channel: [1, 8192]
height: [1, 8192]
width: [1, 8176] | -| Mod | Not Supported | -| Mul | **NPU Limit:**
support broadcast rule: per-layer/channel/element | -| Multinomial | Not Supported | -| Neg | Not Supported | -| NonMaxSuppression | Not Supported | -| NonZero | Not Supported | -| Not | Not Supported | -| OneHot | Not Supported | -| Or | Not Supported | -| PRelu | slope support broadcast rule: per-layer/channel | -| Pad | **NPU Limit:**
width: [1, 8176]
mode: constant
pads n_begin/n_end/c_begin/c_end: 1 | -| Pow | | -| QLinearConv | Not Supported | -| QLinearMatMul | Not Supported | -| QuantizeLinear | | -| RNN | Not Supported | -| RandomNormal | Not Supported | -| RandomNormalLike | Not Supported | -| RandomUniform | Not Supported | -| RandomUniformLike | Not Supported | -| Range | Not Supported | -| Reciprocal | Not Supported | -| ReduceL1 | Not Supported | -| ReduceL2 | Not Supported | -| ReduceLogSum | Not Supported | -| ReduceLogSumExp | Not Supported | -| ReduceMax | | -| ReduceMean | **NPU Limit:**
channel: [1, 8192]
height: [1, 8192]
width: [1, 8192] | -| ReduceMin | | -| ReduceProd | Not Supported | -| ReduceSum | **NPU Limit:**
channel: [1, 8192]
height: [1, 8192]
width: [1, 8192] | -| ReduceSumSquare | | -| Relu | | -| Reshape | **NPU Limit:**
channel: [1, 8192]
height: [1, 8192]
width: [1, 8176] | -| Resize | **NPU Limit:**
channel: [1, 8192]
height: [1, 8192]
width: [1, 8176]
scales: [1, 8] | -| ReverseSequence | | -| RoiAlign | pool type: average
batchsize: 1 | -| Round | Not Supported | -| Scan | Not Supported | -| ScatterElements | Not Supported | -| ScatterND | Not Supported | -| Selu | Not Supported | -| SequenceAt | Not Supported | -| SequenceConstruct | Not Supported | -| SequenceEmpty | Not Supported | -| SequenceErase | Not Supported | -| SequenceInsert | Not Supported | -| SequenceLength | Not Supported | -| Shape | | -| Shrink | Not Supported | -| Sigmoid | | -| Sign | Not Supported | -| Sin | Not Supported | -| Sinh | Not Supported | -| Size | | -| Slice | batchsize: 1
**NPU Limit:**
steps: 1 | -| Softmax | batchsize: 1
**NPU Limit:**
channel: [1, 8192]
axis: 1 | -| Softplus | | -| Softsign | Not Supported | -| SpaceToDepth | | -| Split | | -| SplitToSequence | Not Supported | -| Sqrt | | -| Squeeze | | -| StringNormalizer | Not Supported | -| Sub | **NPU Limit:**
support broadcast rule: per-layer/channel/element | -| Sum | Not Supported | -| Tan | Not Supported | -| Tanh | | -| TfIdfVectorizer | Not Supported | -| ThresholdedRelu | Not Supported | -| Tile | batchsize: 1
not support broadcast | -| TopK | Not Supported | -| Transpose | **NPU Limit:**
channel: [1, 8192]
height: [1, 8192]
width: [1, 8176] | -| Trilu | Not Supported | -| Unique | Not Supported | -| Unsqueeze | | -| Where | | -| Xor | Not Supported | | | - - -## Caffe OPs supported by RKNN Toolkit2 - -Caffe protocols RKNN Toolkit2 uses only based on the officially modified protocol of berkeley. -The protocol based on the official revision of berkeley comes from [berkeley caffe](https://github.com/BVLC/caffe/tree/master/src/caffe/proto 'Berkeley Caffe'), commit hash is 21d0608. On this basis RKNN Toolkit2 have added some OPs. -Based on this protocol, the list of Caffe OPs supported by RKNN Toolkit2 is as follows: - -| **Operators** | **Remarks** | -| ---------------------- | ------------------------------------------------------------------------------------------------------------- | -| BatchNorm | same as onnx BatchNormalization | -| bn (BatchNorm + Scale) | same as onnx BatchNormalization according to https://github.com/TimoSaemann/caffe-segnet-cudnn5 | -| BNLL | | -| Concat | same as onnx Concat | -| Convolution | same as onnx Conv | -| ConvolutionDepthwise | kernel height/width: [1, 8]
others same as onnx Conv | -| Crop | | -| Deconvolution | same as ConvTranspose | -| Dropout | | -| Eltwise | support broadcast rule: per-layer/channel/element | -| Flatten | | -| HardSigmoid | | -| InnerProduct | same as onnx Gemm | -| LRN | same as onnx LRN | -| Lstm | same as onnx LSTM according to https://github.com/xmfbit/warpctc-caffe | -| Normalize | | -| Permute | same as onnx Transpose | -| Power | | -| Pooling | same as onnx pooling | -| PRelu | same as onnx PRelu | -| Proposal | batch: 1 | -| Reduction | output dims <= 4 | -| Relu | same as onnx Relu | -| Relu6 | same as onnx Clip | -| Reorg | | -| Reshape | same as onnx Reshape | -| Resize | bilinear; nearest | -| Reverse | | -| ROIPooling | same as MaxRoiPool according to https://github.com/twmht/caffe-pva-faster-rcnn | -| Scale | same as onnx Mul | -| Sigmoid | same as onnx Sigmoid | -| Slice | same as onnx Split | -| Softmax | same as onnx Softmax | -| Split | same as onnx Slice | -| TanH | same as onnx TanH | -| Tile | same as onnx Tile | -| Transpose | same as onnx Transpose | -| Upsample | according to https://github.com/SeanQ88/caffe_upsample and https://github.com/TimoSaemann/caffe-segnet-cudnn5 | - - -## Pytorch OPs supported by RKNN Toolkit2 - -The Pytorch version supported by RKNN Toolkit2 is >1.6.0, models generated by other versions may not support. 
-The list of Pytorch OPs supported by RKNN Toolkit2 is as follows: - -| **Operators** | **Remarks** | -| ----------------------------- | ---------------------------------------------------------------------------------- | -| aten::_convolution | same as onnx Conv | -| aten::abs | Not supported | -| aten::abs_ | Not supported | -| aten::adaptive_avg_pool1d | Not supported | -| aten::adaptive_avg_pool2d | same as onnx AveragePool | -| aten::adaptive_max_pool1d | Not supported | -| aten::adaptive_max_pool2d | same as onnx MaxPool | -| aten::add | same as onnx Add | -| aten::add_ | | -| aten::addmm | same as onnx Gemm | -| aten::affine_grid_generator | Not supported | -| aten::alpha_dropout | | -| aten::alpha_dropout_ | Not supported | -| aten::arange | Not supported | -| aten::avg_pool1d | Not supported | -| aten::avg_pool2d | same as onnx AveragePool | -| aten::avg_pool3d | Not supported | -| aten::batch_norm | same as onnx BatchNormalization | -| aten::bmm | same as onnx MatMul | -| aten::cat | same as onnx Concat | -| aten::celu | Not supported | -| aten::celu_ | Not supported | -| aten::chunk | | -| aten::clamp | | -| aten::clamp_ | | -| aten::clamp_max | Not supported | -| aten::clamp_max_ | Not supported | -| aten::clamp_min | Not supported | -| aten::clamp_min_ | Not supported | -| aten::clone | | -| aten::constant_pad_nd | same as onnx Pad | -| aten::contiguous | | -| aten::copy | | -| aten::cos | Not supported | -| aten::cos_ | Not supported | -| aten::cumsum | Not supported | -| aten::detach | | -| aten::detach_ | Not supported | -| aten::div | same as onnx Div | -| aten::div_ | | -| aten::dropout | | -| aten::dropout_ | | -| aten::einsum | Not supported | -| aten::elu | same as onnx Elu | -| aten::elu_ | | -| aten::embedding | same as onnx Gather | -| aten::empty | | -| aten::eq | Not supported | -| aten::eq_ | Not supported | -| aten::erf | Not supported | -| aten::erf_ | Not supported | -| aten::erfc | Not supported | -| aten::erfc_ | Not supported | -| 
aten::exp | | -| aten::exp_ | | -| aten::expand | Not supported | -| aten::expand_as | Not supported | -| aten::expm1 | Not supported | -| aten::expm1_ | Not supported | -| aten::feature_dropout | | -| aten::feature_dropout_ | Not supported | -| aten::flatten | | -| aten::floor | Not supported | -| aten::floor_ | Not supported | -| aten::floor_divide | Not supported | -| aten::floor_divide_ | Not supported | -| aten::gather | Not supported | -| aten::ge | Not supported | -| aten::ge_ | Not supported | -| aten::gelu | | -| aten::gelu_ | Not supported | -| aten::grid_sampler | Not supported | -| aten::gru | | -| aten::gt | | -| aten::gt_ | Not supported | -| aten::hardshrink | Not supported | -| aten::hardshrink_ | Not supported | -| aten::hardswish | same as onnx HardSwish | -| aten::hardswish_ | | -| aten::hardtanh | | -| aten::hardtanh_ | | -| aten::index | Not supported | -| aten::index_put | Not supported | -| aten::index_put_ | Not supported | -| aten::instance_norm | same as onnx InstanceNormalization | -| aten::Int | | -| aten::layer_norm | **NPU Limit**
channel: [1, 8192]
height: [1, 8192]
width: [1, 8192] | -| aten::le | Not supported | -| aten::le_ | Not supported | -| aten::leaky_relu | same as onnx LeakyRelu | -| aten::leaky_relu_ | | -| aten::lerp | Not supported | -| aten::lerp_ | Not supported | -| aten::log | Not supported | -| aten::log_ | Not supported | -| aten::log10 | Not supported | -| aten::log10_ | Not supported | -| aten::log1p | Not supported | -| aten::log1p_ | Not supported | -| aten::log2 | Not supported | -| aten::log2_ | Not supported | -| aten::log_sigmoid | Not supported | -| aten::log_softmax | Not supported | -| aten::linear | same as onnx Gemm | -| aten::lstm | same as onnx LSTM | -| aten::lt | | -| aten::lt_ | Not supported | -| aten::matmul | same as onnx MatMul | -| aten::max | | -| aten::max_ | Not supported | -| aten::max_pool1d | same as onnx MaxPool | -| aten::max_pool1d_with_indices | | -| aten::max_pool2d | same as onnx MaxPool | -| aten::max_pool2d_with_indices | | -| aten::mean | same as onnx ReduceMean | -| aten::meshgrid | Not supported | -| aten::min | | -| aten::min_ | Not supported | -| aten::mm | same as onnx MatMul | -| aten::mul | same as onnx Mul | -| aten::mul_ | | -| aten::narrow | same as onnx Slice | -| aten::ne | Not supported | -| aten::ne_ | Not supported | -| aten::neg | Not supported | -| aten::neg_ | Not supported | -| aten::new_full | Not supported | -| aten::new_zeros | Not supported | -| aten::nonzero | Not supported | -| aten::norm | Not supported | -| aten::ones | | -| aten::ones_like | | -| aten::pad | Not supported | -| aten::permute | same as onnx Transpose | -| aten::pow | | -| aten::pow_ | Not supported | -| aten::prelu | same as onnx PRelu | -| aten::prelu_ | Not supported | -| aten::reciprocal | | -| aten::reciprocal_ | Not supported | -| aten::reflection_pad1d | | -| aten::reflection_pad2d | | -| aten::relu | same as onnx Relu | -| aten::relu_ | | -| aten::repeat | | -| aten::reshape | | -| aten::reshape_ | Not supported | -| torchvision::roi_align | Not supported | -| aten::rsqrt 
| Not supported | -| aten::rsqrt_ | Not supported | -| aten::ScalarImplicit | | -| aten::select | | -| aten::selu | Not supported | -| aten::selu_ | Not supported | -| aten::sigmoid | same as onnx Sigmoid | -| aten::sigmoid_ | | -| aten::silu | | -| aten::silu_ | | -| aten::sin | Not supported | -| aten::sin_ | Not supported | -| aten::size | | -| aten::slice | same as onnx Slice | -| aten::softmax | same as onnx Softmax | -| aten::softplus | | -| aten::softshrink | Not supported | -| aten::sort | Not supported | -| aten::split | same as onnx Split | -| aten::split_with_sizes | | -| aten::sqrt | Not supported | -| aten::sqrt_ | Not supported | -| aten::squeeze | | -| aten::squeeze_ | Not supported | -| aten::stack | | -| aten::sub | same as onnx Sub | -| aten::sub_ | | -| aten::sum | same as onnx ReduceSum | -| aten::t | | -| aten::t_ | Not supported | -| aten::tanh | | -| aten::tanh_ | | -| aten::threshold | | -| aten::threshold_ | | -| aten::to | | -| aten::topk | Not supported | -| aten::transpose | | -| aten::transpose_ | | -| aten::true_divide | same as onnx Div | -| aten::true_divide_ | Not supported | -| aten::type_as | | -| aten::unfold | Not supported | -| aten::unsqueeze | | -| aten::upsample_bilinear2d | | -| aten::upsample_nearest2d | | -| aten::view | | -| aten::view_ | Not supported | -| aten::view_as | Not supported | -| aten::view_as_ | Not supported | -| aten::zero_ | Not supported | -| aten::zeros | | -| aten::zeros_like | | - -## TensorFlow OPs supported by RKNN Toolkit2 - -The pb files (contain OPs belows) generated by TensorFlow version 1.12 - 1.15 for 1.x and 2.3 - 2.5 for 2.x are supported by RKNN Toolkit2. For more information on TensorFlow version compatibility, please refer to [tensorflow official instructions on OP version](https://www.tensorflow.org/guide/versions 'Tensorflow official instructions on OP version') . 
-The list of TensorFlow OPs supported by RKNN Toolkit2 is as follows: - -| **Operators** | **Remarks** | -| --------------------- | --------------------------------------------------------------- | -| Add | same as onnx Add | -| AvgPool | same as onnx AveragePool | -| Concat | same as onnx Concat | -| Conv2D | same as onnx Conv | -| DepthToSpace | | -| DepthwiseConv2d | kernel height/width: [1, 8]
others same as onnx Conv | -| Div | same as onnx Div | -| Dropout | | -| Flatten | | -| LeakyRelu | same as onnx LeakyRelu | -| Less | same as onnx Less | -| LRN | | -| MatMul | | -| MaxPool | same as onnx MaxPool | -| Mean | output dims <= 4 | -| Pad | same as onnx Pad | -| Relu | same as onnx Relu | -| Reshape | | -| ResizeBilinear | | -| ResizeNearestNeighbor | | -| Sigmoid | | -| Slice | | -| Softmax | | -| Softplus | channel: [1, 8192]
height: [1, 8192]
width: [1, 8176] | -| SpaceToDepth | | -| Split | | -| Squeeze | | -| StridedSlice | | -| Tanh | same as onnx TanH | -| Transpose | | - -## Darknet OPs supported by RKNN Toolkit2 -The list of Darknet OPs supported by RKNN Toolkit2 is as follows: - -| **Operators** | **Remarks** | -| ----------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| add | same as onnx Add | -| batchnormalize | same as onnx BatchNormalization | -| concat | same as onnx Concat | -| convolutional | same as onnx Conv | -| depthwise_convolutional | kernel height/width: [1, 8]
others same as onnx Conv | -| fullconnect | | -| leakyrelu | same as onnx LeakyRelu | -| mish | | -| pooling | **AveragePool**: same as onnx AveragePool
**GlobalAveragePool**: same as onnx GlobalAveragePool
**MaxPool/GlobalMaxPool**: same as onnx MaxPool/GlobalMaxPool | -| route | | -| shortcut | | -| softmax | | -| upsampling | | \ No newline at end of file diff --git a/doc/RKNNToolKit2_OP_Support-1.5.0.md b/doc/RKNNToolKit2_OP_Support-1.5.0.md new file mode 100644 index 0000000..26fe970 --- /dev/null +++ b/doc/RKNNToolKit2_OP_Support-1.5.0.md @@ -0,0 +1,492 @@ +# RKNNToolkit2 OPs Support + +## ONNX OPs supported by RKNN Toolkit2 + +According to [ONNX official instructions](https://github.com/microsoft/onnxruntime/blob/master/docs/Versioning.md 'ONNX Version Description'), the corresponding ONNX opset version is 12. +The list of ONNX OPs supported by RKNN Toolkit2 is as follows: +
(For more restrictions, please refer to [RKNN_Compiler_Support_Operator_List.pdf](https://github.com/rockchip-linux/rknpu2/tree/master/doc)) + +| **Operators** | **Remarks** | +| --------------------- | --------------------------------------- | +| Abs | Not Supported | +| Acos | Not Supported | +| Acosh | Not Supported | +| Add | | +| And | Not Supported | +| ArgMax | | +| ArgMin | | +| Asin | Not Supported | +| Asinh | Not Supported | +| Atan | Not Supported | +| Atanh | Not Supported | +| AveragePool | | +| BatchNormalization | | +| BitShift | Not Supported | +| Cast | | +| Ceil | Not Supported | +| Celu | Not Supported | +| Clip | | +| Compress | Not Supported | +| Concat | | +| ConcatFromSequence | Not Supported | +| Constant | | +| ConstantOfShape | | +| Conv | | +| ConvInteger | Not Supported | +| ConvTranspose | | +| Cos | | +| Cosh | Not Supported | +| CumSum | Not Supported | +| DepthToSpace | | +| DequantizeLinear | | +| Det | Not Supported | +| Div | | +| Dropout | | +| Einsum | Not Supported | +| Elu | | +| Equal | | +| Erf | Not Supported | +| Exp | | +| Expand | Not Supported | +| EyeLike | only support constant input | +| Flatten | | +| Floor | Not Supported | +| GRU | batchsize: 1 | +| Gather | | +| GatherElements | Not Supported | +| GatherND | Not Supported | +| Gemm | | +| GlobalAveragePool | | +| GlobalLpPool | Not Supported | +| GlobalMaxPool | | +| Greater | | +| GreaterOrEqual | | +| HardSigmoid | | +| HardSwish | | +| Hardmax | Not Supported | +| Identity | | +| If | only support constant input | +| InstanceNormalization | | +| IsInf | Not Supported | +| IsNaN | Not Supported | +| LRN | | +| LSTM | | +| LeakyRelu | | +| Less | | +| LessOrEqual | | +| Log | Not Supported | +| LogSoftmax | batchsize: 1 | +| Loop | Not Supported | +| LpNormalization | | +| LpPool | Not Supported | +| MatMul | | +| MatMulInteger | Not Supported | +| Max | | +| MaxPool | | +| MaxRoiPool | | +| MaxUnpool | | +| Mean | Not Supported | +| Min | | +| Mod | Not 
Supported | +| Mul | | +| Multinomial | Not Supported | +| Neg | Not Supported | +| NonMaxSuppression | Not Supported | +| NonZero | Not Supported | +| Not | Not Supported | +| OneHot | Not Supported | +| Or | Not Supported | +| PRelu | | +| Pad | | +| Pow | | +| QLinearConv | Not Supported | +| QLinearMatMul | Not Supported | +| QuantizeLinear | | +| RNN | Not Supported | +| RandomNormal | Not Supported | +| RandomNormalLike | Not Supported | +| RandomUniform | Not Supported | +| RandomUniformLike | Not Supported | +| Range | Not Supported | +| Reciprocal | Not Supported | +| ReduceL1 | Not Supported | +| ReduceL2 | Not Supported | +| ReduceLogSum | Not Supported | +| ReduceLogSumExp | Not Supported | +| ReduceMax | | +| ReduceMean | | +| ReduceMin | | +| ReduceProd | Not Supported | +| ReduceSum | | +| ReduceSumSquare | Not Supported | +| Relu | | +| Reshape | | +| Resize | mode: nearest2d/bilinear | +| ReverseSequence | | +| RoiAlign | pool type: average
batchsize: 1 | +| Round | Not Supported | +| Scan | Not Supported | +| ScatterElements | Not Supported | +| ScatterND | | +| Selu | Not Supported | +| SequenceAt | Not Supported | +| SequenceConstruct | Not Supported | +| SequenceEmpty | Not Supported | +| SequenceErase | Not Supported | +| SequenceInsert | Not Supported | +| SequenceLength | Not Supported | +| Shape | | +| Shrink | Not Supported | +| Sigmoid | | +| Sign | Not Supported | +| Sin | | +| Sinh | Not Supported | +| Size | | +| Slice | batchsize: 1 | +| Softmax | batchsize: 1 | +| Softplus | | +| Softsign | Not Supported | +| SpaceToDepth | | +| Split | | +| SplitToSequence | Not Supported | +| Sqrt | | +| Squeeze | | +| StringNormalizer | Not Supported | +| Sub | | +| Sum | Not Supported | +| Tan | Not Supported | +| Tanh | | +| TfIdfVectorizer | Not Supported | +| ThresholdedRelu | Not Supported | +| Tile | batchsize: 1
not support broadcast | +| TopK | Not Supported | +| Transpose | | +| Trilu | Not Supported | +| Unique | Not Supported | +| Unsqueeze | | +| Where | | +| Xor | Not Supported | + +## Pytorch OPs supported by RKNN Toolkit2 + +The Pytorch version supported by RKNN Toolkit2 is >1.6.0, models generated by other versions may not support. +The list of Pytorch OPs supported by RKNN Toolkit2 is as follows: + +| **Operators** | **Remarks** | +| ----------------------------- | ---------------------------------- | +| aten::_convolution | same as onnx Conv | +| aten::abs | Not supported | +| aten::abs_ | Not supported | +| aten::adaptive_avg_pool1d | Not supported | +| aten::adaptive_avg_pool2d | same as onnx AveragePool | +| aten::adaptive_max_pool1d | Not supported | +| aten::adaptive_max_pool2d | same as onnx MaxPool | +| aten::add | same as onnx Add | +| aten::add_ | | +| aten::addmm | same as onnx Gemm | +| aten::affine_grid_generator | Not supported | +| aten::alpha_dropout | | +| aten::alpha_dropout_ | Not supported | +| aten::arange | Not supported | +| aten::avg_pool1d | Not supported | +| aten::avg_pool2d | same as onnx AveragePool | +| aten::avg_pool3d | Not supported | +| aten::batch_norm | same as onnx BatchNormalization | +| aten::bmm | same as onnx MatMul | +| aten::cat | same as onnx Concat | +| aten::celu | Not supported | +| aten::celu_ | Not supported | +| aten::chunk | | +| aten::clamp | | +| aten::clamp_ | | +| aten::clamp_max | Not supported | +| aten::clamp_max_ | Not supported | +| aten::clamp_min | | +| aten::clamp_min_ | Not supported | +| aten::clone | | +| aten::constant_pad_nd | same as onnx Pad | +| aten::contiguous | | +| aten::copy | | +| aten::cos | Not supported | +| aten::cos_ | Not supported | +| aten::cumsum | Not supported | +| aten::detach | | +| aten::detach_ | Not supported | +| aten::div | same as onnx Div | +| aten::div_ | | +| aten::dropout | | +| aten::dropout_ | | +| aten::einsum | Not supported | +| aten::elu | same as onnx Elu | 
+| aten::elu_ | | +| aten::embedding | same as onnx Gather | +| aten::empty | | +| aten::eq | Not supported | +| aten::eq_ | Not supported | +| aten::erf | Not supported | +| aten::erf_ | Not supported | +| aten::erfc | Not supported | +| aten::erfc_ | Not supported | +| aten::exp | | +| aten::exp_ | | +| aten::expand | | +| aten::expand_as | Not supported | +| aten::expm1 | Not supported | +| aten::expm1_ | Not supported | +| aten::feature_dropout | | +| aten::feature_dropout_ | Not supported | +| aten::flatten | | +| aten::flip | Not supported | +| aten::floor | Not supported | +| aten::floor_ | Not supported | +| aten::floor_divide | Not supported | +| aten::floor_divide_ | Not supported | +| aten::gather | Not supported | +| aten::ge | Not supported | +| aten::ge_ | Not supported | +| aten::gelu | | +| aten::gelu_ | Not supported | +| aten::grid_sampler | Not supported | +| aten::gru | | +| aten::gt | | +| aten::gt_ | Not supported | +| aten::hardshrink | Not supported | +| aten::hardshrink_ | Not supported | +| aten::hardswish | same as onnx HardSwish | +| aten::hardswish_ | | +| aten::hardtanh | | +| aten::hardtanh_ | | +| aten::index | Not supported | +| aten::index_put | Not supported | +| aten::index_put_ | Not supported | +| aten::instance_norm | same as onnx InstanceNormalization | +| aten::Int | | +| aten::layer_norm | | +| aten::le | Not supported | +| aten::le_ | Not supported | +| aten::leaky_relu | same as onnx LeakyRelu | +| aten::leaky_relu_ | | +| aten::lerp | Not supported | +| aten::lerp_ | Not supported | +| aten::log | Not supported | +| aten::log_ | Not supported | +| aten::log10 | Not supported | +| aten::log10_ | Not supported | +| aten::log1p | Not supported | +| aten::log1p_ | Not supported | +| aten::log2 | Not supported | +| aten::log2_ | Not supported | +| aten::log_sigmoid | Not supported | +| aten::log_softmax | Not supported | +| aten::linear | same as onnx Gemm | +| aten::lstm | same as onnx LSTM | +| aten::lt | | +| aten::lt_ | 
Not supported | +| aten::matmul | same as onnx MatMul | +| aten::max | | +| aten::maximum | | +| aten::max_ | Not supported | +| aten::max_pool1d | same as onnx MaxPool | +| aten::max_pool1d_with_indices | | +| aten::max_pool2d | same as onnx MaxPool | +| aten::max_pool2d_with_indices | | +| aten::mean | same as onnx ReduceMean | +| aten::meshgrid | Not supported | +| aten::min | | +| aten::minimum | | +| aten::min_ | Not supported | +| aten::mish | | +| aten::mm | same as onnx MatMul | +| aten::mul | same as onnx Mul | +| aten::mul_ | | +| aten::narrow | same as onnx Slice | +| aten::ne | | +| aten::ne_ | Not supported | +| aten::neg | Not supported | +| aten::neg_ | Not supported | +| aten::new_full | Not supported | +| aten::new_zeros | Not supported | +| aten::nonzero | Not supported | +| aten::norm | Not supported | +| aten::ones | | +| aten::ones_like | | +| aten::pad | Not supported | +| aten::permute | same as onnx Transpose | +| aten::pow | | +| aten::pow_ | Not supported | +| aten::prelu | same as onnx PRelu | +| aten::prelu_ | Not supported | +| aten::prod | | +| aten::reciprocal | | +| aten::reciprocal_ | Not supported | +| aten::reflection_pad1d | | +| aten::reflection_pad2d | | +| aten::relu | same as onnx Relu | +| aten::relu6 | same as onnx Relu | +| aten::relu_ | | +| aten::relu6_ | | +| aten::repeat | | +| aten::reshape | | +| aten::reshape_ | Not supported | +| torchvision::roi_align | Not supported | +| aten::rsqrt | Not supported | +| aten::rsqrt_ | Not supported | +| aten::ScalarImplicit | | +| aten::select | | +| aten::selu | Not supported | +| aten::selu_ | Not supported | +| aten::sigmoid | same as onnx Sigmoid | +| aten::sigmoid_ | | +| aten::silu | | +| aten::silu_ | | +| aten::sin | Not supported | +| aten::sin_ | Not supported | +| aten::size | | +| aten::slice | same as onnx Slice | +| aten::softmax | same as onnx Softmax | +| aten::softplus | | +| aten::softshrink | Not supported | +| aten::sort | Not supported | +| aten::split | same 
as onnx Split | +| aten::split_with_sizes | | +| aten::sqrt | Not supported | +| aten::sqrt_ | Not supported | +| aten::squeeze | | +| aten::squeeze_ | Not supported | +| aten::stack | | +| aten::sub | same as onnx Sub | +| aten::sub_ | | +| aten::sum | same as onnx ReduceSum | +| aten::t | | +| aten::t_ | Not supported | +| aten::tanh | | +| aten::tanh_ | | +| aten::threshold | | +| aten::threshold_ | | +| aten::to | | +| aten::topk | Not supported | +| aten::transpose | | +| aten::transpose_ | | +| aten::true_divide | same as onnx Div | +| aten::true_divide_ | Not supported | +| aten::type_as | | +| aten::unfold | Not supported | +| aten::unsqueeze | | +| aten::upsample_bilinear2d | | +| aten::upsample_nearest2d | | +| aten::view | | +| aten::view_ | Not supported | +| aten::view_as | Not supported | +| aten::view_as_ | Not supported | +| aten::where | | +| aten::zero_ | Not supported | +| aten::zeros | | +| aten::zeros_like | | + + + + +## Caffe OPs supported by RKNN Toolkit2 + +RKNN Toolkit2 supports only the Caffe protocol based on the official Berkeley version. +That protocol comes from [berkeley caffe](https://github.com/BVLC/caffe/tree/master/src/caffe/proto 'Berkeley Caffe'), commit hash 21d0608. RKNN Toolkit2 has added some OPs on top of it. +Based on this protocol, the list of Caffe OPs supported by RKNN Toolkit2 is as follows: + +| **Operators** | **Remarks** | +| ---------------------- | ------------------------------------------------------------------------------------------------------------- | +| BatchNorm | same as onnx BatchNormalization | +| bn (BatchNorm + Scale) | same as onnx BatchNormalization according to https://github.com/TimoSaemann/caffe-segnet-cudnn5 | +| BNLL | | +| Concat | same as onnx Concat | +| Convolution | same as onnx Conv | +| ConvolutionDepthwise | kernel height/width: [1, 8]
others same as onnx Conv | +| Crop | | +| Deconvolution | same as onnx ConvTranspose | +| Dropout | | +| Eltwise | | +| Flatten | | +| HardSigmoid | | +| InnerProduct | same as onnx Gemm | +| LRN | same as onnx LRN | +| Lstm | same as onnx LSTM according to https://github.com/xmfbit/warpctc-caffe | +| Normalize | | +| Permute | same as onnx Transpose | +| Power | | +| Pooling | same as onnx pooling | +| PRelu | same as onnx PRelu | +| Proposal | batch: 1 | +| Reduction | output dims <= 4 | +| Relu | same as onnx Relu | +| Relu6 | same as onnx Clip | +| Reorg | | +| Reshape | same as onnx Reshape | +| Resize | bilinear; nearest | +| Reverse | | +| ROIPooling | same as onnx MaxRoiPool according to https://github.com/twmht/caffe-pva-faster-rcnn | +| Scale | same as onnx Mul | +| Sigmoid | same as onnx Sigmoid | +| Slice | same as onnx Split | +| Softmax | same as onnx Softmax | +| Split | same as onnx Slice | +| TanH | same as onnx Tanh | +| Tile | same as onnx Tile | +| Transpose | same as onnx Transpose | +| Upsample | according to https://github.com/SeanQ88/caffe_upsample and https://github.com/TimoSaemann/caffe-segnet-cudnn5 | + + +## TensorFlow OPs supported by RKNN Toolkit2 + +RKNN Toolkit2 supports pb files (containing the OPs below) generated by TensorFlow versions 1.12 - 1.15 (1.x) and 2.3 - 2.5 (2.x). For more information on TensorFlow version compatibility, please refer to [tensorflow official instructions on OP version](https://www.tensorflow.org/guide/versions 'Tensorflow official instructions on OP version'). +The list of TensorFlow OPs supported by RKNN Toolkit2 is as follows: + +| **Operators** | **Remarks** | +| --------------------- | --------------------------------------------------------- | +| Add | same as onnx Add | +| AvgPool | same as onnx AveragePool | +| Concat | same as onnx Concat | +| Conv2D | same as onnx Conv | +| DepthToSpace | | +| DepthwiseConv2d | kernel height/width: [1, 8]
others same as onnx Conv | +| Div | same as onnx Div | +| Dropout | | +| Flatten | | +| LeakyRelu | same as onnx LeakyRelu | +| Less | same as onnx Less | +| LRN | | +| MatMul | | +| MaxPool | same as onnx MaxPool | +| Mean | output dims <= 4 | +| Pad | same as onnx Pad | +| Relu | same as onnx Relu | +| Reshape | | +| ResizeBilinear | | +| ResizeNearestNeighbor | | +| Sigmoid | | +| Slice | | +| Softmax | | +| Softplus | same as onnx Softplus | +| SpaceToDepth | | +| Split | | +| Squeeze | | +| StridedSlice | | +| Tanh | same as onnx Tanh | +| Transpose | | + +## Darknet OPs supported by RKNN Toolkit2 +The list of Darknet OPs supported by RKNN Toolkit2 is as follows: + +| **Operators** | **Remarks** | +| ----------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| add | same as onnx Add | +| batchnormalize | same as onnx BatchNormalization | +| concat | same as onnx Concat | +| convolutional | same as onnx Conv | +| depthwise_convolutional | kernel height/width: [1, 8]
others same as onnx Conv | +| fullconnect | | +| leakyrelu | same as onnx LeakyRelu | +| mish | | +| pooling | **AveragePool**: same as onnx AveragePool
**GlobalAveragePool**: same as onnx GlobalAveragePool
**MaxPool/GlobalMaxPool**: same as onnx MaxPool/GlobalMaxPool | +| route | | +| shortcut | | +| softmax | | +| upsampling | | \ No newline at end of file diff --git a/doc/RKNN_Compiler_Support_Operator_List_v1.4.0.pdf b/doc/RKNN_Compiler_Support_Operator_List_v1.4.0.pdf deleted file mode 100644 index dfa2f9f..0000000 Binary files a/doc/RKNN_Compiler_Support_Operator_List_v1.4.0.pdf and /dev/null differ diff --git a/doc/RKNN_Compiler_Support_Operator_List_v1.5.0.pdf b/doc/RKNN_Compiler_Support_Operator_List_v1.5.0.pdf new file mode 100644 index 0000000..7aeb0a6 Binary files /dev/null and b/doc/RKNN_Compiler_Support_Operator_List_v1.5.0.pdf differ diff --git a/doc/Rockchip_Quick_Start_RKNN_Toolkit2_CN-1.4.0.pdf b/doc/Rockchip_Quick_Start_RKNN_Toolkit2_CN-1.4.0.pdf deleted file mode 100644 index 37d887b..0000000 Binary files a/doc/Rockchip_Quick_Start_RKNN_Toolkit2_CN-1.4.0.pdf and /dev/null differ diff --git a/doc/Rockchip_Quick_Start_RKNN_Toolkit2_CN-1.5.0.pdf b/doc/Rockchip_Quick_Start_RKNN_Toolkit2_CN-1.5.0.pdf new file mode 100644 index 0000000..cc7a7b5 Binary files /dev/null and b/doc/Rockchip_Quick_Start_RKNN_Toolkit2_CN-1.5.0.pdf differ diff --git a/doc/Rockchip_Quick_Start_RKNN_Toolkit2_EN-1.4.0.pdf b/doc/Rockchip_Quick_Start_RKNN_Toolkit2_EN-1.5.0.pdf similarity index 59% rename from doc/Rockchip_Quick_Start_RKNN_Toolkit2_EN-1.4.0.pdf rename to doc/Rockchip_Quick_Start_RKNN_Toolkit2_EN-1.5.0.pdf index 61392ec..e7e7c95 100644 Binary files a/doc/Rockchip_Quick_Start_RKNN_Toolkit2_EN-1.4.0.pdf and b/doc/Rockchip_Quick_Start_RKNN_Toolkit2_EN-1.5.0.pdf differ diff --git a/doc/Rockchip_Trouble_Shooting_RKNN_Toolkit2_CN-1.5.0.pdf b/doc/Rockchip_Trouble_Shooting_RKNN_Toolkit2_CN-1.5.0.pdf new file mode 100644 index 0000000..f232789 Binary files /dev/null and b/doc/Rockchip_Trouble_Shooting_RKNN_Toolkit2_CN-1.5.0.pdf differ diff --git a/doc/Rockchip_Trouble_Shooting_RKNN_Toolkit2_EN-1.5.0.pdf b/doc/Rockchip_Trouble_Shooting_RKNN_Toolkit2_EN-1.5.0.pdf new file mode 
100644 index 0000000..1ed195e Binary files /dev/null and b/doc/Rockchip_Trouble_Shooting_RKNN_Toolkit2_EN-1.5.0.pdf differ diff --git a/doc/Rockchip_User_Guide_RKNN_Toolkit2_CN-1.4.0.pdf b/doc/Rockchip_User_Guide_RKNN_Toolkit2_CN-1.4.0.pdf deleted file mode 100644 index b921d7f..0000000 Binary files a/doc/Rockchip_User_Guide_RKNN_Toolkit2_CN-1.4.0.pdf and /dev/null differ diff --git a/doc/Rockchip_User_Guide_RKNN_Toolkit2_CN-1.5.0.pdf b/doc/Rockchip_User_Guide_RKNN_Toolkit2_CN-1.5.0.pdf new file mode 100644 index 0000000..ad479af Binary files /dev/null and b/doc/Rockchip_User_Guide_RKNN_Toolkit2_CN-1.5.0.pdf differ diff --git a/doc/Rockchip_User_Guide_RKNN_Toolkit2_EN-1.4.0.pdf b/doc/Rockchip_User_Guide_RKNN_Toolkit2_EN-1.4.0.pdf deleted file mode 100644 index a18f99b..0000000 Binary files a/doc/Rockchip_User_Guide_RKNN_Toolkit2_EN-1.4.0.pdf and /dev/null differ diff --git a/doc/Rockchip_User_Guide_RKNN_Toolkit2_EN-1.5.0.pdf b/doc/Rockchip_User_Guide_RKNN_Toolkit2_EN-1.5.0.pdf new file mode 100644 index 0000000..fbcfbe7 Binary files /dev/null and b/doc/Rockchip_User_Guide_RKNN_Toolkit2_EN-1.5.0.pdf differ diff --git a/doc/changelog-1.4.0.txt b/doc/changelog-1.4.0.txt deleted file mode 100644 index a0c554f..0000000 --- a/doc/changelog-1.4.0.txt +++ /dev/null @@ -1,123 +0,0 @@ -2022-8-20 -Version: v1.4.0: -Updates: -1. Upgrade related dependency packages to mainstream versions -2. Add support for more 2/3/5-dimensional Ops -3. Update interfaces such as config/init_runtime -4. Update support for Ops such as LSTM -5. Add YUV input support -6. Update QAT model support - -2022-7-2 -Version: v1.3.4b5: -Updates: -1. rknn-toolkit2: - 1) optimize_onnx interface - a. Disable conv+add fusion when optimization_level=2 is set. - b. Keep the quantization parameters carried by the BatchNormalize operator. - 2) RK3588 disables support for direct NHWC layout output from the NPU; RK3566/RV1106 keep this feature. -2. C API: - 1) RK3588/RK3566/RV1106 support passing the path of a large file that contains an rknn model; the rknn_init interface takes a pointer to an rknn_init_extend struct holding the offset and the actual rknn model size. - - -2021-4-22 -Version: v1.3.0: -Updates: -1. New feature: python3.8/ubuntu20.04 platform support -2. Fix some known bugs: - 1) Fixed some graph optimization and quantization bugs - -2021-4-7 -Version: v1.2.5: -Updates: -1. New feature: support for the rv1103/rv1109 platforms. -2. 
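Fix some known bugs: - 1) Fixed some QAT model conversion issues - 2) Fixed some graph optimization bugs - - -2021-1-27 -Version: v1.2.1-beta: -Updates: -1. New feature: for multi-batch NHWC input, when the number of valid elements in the H dimension differs from the number of elements actually in memory, the actual element count along H can be set via h_stride. -2. Fix some known bugs: - 1) Duplicate internal variable names in the LSTM operator. - - -2021-1-12 -Version: v1.2.0 -Updates: -1. New features: rk3588 platform support; rknn model encryption support; support for tensorflow/tflite/pytorch quantization-aware models; support for some new ops: InstanceNormalization, Swish, Conv1D, etc. (see the op support list); added parameter count calculation and compute analysis -2. examples updates: added a pytorch-to-onnx conversion demo: resnet18_export_onnx; added a demo for loading pytorch quantization-aware models: resnet18_qat demo; added model encryption: added an rknn conversion demo for the 3588 platform -3. Interface changes: removed some unnecessary parameters from config, load_caffe, load_tensorflow and other interfaces, and updated the eval_perf interface; see the User_Guide document for details -4. Fix some known bugs: - 1) Fixed some models failing to convert to rknn - 2) Fixed some graph optimization bugs - 3) Fixed incorrect inference results for some models - 4) Fixed conversion failures for some pytorch and tflite ops -5. Optimization: reduced accuracy analysis time; reduced model conversion and quantization time - - -2021-8-12 -Version: v1.1.0 -Updates: -1. New features: LSTM and GRU support; added accuracy_analysis comparison items; added support for some ops: caffe hardswish; onnx gather, reduceMax, etc.; updated to more comprehensive graph optimization rules. -2. examples updates: added a yolov5 demo -3. Fix some known bugs: - 1) Fixed incorrect inference results in some simulator cases - 2) Fixed some graph optimization bugs - 3) Fixed some large models failing to convert to rknn - 4) Fixed multi-input conversion and inference bugs -4. Updated the documentation and the OP support list - -2021-6-30 -Version: v1.1.0beta -Updates: -1. New features: hybrid quantization (supports choosing whether to quantize and modifying quantization parameters); improved the accuracy_analysis comparison feature (including on-board comparison results) -2. examples updates: added demos for commonly used interfaces: accuracy_analysis, batch_size, hybrid_quant, load_quantized_model, mmse, multi_input_test -3. Fix some known bugs: - 1) Fixed conversion issues and op accuracy issues for some int8/fp16 models - 2) Fixed some graph optimization bugs and some dependency version issues -4. Updated the documentation and the OP support list - - -2021-4-30 -Version: v1.0.0 -Updates: -1. New features: per-channel quantization for convolution ops; added the custom_inf model information setting and the img_quant_RGB2BGR setting in config; added the eval performance interface for performance testing; added version printing -2. OP support: 1) Added new Caffe OPs: Power/Tile/Eltwise(Max), and removed the dimension restriction on normalize; 2) Added new onnx OPs: HardSigmoid/Pow/Tile -3. Fix some known bugs: - 1) Fixed the wrong output shape and name of caffe FC - 2) Improved mmse quantization performance - 3) Fixed the output shape calculation of the caffe Pooling layer - 4) Fixed an inference bug where caffe slice dropped one of its outputs - 5) Fixed some model optimization bugs -4. Deprecated the reorder_channel config setting; users must ensure the channel order of the inference input data themselves -5. Updated the documentation and the OP support list - - -2021-4-2 -Version: v0.7.0 -Updates: -1. New features: support for a new quantization algorithm (mmse); added import of tensorflow pre-quantized models -2. 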
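Added new Caffe OPs: relu6/ConvolutionDepthwise/Transpose/reorg -3. Fix some known bugs: - 1) Added support for concat on non-channel dimensions and non-4D inputs - 2) Fixed a preprocessing bug when the first layer is scale - 3) Updated to onnxruntime==1.7.0 -4. Updated the documentation and the OP support list - -2021-3-1 -Version: v0.6.0 -Updates: -1. New features: the caffe load API can now specify input names; added support for caffe lrn (WithinChannel) -2. Added new Caffe OPs: crop/flatten/normalize/proposal/reduction -3. Added new onnx/pytorch/tensorflow/darknet/tflite OPs -4. Removed the aciq and KL divergence quantization features -5. Fix some known bugs: - 1) Conversion bug when the last layer is reshape; - 2) Fixed a bug where InnerProduct in caffe generated random blobs; - 3) Fixed GlobalAvgPool/GlobalMaxPool crashing on overly large sizes; - 4) Fixed wrong dimensions when the first layer is RoIpooling; - 5) Fixed incorrect on-device inference for SSD, among others. -6. Updated the documentation and added the OP support list diff --git a/doc/changelog-1.5.0.txt b/doc/changelog-1.5.0.txt new file mode 100644 index 0000000..67f8fc1 --- /dev/null +++ b/doc/changelog-1.5.0.txt @@ -0,0 +1,457 @@ +2023-5-18 +Version: v1.5.0: +Updates: +1. Update the interface definition of config.dynamic_input +2. Fix failures when reading some op attributes + +2023-5-17 +Version: v1.4.6b2: +Updates: +1. Fix a bug where multi-batch rknn models failed to run through the C API on RK3562. + +2023-5-15 +Version: v1.4.6b1: +Updates: +1. Fix a dynamic shape bug for multi-input, multi-output models with the normal API +2. Add RK3562 Matmul API support +3. Fix dynamic_input failing when the first layer is Reshape +4. Fix possible issues with opset12~15 + +2023-5-11 +Version: v1.4.6b0: +Updates: +1. Improve initialization efficiency with RKNN_FLAG_COLLECT_MODEL_INFO_ONLY +2. Fix a conversion bug in the Add operator with two 1x1x1x1 features +3. Fix a compatibility issue where load_rknn failed to load models from older versions +4. Add partial support for opset13/14/15 (experimental) +5. Fix a possible error when eval_perf exports csv +6. Fix load_rknn errors +7. Add support for non-4D ReduceXXX + +2023-5-6 +Version: v1.4.5b3: +Updates: +1. Add the RKNN_MIN_TIMEOUT_MS environment variable to set the timeout threshold for NPU task submission +2. Add support for 1D Where +3. Fix errors for large models containing Constant nodes +4. Improve weight sparsification performance + +2023-4-28 +Version: v1.4.5b2: +Updates: +1. Fix incorrect dynamic_input results with the normal api +2. Fix on-board inference errors for non-4D inputs + +2023-4-27 +Version: v1.4.5b1: +Updates: +1. Fix output shape errors in on-board inference with dynamic_input +2. Add rules to eliminate transposes before and after matmul, and improve matmul performance +3. Fix compilation errors for large models +4. Add dynamic_input support to load_rknn +5. Fix resize errors during code generation + +2023-4-26 +Version: v1.4.5b0: +Updates: +1. [RK3562] Improve performance of cascaded transpose/reshape operators in Transformer models +2. Add support for pytorch files with the .torchscript suffix + +2023-4-25 +Version: v1.4.4b5: +Updates: +1. Fix dynamic_input inference errors when Reshape is present +2. Add multi-axis dynamic support to dynamic_input +3. 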
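Update the cpp deployment code generation feature + +2023-4-23 +Version: v1.4.4b3: +Updates: +1. Add the dynamic_input feature +2. Fix errors with 3D deconv +3. Update large-model conversion support +4. Improve simulator inference performance +5. Add the cpp deployment code generation feature +6. Fix load_rknn inference issues + +2023-4-14 +Version: v1.4.3b12: +Updates: +1. [RK3562] Add the ability to run specified layers on CPU/GPU/NPU. +2. Fix concat optimization rules +3. Add the op_target feature + +2023-4-11 +Version: v1.4.3b10: +Updates: +1. Update the rknn compiler + +2023-4-10 +Version: v1.4.3b9: +Updates: +1. Update tensorflow QAT support +2. Improve memory usage and performance of large-model conversion +3. Fix graph optimization issues and add some new rules +4. Fix mmse errors +5. Improve conv splitting rules +6. Fix hybrid quantization issues +7. Add RMSNorm support +8. Add the input_initial_val parameter to load_onnx +9. Fix onnxoptimizer errors + +2023-3-28 +Version: v1.4.3b4: +Updates: +1. Fix an issue with 5D slice + +2023-3-27 +Version: v1.4.3b3: +Updates: +1. [RK3566] Reduce memory usage of models with a CNN+LSTM structure. +2. Improve Concat performance +3. Add the input_is_nchw parameter to load_tflite/load_tensorflow + +2023-3-23 +Version: v1.4.3b2: +Updates: +1. Optimize the mul/add/div/sub operators. +2. Fix a quantization issue with multi-level maxpool + +2023-3-21 +Version: v1.4.3b1: +Updates: +1. Fix an Expand operator bug + +2023-3-21 +Version: v1.4.3b0: +Updates: +1. [RK3562] Add internal buffer recycling +2. [RK3562] Improve accuracy of the multi-batch layerNorm operator +3. [RK3566] Optimize the int8 Matmul CPU operator +4. [All platforms] Support expand as an NPU OP +5. [All platforms] Reduce input time for fp16 models +6. Improve Cast operator support +7. Fix bugs such as remove_weight and mismatched normalization parameters with multiple inputs +8. Update constant folding support +9. Update the eval_perf feature +10. Add float16 model support +11. Optimize models with shared constants + +2023-3-9 +Version: v1.4.2b6: +Updates: +1. Bug fixes for the RK3562 platform. +2. Add the model_pruning control with deconv support, plus bug fixes +3. Add partial conversion support for If/Loop +4. Fix MMSE failing on some models +5. Improve simulator results +6. Add python3.10 support +7. Reduce memory usage during conversion +8. Add support for some non-4D Ops + +2023-2-15 +Version: v1.4.2b1: +Updates: +1. Fix the wrong size_with_stride returned by queries on RK3562 + +2023-2-14 +Version: v1.4.2b0: +Updates: +1. Update neg support +2. Add fusion optimization for min/max +3. Add RK3562 platform support + +2023-2-8 +Version: v1.4.1b23: +Updates: +1. Fix a bug in the deconvolution operator with specific strides. +2. Update perchannel quantization support for MatMul +3. Update dynamic graph detection +4. Improve quantization support for where + +2023-2-2 +Version: v1.4.1b22: +Updates: +1. Add Bool type support to the Equal operator. +2. Fix bugs in the Matmul and exLayerNorm operators. +3. Update equal/slice/cast/pad/ConvTranspose support +4. Update QAT model support +5. Remove the bfloat16 package dependency + +2023-1-13 +Version: v1.4.1b21: +Updates: +1. Fix an RK3588 Matmul interface error. +2. Fix the wrong virtual width returned by queries for 4-channel-input float16 models on the RK356X platform. +3. When a model provides no quantization information, the default Tensor quantization type is float16. +4. Add support for the invalid shape unk__xxx +5. Update abs/dataconvert support +6. 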
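Improve the model pruning feature + +2023-1-6 +Version: v1.4.1b19: +Updates: +1. [Feature] Add Conv+Add+Relu subgraph fusion. +2. Fix a bug where Conv+Add was fused despite mismatched quantization parameters. +3. Fix a bug with large-kernel convolution on RK3588. +4. Add the model pruning feature +5. Improve quantization parameters for Sigmoid +6. Add rk3562 support + +2022-12-17 +Version: v1.4.1b17: +Updates: +1. [Optimization] Add support for NCHW data output from the NPU. +2. [Feature] Add conv+add+relu fusion support. +3. Fix a MaxPool operator bug for models whose highest dimension is not 1. +4. Fix a first-layer Conv bug for models whose highest dimension is not 1. +5. Change the layout definition of 4D npy +6. Improve dataconvert/gather/transpose/mul/maxpool/sigmoid/pad/conv/relu/softmax support +7. Add aten::upsample_nearest2d support +8. Fix a possible overflow in the simulator under perchannel +9. Add more conversion error messages +10. Update hybrid quantization support + +2022-11-26 +Version: v1.4.1b14: +Updates: +1. Fix the register bit-width limit warning. +2. Improve efficiency of the Concat CPU operator. +3. Add 2D layernorm support +4. Update MatMul support + +2022-11-19 +Version: v1.4.1b13: +Updates: +1. [Important] The Android NDK compiler is upgraded to r23b; rebuilding apps with this NDK version is recommended. +2. The LSTM structure has been updated; models need to be reconverted. +3. Add Transpose optimization on RK356X. +4. Improve efficiency of float NCHW output with non-aligned channels for RK356X models. +5. Add removal of constant output nodes +6. MMSE supports models whose batch dimension cannot be expanded +7. Fix missing attributes for resize/clip +8. Add optimizations for swish/dataconvert/softmax/lstm/layernorm +9. Add outlier detection +10. Improve performance of non-4D OPs + +2022-11-01 +Version: v1.4.1b12: +Updates: +1. Fix inconsistent results across repeated conversions of LSTM models. +2. Improve the onnx model trimming feature + +2022-10-29 +Version: v1.4.1b11: +Updates: +1. Fix LSTM errors when running through the Runtime external memory allocation interface. +2. Fix LSTM errors when running through the Runtime rknn_dup_context interface. +3. Improve large-model conversion performance +4. Add Loop/Scatter conversion support + +2022-10-24 +Version: v1.4.1b10: +Updates: +1. Fix LSTM compatibility issues. +2. Fix a bug where repeated runs failed on RK3588 when the input virtual width was filled automatically. +3. Fix cache flush failures for intermediate tensors with size=0 (models must be regenerated). +4. Add non-4D support for IN and Swish +5. Add support for perchannel QAT models in tflite + +2022-10-19 +Version: v1.4.1b9: +Updates: +1. Fix a memory leak in the rknn_destroy interface on RV1106. + +2022-10-18 +Version: v1.4.1b8: +Updates: +1. Fix a bug where rknn_init failed when non-LSTM models shared weights. + +2022-10-17 +Version: v1.4.1b7: +Updates: +1. Fix a bug introduced by the RK3588 branch merge. + +2022-10-17 +Version: v1.4.1b6: +Updates: +1. Fix a bug with high-resolution inputs. +2. Optimize away invalid pads + +2022-10-13 +Version: v1.4.1b5: +Updates: +1. Fix a matmul bug in the 32-bit library. +2. Add the FAQ document +3. Update graph optimization rules +4. Adjust the MatMul quantization method + +2022-10-12 +Version: v1.4.1b4: +Updates: +1. Fix LSTM weight sharing failures. +2. Update graph optimization rules + +2022-10-10 +Version: v1.4.1b3: +Updates: +1. Reduce memory used by LSTM register configuration. +2. Improve the MMSE quantization algorithm +3. Improve the KL quantization algorithm + +2022-9-30 +Version: v1.4.1b2: +Updates: +1. Disable register delta support +2. Add Batchnorm+Relu fusion support +3. Add Neon optimization support to the 32-bit Runtime library. +4. 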
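Improve performance of empty initialization in rknn_init. +5. Update the accuracy analysis feature +6. Fix issues such as hardsigmoid in QAT models +7. Fix lstm/gru graph optimization issues +8. Update graph optimization rules + +2022-9-14 +Version: v1.4.1b1: +Updates: +1. Add register delta support +2. Fix an lstm bug + +2022-9-14 +Version: v1.4.1b0: +Updates: +1. Add the npu_do_output_nhwc option to the rknn.config interface to enable or disable direct NHWC output from the NPU +2. Fix QAT model parsing issues + + +------------------------------------------------------------ +2022-8-20 +Version: v1.4.0: +Updates: +1. Upgrade related dependency packages to mainstream versions +2. Add support for more 2/3/5-dimensional Ops +3. Update interfaces such as config/init_runtime +4. Update support for Ops such as LSTM +5. Add YUV input support +6. Update QAT model support + +2022-7-2 +Version: v1.3.4b5: +Updates: +1. rknn-toolkit2: + 1) optimize_onnx interface + a. Disable conv+add fusion when optimization_level=2 is set. + b. Keep the quantization parameters carried by the BatchNormalize operator. + 2) RK3588 disables support for direct NHWC layout output from the NPU; RK3566/RV1106 keep this feature. +2. C API: + 1) RK3588/RK3566/RV1106 support passing the path of a large file that contains an rknn model; the rknn_init interface takes a pointer to an rknn_init_extend struct holding the offset and the actual rknn model size. + + +------------------------------------------------------------ +2021-4-22 +Version: v1.3.0: +Updates: +1. New feature: python3.8/ubuntu20.04 platform support +2. Fix some known bugs: + 1) Fixed some graph optimization and quantization bugs + +2021-4-7 +Version: v1.2.5: +Updates: +1. New feature: support for the rv1103/rv1109 platforms. +2. Fix some known bugs: + 1) Fixed some QAT model conversion issues + 2) Fixed some graph optimization bugs + + +2021-1-27 +Version: v1.2.1-beta: +Updates: +1. New feature: for multi-batch NHWC input, when the number of valid elements in the H dimension differs from the number of elements actually in memory, the actual element count along H can be set via h_stride. +2. Fix some known bugs: + 1) Duplicate internal variable names in the LSTM operator. + + +------------------------------------------------------------ +2021-1-12 +Version: v1.2.0 +Updates: +1. New features: rk3588 platform support; rknn model encryption support; support for tensorflow/tflite/pytorch quantization-aware models; support for some new ops: InstanceNormalization, Swish, Conv1D, etc. (see the op support list); added parameter count calculation and compute analysis +2. examples updates: added a pytorch-to-onnx conversion demo: resnet18_export_onnx; added a demo for loading pytorch quantization-aware models: resnet18_qat demo; added model encryption: added an rknn conversion demo for the 3588 platform +3. Interface changes: removed some unnecessary parameters from config, load_caffe, load_tensorflow and other interfaces, and updated the eval_perf interface; see the User_Guide document for details +4. Fix some known bugs: + 1) Fixed some models failing to convert to rknn + 2) Fixed some graph optimization bugs + 3) Fixed incorrect inference results for some models + 4) Fixed conversion failures for some pytorch and tflite ops +5. Optimization: reduced accuracy analysis time; reduced model conversion and quantization time + + +------------------------------------------------------------ +2021-8-12 +Version: v1.1.0 +Updates: +1. 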
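New features: LSTM and GRU support; added accuracy_analysis comparison items; added support for some ops: caffe hardswish; onnx gather, reduceMax, etc.; updated to more comprehensive graph optimization rules. +2. examples updates: added a yolov5 demo +3. Fix some known bugs: + 1) Fixed incorrect inference results in some simulator cases + 2) Fixed some graph optimization bugs + 3) Fixed some large models failing to convert to rknn + 4) Fixed multi-input conversion and inference bugs +4. Updated the documentation and the OP support list + +2021-6-30 +Version: v1.1.0beta +Updates: +1. New features: hybrid quantization (supports choosing whether to quantize and modifying quantization parameters); improved the accuracy_analysis comparison feature (including on-board comparison results) +2. examples updates: added demos for commonly used interfaces: accuracy_analysis, batch_size, hybrid_quant, load_quantized_model, mmse, multi_input_test +3. Fix some known bugs: + 1) Fixed conversion issues and op accuracy issues for some int8/fp16 models + 2) Fixed some graph optimization bugs and some dependency version issues +4. Updated the documentation and the OP support list + + +------------------------------------------------------------ +2021-4-30 +Version: v1.0.0 +Updates: +1. New features: per-channel quantization for convolution ops; added the custom_inf model information setting and the img_quant_RGB2BGR setting in config; added the eval performance interface for performance testing; added version printing +2. OP support: 1) Added new Caffe OPs: Power/Tile/Eltwise(Max), and removed the dimension restriction on normalize; 2) Added new onnx OPs: HardSigmoid/Pow/Tile +3. Fix some known bugs: + 1) Fixed the wrong output shape and name of caffe FC + 2) Improved mmse quantization performance + 3) Fixed the output shape calculation of the caffe Pooling layer + 4) Fixed an inference bug where caffe slice dropped one of its outputs + 5) Fixed some model optimization bugs +4. Deprecated the reorder_channel config setting; users must ensure the channel order of the inference input data themselves +5. Updated the documentation and the OP support list + + +------------------------------------------------------------ +2021-4-2 +Version: v0.7.0 +Updates: +1. New features: support for a new quantization algorithm (mmse); added import of tensorflow pre-quantized models +2. Added new Caffe OPs: relu6/ConvolutionDepthwise/Transpose/reorg +3. Fix some known bugs: + 1) Added support for concat on non-channel dimensions and non-4D inputs + 2) Fixed a preprocessing bug when the first layer is scale + 3) Updated to onnxruntime==1.7.0 +4. Updated the documentation and the OP support list + + +------------------------------------------------------------ +2021-3-1 +Version: v0.6.0 +Updates: +1. New features: the caffe load API can now specify input names; added support for caffe lrn (WithinChannel) +2. Added new Caffe OPs: crop/flatten/normalize/proposal/reduction +3. Added new onnx/pytorch/tensorflow/darknet/tflite OPs +4. Removed the aciq and KL divergence quantization features +5. Fix some known bugs: + 1) Conversion bug when the last layer is reshape; + 2) Fixed a bug where InnerProduct in caffe generated random blobs; + 3) Fixed GlobalAvgPool/GlobalMaxPool crashing on overly large sizes; + 4) Fixed wrong dimensions when the first layer is RoIpooling; + 5) Fixed incorrect on-device inference for SSD, among others. +6. 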
Updated the documentation and added the OP support list diff --git a/doc/requirements_cp310-1.5.0.txt b/doc/requirements_cp310-1.5.0.txt new file mode 100644 index 0000000..7c4612e --- /dev/null +++ b/doc/requirements_cp310-1.5.0.txt @@ -0,0 +1,23 @@ +# if install failed, please change the pip source to 'https://mirror.baidu.com/pypi/simple' + +# base deps +numpy==1.23.4 +protobuf==3.20.3 +flatbuffers==2.0 + +# utils +requests==2.28.1 +psutil==5.9.0 +ruamel.yaml==0.17.21 +scipy==1.9.3 +tqdm==4.64.1 +opencv-python==4.5.5.64 +fast-histogram==0.11 + +# base +onnx==1.13.1 +onnxoptimizer==0.3.8 +onnxruntime==1.14.1 +torch==1.13.1 +torchvision==0.14.1 +tensorflow==2.8.0 \ No newline at end of file diff --git a/doc/requirements_cp36-1.4.0.txt b/doc/requirements_cp36-1.5.0.txt similarity index 91% rename from doc/requirements_cp36-1.4.0.txt rename to doc/requirements_cp36-1.5.0.txt index ae8cae8..3359cfe 100644 --- a/doc/requirements_cp36-1.4.0.txt +++ b/doc/requirements_cp36-1.5.0.txt @@ -11,11 +11,11 @@ psutil==5.9.0 ruamel.yaml==0.17.4 scipy==1.5.4 tqdm==4.64.0 -bfloat16==1.1 opencv-python==4.5.5.64 +fast-histogram==0.11 # base -onnx==1.9.0 +onnx==1.10.0 onnxoptimizer==0.2.7 onnxruntime==1.10.0 torch==1.10.1 diff --git a/doc/requirements_cp38-1.4.0.txt b/doc/requirements_cp38-1.5.0.txt similarity index 91% rename from doc/requirements_cp38-1.4.0.txt rename to doc/requirements_cp38-1.5.0.txt index ae8cae8..3359cfe 100644 --- a/doc/requirements_cp38-1.4.0.txt +++ b/doc/requirements_cp38-1.5.0.txt @@ -11,11 +11,11 @@ psutil==5.9.0 ruamel.yaml==0.17.4 scipy==1.5.4 tqdm==4.64.0 -bfloat16==1.1 opencv-python==4.5.5.64 +fast-histogram==0.11 # base -onnx==1.9.0 +onnx==1.10.0 onnxoptimizer==0.2.7 onnxruntime==1.10.0 torch==1.10.1 diff --git a/docker/docker_file/ubuntu_18_04_cp36/Dockerfile_ubuntu_18_04_for_cp36 b/docker/docker_file/ubuntu_18_04_cp36/Dockerfile_ubuntu_18_04_for_cp36 new file mode 100644 index 0000000..fea7658 --- /dev/null +++ b/docker/docker_file/ubuntu_18_04_cp36/Dockerfile_ubuntu_18_04_for_cp36 @@ 
-0,0 +1,26 @@ +FROM ubuntu:18.04 + +COPY sources_bionic.list /etc/apt/sources.list + +ENV DEBIAN_FRONTEND=noninteractive + +RUN apt-get update \ + && apt-get install -y python3 python3-dev python3-pip gcc vim libprotobuf-dev zlib1g zlib1g-dev libsm6 \ + && apt-get install -y libgl1 libglib2.0-0 android-tools-adb + +RUN cd /usr/bin \ + && ln -sfn idle3 idle \ + && ln -sfn pydoc3 pydoc \ + && ln -sfn python3 python \ + && ln -sfn python3-config python-config \ + && ln -sfn pip3 pip \ + && ls -al + +RUN python -m pip install --upgrade pip -i https://mirror.baidu.com/pypi/simple --trusted-host=mirror.baidu.com +RUN pip3 config set global.index-url https://mirror.baidu.com/pypi/simple +RUN pip3 config set install.trusted-host mirror.baidu.com + +RUN python3 --version +RUN pip3 --version +COPY rknn_toolkit2-1.5.0+1fa95b5c-cp36-cp36m-linux_x86_64.whl rknn_toolkit2-1.5.0+1fa95b5c-cp36-cp36m-linux_x86_64.whl +RUN pip3 install rknn_toolkit2-1.5.0+1fa95b5c-cp36-cp36m-linux_x86_64.whl diff --git a/packages/rknn_toolkit2-1.4.0_22dcfef4-cp36-cp36m-linux_x86_64.whl b/docker/docker_file/ubuntu_18_04_cp36/rknn_toolkit2-1.5.0+1fa95b5c-cp36-cp36m-linux_x86_64.whl similarity index 83% rename from packages/rknn_toolkit2-1.4.0_22dcfef4-cp36-cp36m-linux_x86_64.whl rename to docker/docker_file/ubuntu_18_04_cp36/rknn_toolkit2-1.5.0+1fa95b5c-cp36-cp36m-linux_x86_64.whl index 66bcb19..6c4843b 100644 Binary files a/packages/rknn_toolkit2-1.4.0_22dcfef4-cp36-cp36m-linux_x86_64.whl and b/docker/docker_file/ubuntu_18_04_cp36/rknn_toolkit2-1.5.0+1fa95b5c-cp36-cp36m-linux_x86_64.whl differ diff --git a/docker/docker_file/ubuntu_18_04_cp36/sources_bionic.list b/docker/docker_file/ubuntu_18_04_cp36/sources_bionic.list new file mode 100644 index 0000000..43db986 --- /dev/null +++ b/docker/docker_file/ubuntu_18_04_cp36/sources_bionic.list @@ -0,0 +1,13 @@ +# Source-code mirrors are commented out by default to speed up apt update; uncomment them if needed +deb http://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic main restricted universe multiverse +# deb-src 
http://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic main restricted universe multiverse +deb http://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-updates main restricted universe multiverse +# deb-src http://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-updates main restricted universe multiverse +deb http://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-backports main restricted universe multiverse +# deb-src http://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-backports main restricted universe multiverse +deb http://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-security main restricted universe multiverse +# deb-src http://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-security main restricted universe multiverse + +# Pre-release software sources; enabling them is not recommended +# deb http://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-proposed main restricted universe multiverse +# deb-src http://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-proposed main restricted universe multiverse \ No newline at end of file diff --git a/examples/functions/board_test/test.py b/examples/functions/board_test/test.py index 4185edb..769ec22 100644 --- a/examples/functions/board_test/test.py +++ b/examples/functions/board_test/test.py @@ -89,7 +89,7 @@ def show_outputs(outputs): # eval perf print('--> Eval perf') - rknn.eval_perf(inputs=[img]) + rknn.eval_perf() # eval perf print('--> Eval memory') diff --git a/examples/functions/dynamic_input/dog_224x224.jpg b/examples/functions/dynamic_input/dog_224x224.jpg new file mode 100644 index 0000000..4f46457 Binary files /dev/null and b/examples/functions/dynamic_input/dog_224x224.jpg differ diff --git a/examples/functions/dynamic_input/test.py b/examples/functions/dynamic_input/test.py new file mode 100644 index 0000000..3dc9798 --- /dev/null +++ b/examples/functions/dynamic_input/test.py @@ -0,0 +1,101 @@ +import numpy as np +import cv2 +from rknn.api import RKNN + + +def show_outputs(outputs): + output_ = outputs[0].reshape((-1, 1000)) + for output in output_: + output_sorted = sorted(output, 
reverse=True) + top5_str = 'mobilenet_v1\n-----TOP 5-----\n' + for i in range(5): + value = output_sorted[i] + index = np.where(output == value) + for j in range(len(index)): + if (i + j) >= 5: + break + if value > 0: + topi = '{}: {}\n'.format(index[j], value) + else: + topi = '-1: 0.0\n' + top5_str += topi + print(top5_str) + + +def show_perfs(perfs): + perfs = 'perfs: {}\n'.format(perfs) + print(perfs) + + +if __name__ == '__main__': + + # Create RKNN object + rknn = RKNN(verbose=True) + + # Multiple sets of input shapes specified by the user, to simulate dynamic input. + # Make sure the model supports dynamic shapes when enabling 'config.dynamic_input', and that the shapes in dynamic_input are correct! + dynamic_input = [ + [[1,3,224,224]], # set 0: [input0_224] + [[1,3,192,192]], # set 1: [input0_192] + [[1,3,160,160]], # set 2: [input0_160] + ] + + # Pre-process config + print('--> Config model') + rknn.config(mean_values=[103.94, 116.78, 123.68], std_values=[58.82, 58.82, 58.82], quant_img_RGB2BGR=True, dynamic_input=dynamic_input) + print('done') + + # Load model + print('--> Loading model') + ret = rknn.load_caffe(model='../../caffe/mobilenet_v2/mobilenet_v2.prototxt', + blobs='../../caffe/mobilenet_v2/mobilenet_v2.caffemodel') + if ret != 0: + print('Load model failed!') + exit(ret) + print('done') + + # Build model + print('--> Building model') + ret = rknn.build(do_quantization=True, dataset='../../caffe/mobilenet_v2/dataset.txt') + if ret != 0: + print('Build model failed!') + exit(ret) + print('done') + + # Export rknn model + print('--> Export rknn model') + ret = rknn.export_rknn('./mobilenet_v2.rknn') + if ret != 0: + print('Export rknn model failed!') + exit(ret) + print('done') + + # Init runtime environment + print('--> Init runtime environment') + ret = rknn.init_runtime() + if ret != 0: + print('Init runtime environment failed!') + exit(ret) + print('done') + + # Set inputs + img = cv2.imread('./dog_224x224.jpg') + + # Inference + 
print('--> Running model') + img2 = cv2.resize(img, (224,224)) + img2 = np.expand_dims(img2, 0) + img2 = np.transpose(img2, (0,3,1,2)) # [1,3,224,224] + outputs = rknn.inference(inputs=[img2], data_format=['nchw']) + np.save('./functions_dynamic_input_0.npy', outputs[0]) + show_outputs(outputs) + + img3 = cv2.resize(img, (160,160)) + img3 = np.expand_dims(img3, 0) + img3 = np.transpose(img3, (0,3,1,2)) # [1,3,160,160] + outputs = rknn.inference(inputs=[img3], data_format=['nchw']) + np.save('./functions_dynamic_input_1.npy', outputs[0]) + show_outputs(outputs) + print('done') + + rknn.release() diff --git a/examples/functions/mmse/test.py b/examples/functions/mmse/test.py index 72124a4..516736b 100644 --- a/examples/functions/mmse/test.py +++ b/examples/functions/mmse/test.py @@ -61,11 +61,11 @@ def show_outputs(outputs): f = open('./snapshot/error_analysis.txt') lines = f.readlines() - cos = lines[-1].split()[1] - if float(cos) >= 0.965: + cos = lines[-1].split()[2] + if float(cos) >= 0.963: print('cos = {}, mmse work!'.format(cos)) else: - print('cos = {} < 0.965, mmse abnormal!'.format(cos)) + print('cos = {} < 0.963, mmse abnormal!'.format(cos)) f.close() # Set inputs diff --git a/examples/functions/model_pruning/dataset.txt b/examples/functions/model_pruning/dataset.txt new file mode 100644 index 0000000..aa215cb --- /dev/null +++ b/examples/functions/model_pruning/dataset.txt @@ -0,0 +1 @@ +dog_224x224.jpg \ No newline at end of file diff --git a/examples/functions/model_pruning/dog_224x224.jpg b/examples/functions/model_pruning/dog_224x224.jpg new file mode 100644 index 0000000..4f46457 Binary files /dev/null and b/examples/functions/model_pruning/dog_224x224.jpg differ diff --git a/examples/functions/model_pruning/mobilenet.caffemodel b/examples/functions/model_pruning/mobilenet.caffemodel new file mode 100644 index 0000000..132e56b Binary files /dev/null and b/examples/functions/model_pruning/mobilenet.caffemodel differ diff --git 
a/examples/functions/model_pruning/mobilenet_deploy.prototxt b/examples/functions/model_pruning/mobilenet_deploy.prototxt new file mode 100644 index 0000000..ac6ec6b --- /dev/null +++ b/examples/functions/model_pruning/mobilenet_deploy.prototxt @@ -0,0 +1,2002 @@ +name: "MOBILENET" +# transform_param { +# scale: 0.017 +# mirror: false +# crop_size: 224 +# mean_value: [103.94,116.78,123.68] +# } +layer { + name: "data" + type: "Input" + top: "data" + input_param: { shape: { dim: 1 dim: 3 dim: 224 dim: 224 } } +} +#input: "data" +#input_dim: 1 +#input_dim: 3 +#input_dim: 224 +#input_dim: 224 +layer { + name: "conv1" + type: "Convolution" + bottom: "data" + top: "conv1" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 32 + bias_term: false + pad: 1 + kernel_size: 3 + stride: 2 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv1/bn" + type: "BatchNorm" + bottom: "conv1" + top: "conv1" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv1/scale" + type: "Scale" + bottom: "conv1" + top: "conv1" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu1" + type: "ReLU" + bottom: "conv1" + top: "conv1" +} +layer { + name: "conv2_1/dw" + type: "Convolution" + bottom: "conv1" + top: "conv2_1/dw" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 32 + bias_term: false + pad: 1 + kernel_size: 3 + group: 32 + engine: CAFFE + stride: 1 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv2_1/dw/bn" + type: "BatchNorm" + bottom: "conv2_1/dw" + top: "conv2_1/dw" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } 
+ batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv2_1/dw/scale" + type: "Scale" + bottom: "conv2_1/dw" + top: "conv2_1/dw" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu2_1/dw" + type: "ReLU" + bottom: "conv2_1/dw" + top: "conv2_1/dw" +} +layer { + name: "conv2_1/sep" + type: "Convolution" + bottom: "conv2_1/dw" + top: "conv2_1/sep" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 64 + bias_term: false + pad: 0 + kernel_size: 1 + stride: 1 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv2_1/sep/bn" + type: "BatchNorm" + bottom: "conv2_1/sep" + top: "conv2_1/sep" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv2_1/sep/scale" + type: "Scale" + bottom: "conv2_1/sep" + top: "conv2_1/sep" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu2_1/sep" + type: "ReLU" + bottom: "conv2_1/sep" + top: "conv2_1/sep" +} +layer { + name: "conv2_2/dw" + type: "Convolution" + bottom: "conv2_1/sep" + top: "conv2_2/dw" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 64 + bias_term: false + pad: 1 + kernel_size: 3 + group: 64 + engine: CAFFE + stride: 2 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv2_2/dw/bn" + type: "BatchNorm" + bottom: "conv2_2/dw" + top: "conv2_2/dw" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: 
"conv2_2/dw/scale" + type: "Scale" + bottom: "conv2_2/dw" + top: "conv2_2/dw" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu2_2/dw" + type: "ReLU" + bottom: "conv2_2/dw" + top: "conv2_2/dw" +} +layer { + name: "conv2_2/sep" + type: "Convolution" + bottom: "conv2_2/dw" + top: "conv2_2/sep" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 128 + bias_term: false + pad: 0 + kernel_size: 1 + stride: 1 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv2_2/sep/bn" + type: "BatchNorm" + bottom: "conv2_2/sep" + top: "conv2_2/sep" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv2_2/sep/scale" + type: "Scale" + bottom: "conv2_2/sep" + top: "conv2_2/sep" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu2_2/sep" + type: "ReLU" + bottom: "conv2_2/sep" + top: "conv2_2/sep" +} +layer { + name: "conv3_1/dw" + type: "Convolution" + bottom: "conv2_2/sep" + top: "conv3_1/dw" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 128 + bias_term: false + pad: 1 + kernel_size: 3 + group: 128 + engine: CAFFE + stride: 1 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv3_1/dw/bn" + type: "BatchNorm" + bottom: "conv3_1/dw" + top: "conv3_1/dw" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv3_1/dw/scale" + type: "Scale" + bottom: "conv3_1/dw" + top: "conv3_1/dw" + 
param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu3_1/dw" + type: "ReLU" + bottom: "conv3_1/dw" + top: "conv3_1/dw" +} +layer { + name: "conv3_1/sep" + type: "Convolution" + bottom: "conv3_1/dw" + top: "conv3_1/sep" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 128 + bias_term: false + pad: 0 + kernel_size: 1 + stride: 1 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv3_1/sep/bn" + type: "BatchNorm" + bottom: "conv3_1/sep" + top: "conv3_1/sep" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv3_1/sep/scale" + type: "Scale" + bottom: "conv3_1/sep" + top: "conv3_1/sep" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu3_1/sep" + type: "ReLU" + bottom: "conv3_1/sep" + top: "conv3_1/sep" +} +layer { + name: "conv3_2/dw" + type: "Convolution" + bottom: "conv3_1/sep" + top: "conv3_2/dw" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 128 + bias_term: false + pad: 1 + kernel_size: 3 + group: 128 + engine: CAFFE + stride: 2 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv3_2/dw/bn" + type: "BatchNorm" + bottom: "conv3_2/dw" + top: "conv3_2/dw" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv3_2/dw/scale" + type: "Scale" + bottom: "conv3_2/dw" + top: "conv3_2/dw" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 
+ } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu3_2/dw" + type: "ReLU" + bottom: "conv3_2/dw" + top: "conv3_2/dw" +} +layer { + name: "conv3_2/sep" + type: "Convolution" + bottom: "conv3_2/dw" + top: "conv3_2/sep" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 256 + bias_term: false + pad: 0 + kernel_size: 1 + stride: 1 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv3_2/sep/bn" + type: "BatchNorm" + bottom: "conv3_2/sep" + top: "conv3_2/sep" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv3_2/sep/scale" + type: "Scale" + bottom: "conv3_2/sep" + top: "conv3_2/sep" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu3_2/sep" + type: "ReLU" + bottom: "conv3_2/sep" + top: "conv3_2/sep" +} +layer { + name: "conv4_1/dw" + type: "Convolution" + bottom: "conv3_2/sep" + top: "conv4_1/dw" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 256 + bias_term: false + pad: 1 + kernel_size: 3 + group: 256 + engine: CAFFE + stride: 1 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv4_1/dw/bn" + type: "BatchNorm" + bottom: "conv4_1/dw" + top: "conv4_1/dw" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv4_1/dw/scale" + type: "Scale" + bottom: "conv4_1/dw" + top: "conv4_1/dw" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { 
+ value: 0 + } + } +} +layer { + name: "relu4_1/dw" + type: "ReLU" + bottom: "conv4_1/dw" + top: "conv4_1/dw" +} +layer { + name: "conv4_1/sep" + type: "Convolution" + bottom: "conv4_1/dw" + top: "conv4_1/sep" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 256 + bias_term: false + pad: 0 + kernel_size: 1 + stride: 1 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv4_1/sep/bn" + type: "BatchNorm" + bottom: "conv4_1/sep" + top: "conv4_1/sep" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv4_1/sep/scale" + type: "Scale" + bottom: "conv4_1/sep" + top: "conv4_1/sep" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu4_1/sep" + type: "ReLU" + bottom: "conv4_1/sep" + top: "conv4_1/sep" +} +layer { + name: "conv4_2/dw" + type: "Convolution" + bottom: "conv4_1/sep" + top: "conv4_2/dw" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 256 + bias_term: false + pad: 1 + kernel_size: 3 + group: 256 + engine: CAFFE + stride: 2 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv4_2/dw/bn" + type: "BatchNorm" + bottom: "conv4_2/dw" + top: "conv4_2/dw" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv4_2/dw/scale" + type: "Scale" + bottom: "conv4_2/dw" + top: "conv4_2/dw" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu4_2/dw" + type: "ReLU" + bottom: 
"conv4_2/dw" + top: "conv4_2/dw" +} +layer { + name: "conv4_2/sep" + type: "Convolution" + bottom: "conv4_2/dw" + top: "conv4_2/sep" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 512 + bias_term: false + pad: 0 + kernel_size: 1 + stride: 1 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv4_2/sep/bn" + type: "BatchNorm" + bottom: "conv4_2/sep" + top: "conv4_2/sep" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv4_2/sep/scale" + type: "Scale" + bottom: "conv4_2/sep" + top: "conv4_2/sep" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu4_2/sep" + type: "ReLU" + bottom: "conv4_2/sep" + top: "conv4_2/sep" +} +layer { + name: "conv5_1/dw" + type: "Convolution" + bottom: "conv4_2/sep" + top: "conv5_1/dw" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 512 + bias_term: false + pad: 1 + kernel_size: 3 + group: 512 + engine: CAFFE + stride: 1 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv5_1/dw/bn" + type: "BatchNorm" + bottom: "conv5_1/dw" + top: "conv5_1/dw" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv5_1/dw/scale" + type: "Scale" + bottom: "conv5_1/dw" + top: "conv5_1/dw" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu5_1/dw" + type: "ReLU" + bottom: "conv5_1/dw" + top: "conv5_1/dw" +} +layer { + name: "conv5_1/sep" + type: 
"Convolution" + bottom: "conv5_1/dw" + top: "conv5_1/sep" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 512 + bias_term: false + pad: 0 + kernel_size: 1 + stride: 1 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv5_1/sep/bn" + type: "BatchNorm" + bottom: "conv5_1/sep" + top: "conv5_1/sep" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv5_1/sep/scale" + type: "Scale" + bottom: "conv5_1/sep" + top: "conv5_1/sep" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu5_1/sep" + type: "ReLU" + bottom: "conv5_1/sep" + top: "conv5_1/sep" +} +layer { + name: "conv5_2/dw" + type: "Convolution" + bottom: "conv5_1/sep" + top: "conv5_2/dw" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 512 + bias_term: false + pad: 1 + kernel_size: 3 + group: 512 + engine: CAFFE + stride: 1 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv5_2/dw/bn" + type: "BatchNorm" + bottom: "conv5_2/dw" + top: "conv5_2/dw" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv5_2/dw/scale" + type: "Scale" + bottom: "conv5_2/dw" + top: "conv5_2/dw" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu5_2/dw" + type: "ReLU" + bottom: "conv5_2/dw" + top: "conv5_2/dw" +} +layer { + name: "conv5_2/sep" + type: "Convolution" + bottom: "conv5_2/dw" + top: "conv5_2/sep" + param { + lr_mult: 1 
+ decay_mult: 1 + } + convolution_param { + num_output: 512 + bias_term: false + pad: 0 + kernel_size: 1 + stride: 1 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv5_2/sep/bn" + type: "BatchNorm" + bottom: "conv5_2/sep" + top: "conv5_2/sep" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv5_2/sep/scale" + type: "Scale" + bottom: "conv5_2/sep" + top: "conv5_2/sep" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu5_2/sep" + type: "ReLU" + bottom: "conv5_2/sep" + top: "conv5_2/sep" +} +layer { + name: "conv5_3/dw" + type: "Convolution" + bottom: "conv5_2/sep" + top: "conv5_3/dw" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 512 + bias_term: false + pad: 1 + kernel_size: 3 + group: 512 + engine: CAFFE + stride: 1 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv5_3/dw/bn" + type: "BatchNorm" + bottom: "conv5_3/dw" + top: "conv5_3/dw" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv5_3/dw/scale" + type: "Scale" + bottom: "conv5_3/dw" + top: "conv5_3/dw" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu5_3/dw" + type: "ReLU" + bottom: "conv5_3/dw" + top: "conv5_3/dw" +} +layer { + name: "conv5_3/sep" + type: "Convolution" + bottom: "conv5_3/dw" + top: "conv5_3/sep" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 512 + bias_term: false + 
pad: 0 + kernel_size: 1 + stride: 1 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv5_3/sep/bn" + type: "BatchNorm" + bottom: "conv5_3/sep" + top: "conv5_3/sep" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv5_3/sep/scale" + type: "Scale" + bottom: "conv5_3/sep" + top: "conv5_3/sep" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu5_3/sep" + type: "ReLU" + bottom: "conv5_3/sep" + top: "conv5_3/sep" +} +layer { + name: "conv5_4/dw" + type: "Convolution" + bottom: "conv5_3/sep" + top: "conv5_4/dw" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 512 + bias_term: false + pad: 1 + kernel_size: 3 + group: 512 + engine: CAFFE + stride: 1 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv5_4/dw/bn" + type: "BatchNorm" + bottom: "conv5_4/dw" + top: "conv5_4/dw" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv5_4/dw/scale" + type: "Scale" + bottom: "conv5_4/dw" + top: "conv5_4/dw" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu5_4/dw" + type: "ReLU" + bottom: "conv5_4/dw" + top: "conv5_4/dw" +} +layer { + name: "conv5_4/sep" + type: "Convolution" + bottom: "conv5_4/dw" + top: "conv5_4/sep" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 512 + bias_term: false + pad: 0 + kernel_size: 1 + stride: 1 + weight_filler { + type: "msra" + } + } +} 
+layer { + name: "conv5_4/sep/bn" + type: "BatchNorm" + bottom: "conv5_4/sep" + top: "conv5_4/sep" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv5_4/sep/scale" + type: "Scale" + bottom: "conv5_4/sep" + top: "conv5_4/sep" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu5_4/sep" + type: "ReLU" + bottom: "conv5_4/sep" + top: "conv5_4/sep" +} +layer { + name: "conv5_5/dw" + type: "Convolution" + bottom: "conv5_4/sep" + top: "conv5_5/dw" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 512 + bias_term: false + pad: 1 + kernel_size: 3 + group: 512 + engine: CAFFE + stride: 1 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv5_5/dw/bn" + type: "BatchNorm" + bottom: "conv5_5/dw" + top: "conv5_5/dw" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv5_5/dw/scale" + type: "Scale" + bottom: "conv5_5/dw" + top: "conv5_5/dw" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu5_5/dw" + type: "ReLU" + bottom: "conv5_5/dw" + top: "conv5_5/dw" +} +layer { + name: "conv5_5/sep" + type: "Convolution" + bottom: "conv5_5/dw" + top: "conv5_5/sep" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 512 + bias_term: false + pad: 0 + kernel_size: 1 + stride: 1 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv5_5/sep/bn" + type: "BatchNorm" + bottom: "conv5_5/sep" + 
top: "conv5_5/sep" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv5_5/sep/scale" + type: "Scale" + bottom: "conv5_5/sep" + top: "conv5_5/sep" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu5_5/sep" + type: "ReLU" + bottom: "conv5_5/sep" + top: "conv5_5/sep" +} +layer { + name: "conv5_6/dw" + type: "Convolution" + bottom: "conv5_5/sep" + top: "conv5_6/dw" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 512 + bias_term: false + pad: 1 + kernel_size: 3 + group: 512 + engine: CAFFE + stride: 2 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv5_6/dw/bn" + type: "BatchNorm" + bottom: "conv5_6/dw" + top: "conv5_6/dw" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv5_6/dw/scale" + type: "Scale" + bottom: "conv5_6/dw" + top: "conv5_6/dw" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu5_6/dw" + type: "ReLU" + bottom: "conv5_6/dw" + top: "conv5_6/dw" +} +layer { + name: "conv5_6/sep" + type: "Convolution" + bottom: "conv5_6/dw" + top: "conv5_6/sep" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 1024 + bias_term: false + pad: 0 + kernel_size: 1 + stride: 1 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv5_6/sep/bn" + type: "BatchNorm" + bottom: "conv5_6/sep" + top: "conv5_6/sep" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + 
lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv5_6/sep/scale" + type: "Scale" + bottom: "conv5_6/sep" + top: "conv5_6/sep" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu5_6/sep" + type: "ReLU" + bottom: "conv5_6/sep" + top: "conv5_6/sep" +} +layer { + name: "conv6/dw" + type: "Convolution" + bottom: "conv5_6/sep" + top: "conv6/dw" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 1024 + bias_term: false + pad: 1 + kernel_size: 3 + group: 1024 + engine: CAFFE + stride: 1 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv6/dw/bn" + type: "BatchNorm" + bottom: "conv6/dw" + top: "conv6/dw" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv6/dw/scale" + type: "Scale" + bottom: "conv6/dw" + top: "conv6/dw" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu6/dw" + type: "ReLU" + bottom: "conv6/dw" + top: "conv6/dw" +} +layer { + name: "conv6/sep" + type: "Convolution" + bottom: "conv6/dw" + top: "conv6/sep" + param { + lr_mult: 1 + decay_mult: 1 + } + convolution_param { + num_output: 1024 + bias_term: false + pad: 0 + kernel_size: 1 + stride: 1 + weight_filler { + type: "msra" + } + } +} +layer { + name: "conv6/sep/bn" + type: "BatchNorm" + bottom: "conv6/sep" + top: "conv6/sep" + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + param { + lr_mult: 0 + decay_mult: 0 + } + batch_norm_param { + 
use_global_stats: true + eps: 1e-5 + } +} +layer { + name: "conv6/sep/scale" + type: "Scale" + bottom: "conv6/sep" + top: "conv6/sep" + param { + lr_mult: 1 + decay_mult: 0 + } + param { + lr_mult: 1 + decay_mult: 0 + } + scale_param { + filler { + value: 1 + } + bias_term: true + bias_filler { + value: 0 + } + } +} +layer { + name: "relu6/sep" + type: "ReLU" + bottom: "conv6/sep" + top: "conv6/sep" +} +layer { + name: "pool6" + type: "Pooling" + bottom: "conv6/sep" + top: "pool6" + pooling_param { + pool: AVE + global_pooling: true + } +} +layer { + name: "fc7" + type: "Convolution" + bottom: "pool6" + top: "fc7" + param { + lr_mult: 1 + decay_mult: 1 + } + param { + lr_mult: 2 + decay_mult: 0 + } + convolution_param { + num_output: 1000 + kernel_size: 1 + weight_filler { + type: "msra" + } + bias_filler { + type: "constant" + value: 0 + } + } +} +layer { + name: "prob" + type: "Softmax" + bottom: "fc7" + top: "prob" +} diff --git a/examples/functions/model_pruning/test.py b/examples/functions/model_pruning/test.py new file mode 100644 index 0000000..c474528 --- /dev/null +++ b/examples/functions/model_pruning/test.py @@ -0,0 +1,94 @@ +import numpy as np +import cv2 +from rknn.api import RKNN + + +def show_outputs(outputs): + np.save('./functions_model_pruning_0.npy', outputs[0]) + output = outputs[0].reshape(-1) + output_sorted = sorted(output, reverse=True) + top5_str = 'mobilenet\n-----TOP 5-----\n' + for i in range(5): + value = output_sorted[i] + index = np.where(output == value) + for j in range(len(index)): + if (i + j) >= 5: + break + if value > 0: + topi = '{}: {}\n'.format(index[j], value) + else: + topi = '-1: 0.0\n' + top5_str += topi + print(top5_str) + + +if __name__ == '__main__': + + # Create RKNN object + rknn = RKNN(verbose=True) + + # Pre-process config + print('--> Config model') + rknn.config(mean_values=[103.94, 116.78, 123.68], std_values=[58.82, 58.82, 58.82], quant_img_RGB2BGR=True, model_pruning=True) + print('done') + + # Load model + 
print('--> Loading model') + ret = rknn.load_caffe(model='./mobilenet_deploy.prototxt', + blobs='./mobilenet.caffemodel') + if ret != 0: + print('Load model failed!') + exit(ret) + print('done') + + # Build model + print('--> Building model') + ret = rknn.build(do_quantization=True, dataset='./dataset.txt') + if ret != 0: + print('Build model failed!') + exit(ret) + print('done') + + # Tips + print('') + print('======================================== Tips ==========================================================') + print('When verbose is set to True, the following similar prompts will appear during the build process, ') + print('indicating that model pruning has been effective for this model. (This means that approximately 6.9% ') + print('of the weights have been removed, resulting in a saving of about 13.4% of the computational workload.)') + print('Please note that not all models can be pruned, only models with sparse weights are likely to benefit from pruning.') + print('') + print(' I model_pruning ...') + print(' I model_pruning results:') + print(' I -1.12144 MB (-6.9%)') + print(' I -0.00016 T (-13.4%)') + print(' I model_pruning done.') + print('') + print('=========================================+++++++========================================================') + print('') + + # Export rknn model + print('--> Export rknn model') + ret = rknn.export_rknn('./mobilenet.rknn') + if ret != 0: + print('Export rknn model failed!') + exit(ret) + print('done') + + # Set inputs + img = cv2.imread('./dog_224x224.jpg') + + # Init runtime environment + print('--> Init runtime environment') + ret = rknn.init_runtime() + if ret != 0: + print('Init runtime environment failed!') + exit(ret) + print('done') + + # Inference + print('--> Running model') + outputs = rknn.inference(inputs=[img]) + show_outputs(outputs) + print('done') + + rknn.release() diff --git a/examples/functions/multi_input_test/input2.npy b/examples/functions/multi_input_test/input2.npy index 
a614faf..2025dfb 100644 Binary files a/examples/functions/multi_input_test/input2.npy and b/examples/functions/multi_input_test/input2.npy differ diff --git a/examples/functions/multi_input_test/input3.npy b/examples/functions/multi_input_test/input3.npy index a67bafe..ff70fbe 100644 Binary files a/examples/functions/multi_input_test/input3.npy and b/examples/functions/multi_input_test/input3.npy differ diff --git a/examples/functions/multi_input_test/test.py b/examples/functions/multi_input_test/test.py index 0d3b7af..3d669ae 100644 --- a/examples/functions/multi_input_test/test.py +++ b/examples/functions/multi_input_test/test.py @@ -50,18 +50,18 @@ # Set inputs img = cv2.imread('./dog_128x128.jpg') - img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) + img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) # nhwc img_gray = cv2.imread('./dog_128x128_gray.png', cv2.IMREAD_GRAYSCALE) - img_gray = np.expand_dims(img_gray, -1) + img_gray = np.expand_dims(img_gray, -1) # nhwc - input2 = np.load('input2.npy').astype('float32') + input2 = np.load('input2.npy').astype('float32') # nchw - input3 = np.load('input3.npy').astype('float32') + input3 = np.load('input3.npy').astype('float32') # nchw # Inference print('--> Running model') - outputs = rknn.inference(inputs=[img, input2, input3, img_gray]) + outputs = rknn.inference(inputs=[img, input2, input3, img_gray], data_format=['nhwc', 'nchw', 'nchw', 'nhwc']) np.save('./functions_multi_input_test_0.npy', outputs[0]) print('done') outputs[0] = outputs[0].reshape((1, -1)) diff --git a/examples/onnx/yolov5/test.py b/examples/onnx/yolov5/test.py index a1c9988..9e3eef0 100644 --- a/examples/onnx/yolov5/test.py +++ b/examples/onnx/yolov5/test.py @@ -309,9 +309,6 @@ def letterbox(im, new_shape=(640, 640), color=(0, 0, 0)): img_1 = cv2.cvtColor(img, cv2.COLOR_RGB2BGR) if boxes is not None: draw(img_1, boxes, scores, classes) - # show output - # cv2.imshow("post process result", img_1) - # cv2.waitKey(0) - # cv2.destroyAllWindows() + 
cv2.imwrite('result.jpg', img_1) rknn.release() diff --git a/examples/onnx/yolov5/yolov5s.onnx b/examples/onnx/yolov5/yolov5s.onnx index 6165589..6f02d2a 100644 Binary files a/examples/onnx/yolov5/yolov5s.onnx and b/examples/onnx/yolov5/yolov5s.onnx differ diff --git a/examples/readme.txt b/examples/readme.txt index 5954171..90c0fc1 100644 --- a/examples/readme.txt +++ b/examples/readme.txt @@ -24,4 +24,6 @@ The directory structure of examples is as follows: ├── multi_input_test # multi-input float model ├── hybrid_quant # how to use hybrid-quantization function ├── mmse # how to use mmse function + ├── model_pruning # how to use model_pruning function + ├── dynamic_input # how to use dynamic_input function └── board_test # how to connect the board for debugging diff --git a/packages/md5sum.txt b/packages/md5sum.txt index 18977d9..599bd55 100644 --- a/packages/md5sum.txt +++ b/packages/md5sum.txt @@ -1,2 +1,3 @@ -58c3b61e199911146c81f3c85a4c686d rknn_toolkit2-1.4.0_22dcfef4-cp36-cp36m-linux_x86_64.whl -c702ff263b54fa12b9044b1d334ce28a rknn_toolkit2-1.4.0_22dcfef4-cp38-cp38-linux_x86_64.whl +97d96fd25f537da3acc3a6adc4dbe2f4 rknn_toolkit2-1.5.0+1fa95b5c-cp310-cp310-linux_x86_64.whl +e43332655d9b86afe33bac95f28a5adc rknn_toolkit2-1.5.0+1fa95b5c-cp36-cp36m-linux_x86_64.whl +d6ba9436d85a39c5b6dfa045c0ec51d1 rknn_toolkit2-1.5.0+1fa95b5c-cp38-cp38-linux_x86_64.whl diff --git a/packages/rknn_toolkit2-1.5.0+1fa95b5c-cp310-cp310-linux_x86_64.whl b/packages/rknn_toolkit2-1.5.0+1fa95b5c-cp310-cp310-linux_x86_64.whl new file mode 100644 index 0000000..ec1fa4d Binary files /dev/null and b/packages/rknn_toolkit2-1.5.0+1fa95b5c-cp310-cp310-linux_x86_64.whl differ diff --git a/packages/rknn_toolkit2-1.5.0+1fa95b5c-cp36-cp36m-linux_x86_64.whl b/packages/rknn_toolkit2-1.5.0+1fa95b5c-cp36-cp36m-linux_x86_64.whl new file mode 100644 index 0000000..6c4843b Binary files /dev/null and b/packages/rknn_toolkit2-1.5.0+1fa95b5c-cp36-cp36m-linux_x86_64.whl differ diff --git 
a/packages/rknn_toolkit2-1.4.0_22dcfef4-cp38-cp38-linux_x86_64.whl b/packages/rknn_toolkit2-1.5.0+1fa95b5c-cp38-cp38-linux_x86_64.whl similarity index 83% rename from packages/rknn_toolkit2-1.4.0_22dcfef4-cp38-cp38-linux_x86_64.whl rename to packages/rknn_toolkit2-1.5.0+1fa95b5c-cp38-cp38-linux_x86_64.whl index d67381c..e8283c0 100644 Binary files a/packages/rknn_toolkit2-1.4.0_22dcfef4-cp38-cp38-linux_x86_64.whl and b/packages/rknn_toolkit2-1.5.0+1fa95b5c-cp38-cp38-linux_x86_64.whl differ diff --git a/rknn_toolkit_lite2/docs/Rockchip_User_Guide_RKNN_Toolkit_Lite2_V1.4.0_CN.pdf b/rknn_toolkit_lite2/docs/Rockchip_User_Guide_RKNN_Toolkit_Lite2_V1.5.0_CN.pdf similarity index 55% rename from rknn_toolkit_lite2/docs/Rockchip_User_Guide_RKNN_Toolkit_Lite2_V1.4.0_CN.pdf rename to rknn_toolkit_lite2/docs/Rockchip_User_Guide_RKNN_Toolkit_Lite2_V1.5.0_CN.pdf index f46538f..efe123a 100644 Binary files a/rknn_toolkit_lite2/docs/Rockchip_User_Guide_RKNN_Toolkit_Lite2_V1.4.0_CN.pdf and b/rknn_toolkit_lite2/docs/Rockchip_User_Guide_RKNN_Toolkit_Lite2_V1.5.0_CN.pdf differ diff --git a/rknn_toolkit_lite2/docs/Rockchip_User_Guide_RKNN_Toolkit_Lite2_V1.4.0_EN.pdf b/rknn_toolkit_lite2/docs/Rockchip_User_Guide_RKNN_Toolkit_Lite2_V1.5.0_EN.pdf similarity index 55% rename from rknn_toolkit_lite2/docs/Rockchip_User_Guide_RKNN_Toolkit_Lite2_V1.4.0_EN.pdf rename to rknn_toolkit_lite2/docs/Rockchip_User_Guide_RKNN_Toolkit_Lite2_V1.5.0_EN.pdf index 6232a33..8b808dd 100644 Binary files a/rknn_toolkit_lite2/docs/Rockchip_User_Guide_RKNN_Toolkit_Lite2_V1.4.0_EN.pdf and b/rknn_toolkit_lite2/docs/Rockchip_User_Guide_RKNN_Toolkit_Lite2_V1.5.0_EN.pdf differ diff --git a/rknn_toolkit_lite2/docs/change_log.txt b/rknn_toolkit_lite2/docs/change_log.txt index b383fd5..90593a8 100644 --- a/rknn_toolkit_lite2/docs/change_log.txt +++ b/rknn_toolkit_lite2/docs/change_log.txt @@ -1,3 +1,10 @@ +2023-05-18 +Version: v1.5.0 +1. 
Feature improvements: + 1.1 Added support for RK3562; + 1.2 Added installation packages for AARCH64 Python 3.8 and Python 3.10; + 1.3 Adapted to the 1.5.0 NPU driver. + 2022-08-31 Version: v1.4.0 1. Feature improvements: diff --git a/rknn_toolkit_lite2/examples/inference_with_lite/resnet18_for_rk3562.rknn b/rknn_toolkit_lite2/examples/inference_with_lite/resnet18_for_rk3562.rknn new file mode 100644 index 0000000..48f5e2a Binary files /dev/null and b/rknn_toolkit_lite2/examples/inference_with_lite/resnet18_for_rk3562.rknn differ diff --git a/rknn_toolkit_lite2/examples/inference_with_lite/resnet18_for_rk356x.rknn b/rknn_toolkit_lite2/examples/inference_with_lite/resnet18_for_rk3566_rk3568.rknn similarity index 99% rename from rknn_toolkit_lite2/examples/inference_with_lite/resnet18_for_rk356x.rknn rename to rknn_toolkit_lite2/examples/inference_with_lite/resnet18_for_rk3566_rk3568.rknn index 5c8886a..6d68ad0 100644 Binary files a/rknn_toolkit_lite2/examples/inference_with_lite/resnet18_for_rk356x.rknn and b/rknn_toolkit_lite2/examples/inference_with_lite/resnet18_for_rk3566_rk3568.rknn differ diff --git a/rknn_toolkit_lite2/examples/inference_with_lite/resnet18_for_rk3588.rknn b/rknn_toolkit_lite2/examples/inference_with_lite/resnet18_for_rk3588.rknn index a817d20..0a64b0c 100644 Binary files a/rknn_toolkit_lite2/examples/inference_with_lite/resnet18_for_rk3588.rknn and b/rknn_toolkit_lite2/examples/inference_with_lite/resnet18_for_rk3588.rknn differ diff --git a/rknn_toolkit_lite2/examples/inference_with_lite/test.py b/rknn_toolkit_lite2/examples/inference_with_lite/test.py index b4e67c8..aee9155 100644 --- a/rknn_toolkit_lite2/examples/inference_with_lite/test.py +++ b/rknn_toolkit_lite2/examples/inference_with_lite/test.py @@ -17,8 +17,10 @@ def get_host(): device_compatible_str = f.read() if 'rk3588' in device_compatible_str: host = 'RK3588' + elif 'rk3562' in device_compatible_str: + host = 'RK3562' else: - host = 'RK356x' + host = 'RK3566_RK3568' except IOError: print('Read device node {} failed.'.format(DEVICE_COMPATIBLE_NODE)) exit(-1) @@ -28,8 +30,9
@@ def get_host(): INPUT_SIZE = 224 -RK356X_RKNN_MODEL = 'resnet18_for_rk356x.rknn' +RK3566_RK3568_RKNN_MODEL = 'resnet18_for_rk3566_rk3568.rknn' RK3588_RKNN_MODEL = 'resnet18_for_rk3588.rknn' +RK3562_RKNN_MODEL = 'resnet18_for_rk3562.rknn' def show_top5(result): @@ -55,8 +58,10 @@ def show_top5(result): if __name__ == '__main__': host_name = get_host() - if host_name == 'RK356x': - rknn_model = RK356X_RKNN_MODEL + if host_name == 'RK3566_RK3568': + rknn_model = RK3566_RK3568_RKNN_MODEL + elif host_name == 'RK3562': + rknn_model = RK3562_RKNN_MODEL elif host_name == 'RK3588': rknn_model = RK3588_RKNN_MODEL else: diff --git a/rknn_toolkit_lite2/packages/rknn_toolkit_lite2-1.4.0-cp37-cp37m-linux_aarch64.whl b/rknn_toolkit_lite2/packages/rknn_toolkit_lite2-1.4.0-cp37-cp37m-linux_aarch64.whl deleted file mode 100644 index 95c9cf9..0000000 Binary files a/rknn_toolkit_lite2/packages/rknn_toolkit_lite2-1.4.0-cp37-cp37m-linux_aarch64.whl and /dev/null differ diff --git a/rknn_toolkit_lite2/packages/rknn_toolkit_lite2-1.4.0-cp39-cp39-linux_aarch64.whl b/rknn_toolkit_lite2/packages/rknn_toolkit_lite2-1.4.0-cp39-cp39-linux_aarch64.whl deleted file mode 100644 index 5b9dc97..0000000 Binary files a/rknn_toolkit_lite2/packages/rknn_toolkit_lite2-1.4.0-cp39-cp39-linux_aarch64.whl and /dev/null differ diff --git a/rknn_toolkit_lite2/packages/rknn_toolkit_lite2-1.5.0-cp310-cp310-linux_aarch64.whl b/rknn_toolkit_lite2/packages/rknn_toolkit_lite2-1.5.0-cp310-cp310-linux_aarch64.whl new file mode 100644 index 0000000..5c85f8a Binary files /dev/null and b/rknn_toolkit_lite2/packages/rknn_toolkit_lite2-1.5.0-cp310-cp310-linux_aarch64.whl differ diff --git a/rknn_toolkit_lite2/packages/rknn_toolkit_lite2-1.5.0-cp37-cp37m-linux_aarch64.whl b/rknn_toolkit_lite2/packages/rknn_toolkit_lite2-1.5.0-cp37-cp37m-linux_aarch64.whl new file mode 100644 index 0000000..b2edda4 Binary files /dev/null and b/rknn_toolkit_lite2/packages/rknn_toolkit_lite2-1.5.0-cp37-cp37m-linux_aarch64.whl differ diff 
--git a/rknn_toolkit_lite2/packages/rknn_toolkit_lite2-1.5.0-cp38-cp38-linux_aarch64.whl b/rknn_toolkit_lite2/packages/rknn_toolkit_lite2-1.5.0-cp38-cp38-linux_aarch64.whl new file mode 100644 index 0000000..dbe1f33 Binary files /dev/null and b/rknn_toolkit_lite2/packages/rknn_toolkit_lite2-1.5.0-cp38-cp38-linux_aarch64.whl differ diff --git a/rknn_toolkit_lite2/packages/rknn_toolkit_lite2-1.5.0-cp39-cp39-linux_aarch64.whl b/rknn_toolkit_lite2/packages/rknn_toolkit_lite2-1.5.0-cp39-cp39-linux_aarch64.whl new file mode 100644 index 0000000..640dce3 Binary files /dev/null and b/rknn_toolkit_lite2/packages/rknn_toolkit_lite2-1.5.0-cp39-cp39-linux_aarch64.whl differ diff --git a/rknn_toolkit_lite2/packages/rknn_toolkit_lite2_1.4.0_packages.md5sum b/rknn_toolkit_lite2/packages/rknn_toolkit_lite2_1.4.0_packages.md5sum deleted file mode 100644 index a91b086..0000000 --- a/rknn_toolkit_lite2/packages/rknn_toolkit_lite2_1.4.0_packages.md5sum +++ /dev/null @@ -1,2 +0,0 @@ -050e79c9683a7b063bbce2dafc37bb20 rknn_toolkit_lite2-1.4.0-cp37-cp37m-linux_aarch64.whl -a3da64f0b2fb6bcbc4dc39bb86f23517 rknn_toolkit_lite2-1.4.0-cp39-cp39-linux_aarch64.whl diff --git a/rknn_toolkit_lite2/packages/rknn_toolkit_lite2_1.5.0_packages.md5sum b/rknn_toolkit_lite2/packages/rknn_toolkit_lite2_1.5.0_packages.md5sum new file mode 100644 index 0000000..a772cae --- /dev/null +++ b/rknn_toolkit_lite2/packages/rknn_toolkit_lite2_1.5.0_packages.md5sum @@ -0,0 +1,4 @@ +738ab6d0ef1cdb46fd0c04df5ae6e872 rknn_toolkit_lite2-1.5.0-cp310-cp310-linux_aarch64.whl +4ac9e4ed20f2b0795f75cf42de910f36 rknn_toolkit_lite2-1.5.0-cp37-cp37m-linux_aarch64.whl +627f0b7e97ce3f97393100d39522d0b9 rknn_toolkit_lite2-1.5.0-cp38-cp38-linux_aarch64.whl +e45437df0e365e33278a2d2871eeb393 rknn_toolkit_lite2-1.5.0-cp39-cp39-linux_aarch64.whl
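Reviewer note: the `.md5sum` manifests added above list one digest per wheel, so a downloaded package can be checked before installation. The following is a minimal sketch of such a check, not part of this change set. The helper names (`md5_of_file`, `parse_manifest`, `verify_manifest`) are hypothetical, and the sketch assumes each manifest line has the form `<hex digest> <file name>` and that the wheels sit in the same directory as the manifest.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_manifest(text):
    """Parse md5sum-style lines ('<hex digest> <file name>') into a dict."""
    expected = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            digest, name = parts
            # md5sum marks binary-mode entries with a leading '*'
            expected[name.strip().lstrip("*")] = digest.lower()
    return expected


def verify_manifest(manifest_path):
    """Map each listed file to True (match), False (mismatch) or None (missing)."""
    manifest_path = Path(manifest_path)
    results = {}
    for name, digest in parse_manifest(manifest_path.read_text()).items():
        target = manifest_path.parent / name  # assumed: files sit next to the manifest
        results[name] = (md5_of_file(target) == digest) if target.exists() else None
    return results
```

For example, `verify_manifest("rknn_toolkit_lite2/packages/rknn_toolkit_lite2_1.5.0_packages.md5sum")` would report one entry per wheel listed in that file. MD5 only guards against corrupted or truncated downloads; it is not a defence against deliberate tampering.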