diff --git a/docs/Changelog.md b/docs/Changelog.md index 02d86d66475..a836f3582f7 100644 --- a/docs/Changelog.md +++ b/docs/Changelog.md @@ -24771,7 +24771,6 @@ This version of the operator has been available since version 21 of the default must have the same shape, determining the quantization's granularity: a scalar for per-tensor/per-layer quantization, a 1-D tensor for per-axis quantization, or have a rank identical to the input for blocked quantization. See QuantizeLinear for details on quantization granularity. - `x_zero_point` and `x` must have the same type. `x` and `y` must have the same shape. In the case of dequantizing `int32`, there's no zero point (zero point is supposed to be 0). `zero-point` is usually not used in the case of float8 types quantization, but the dequantization formula remains the same @@ -25375,7 +25374,6 @@ This version of the operator has been available since version 21 of the default The linear quantization operator consumes a high-precision tensor, a scale, and a zero point to compute the low-precision/quantized tensor. The scale factor and zero point must have the same shape, determining the quantization granularity. The quantization formula is `y = saturate((x / y_scale) + y_zero_point)`. - Saturation is done according to: - uint16: [0, 65535] - int16: [-32768, 32767] @@ -25383,12 +25381,9 @@ This version of the operator has been available since version 21 of the default - int8: [-128, 127] - uint4: [0, 15] - int4: [-8, 7] - For `(x / y_scale)`, it rounds to the nearest even. Refer to https://en.wikipedia.org/wiki/Rounding for details. - `y_zero_point` and `y` must have the same type. `y_zero_point` is usually not used for quantization to float8 types, but the quantization formula remains the same for consistency, and the type of the attribute `y_zero_point` still determines the quantization type. - There are three supported quantization granularities, determined by the shape of `y_scale`. In all cases, `y_zero_point` must have the same shape as `y_scale`. - Per-tensor (per-layer) quantization: `y_scale` is a scalar. @@ -28229,6 +28224,277 @@ This version of the operator has been available since version 22 of the default
Constrain input and output types to float tensors.
+## Version 23 of the default ONNX operator set +### **Cast-23** + + The operator casts the elements of a given input tensor to a data type + specified by the 'to' argument and returns an output tensor of the same size in + the converted type. The 'to' argument must be one of the data types specified + in the 'DataType' enum field in the TensorProto message. + + Casting from string tensors in plain (e.g., "3.14" and "1000") and scientific numeric representations + (e.g., "1e-5" and "1E8") to float types is supported. For example, converting the string "100.5" to an integer may + yield 100. There are some string literals reserved for special floating-point values; + "+INF" (and "INF"), "-INF", and "NaN" stand for positive infinity, negative infinity, and not-a-number, respectively. + Any string that exactly matches "+INF" in a case-insensitive way is mapped to positive infinity. The same + case-insensitive rule applies to "INF" and "NaN". When casting from numeric tensors + to string tensors, a plain floating-point representation (such as "314.15926") is used. + Converting a non-numeric-literal string such as "Hello World!" is undefined behavior, as is + converting a string that represents a floating-point value, such as "2.718", to an integer type. + + Conversion from a numerical type to any numerical type is always allowed. + Users must be aware of the precision loss and value changes caused by the range difference between the two types. + For example, a 64-bit float 3.1415926459 may be rounded to a 32-bit float 3.141592. Similarly, converting + an integer 36 to Boolean may produce 1 because bits that can't be stored in the targeted type are truncated. + + In more detail, the conversion among numerical types should follow these rules + if the destination type is not a float 8 type. + + * Casting from floating point to: + * floating point: +/- infinity if OOR (out of range). + * fixed point: undefined if OOR. + * bool: +/- 0.0 to False; all else to True. + * Casting from fixed point to: + * floating point: +/- infinity if OOR. (+ infinity in the case of uint) + * fixed point: when OOR, discard higher bits and reinterpret (with respect to two's complement representation for + signed types). For example, 200 (int16) -> -56 (int8). + * bool: zero to False; nonzero to True. + * Casting from bool to: + * floating point: `{1.0, 0.0}`. + * fixed point: `{1, 0}`. + * bool: no change. + + Float 8 types were introduced to speed up the training of + deep models. By default, the conversion of a float *x* obeys + the following rules. `[x]` means the value rounded to + the target mantissa width. + + | x | E4M3FN | E4M3FNUZ | E5M2 | E5M2FNUZ | + |------|----|----|----|----| + | 0 | 0 | 0 | 0 | 0 | + |-0 | -0 | 0 | -0 | 0 | + | NaN | NaN | NaN | NaN | NaN | + | +/- Inf | +/- FLT_MAX | NaN | +/- FLT_MAX | NaN | + | [x] > FLT_MAX | FLT_MAX | FLT_MAX | FLT_MAX | FLT_MAX | + | [x] < -FLT_MAX | -FLT_MAX | -FLT_MAX | -FLT_MAX | -FLT_MAX | + | else | RNE | RNE | RNE | RNE | + + The behavior changes if the parameter 'saturate' is set to False. 
+ The rules then become: + + | x | E4M3FN | E4M3FNUZ | E5M2 | E5M2FNUZ | + |------|----|----|----|----| + | 0 | 0 | 0 | 0 | 0 | + |-0 | -0 | 0 | -0 | 0 | + | NaN | NaN | NaN | NaN | NaN | + | +/- Inf | NaN | NaN | +/- Inf | NaN | + | [x] > FLT_MAX | NaN | NaN | Inf | NaN | + | [x] < -FLT_MAX | NaN | NaN | -Inf | NaN | + | else | RNE | RNE | RNE | RNE | + +#### Version + +This version of the operator has been available since version 23 of the default ONNX operator set. + +#### Attributes + +
+
saturate : int (default is 1)
+
The parameter defines how the conversion behaves if an input value is out of range of the destination type. It only applies to float 8 conversions (float8e4m3fn, float8e4m3fnuz, float8e5m2, float8e5m2fnuz). It is true by default. All cases are fully described in the two tables in the operator description.
+
to : int (required)
+
The data type to which the elements of the input tensor are cast. It must be one of the types from the DataType enum in TensorProto.
+
+ +#### Inputs + +
+
input (differentiable) : T1
+
Input tensor to be cast.
+
+ +#### Outputs + +
+
output (differentiable) : T2
+
Output tensor with the same shape as the input, with the type specified by the 'to' argument.
+
+ +#### Type Constraints + +
+
T1 : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4), tensor(float4e2m1)
+
Constrain input types. Casting from complex is not supported.
+
T2 : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4), tensor(float4e2m1)
+
Constrain output types. Casting to complex is not supported.
+
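As an editorial illustration of the saturation rules above (not part of the patch), here is a minimal sketch that builds a Cast-to-float8e4m3fn node and runs it with onnx's reference evaluator; it assumes an onnx release whose `ReferenceEvaluator` implements opset 23 (the `saturate` attribute itself exists since Cast-19).

```python
import numpy as np
from onnx import TensorProto, helper
from onnx.reference import ReferenceEvaluator

# Cast float32 -> float8e4m3fn. With saturate=1, out-of-range values clamp
# to +/-FLT_MAX of the target type; with saturate=0 they map to NaN
# (E4M3FN has no Inf encoding).
node = helper.make_node(
    "Cast", inputs=["x"], outputs=["y"],
    to=TensorProto.FLOAT8E4M3FN, saturate=1,
)
graph = helper.make_graph(
    [node], "cast_sketch",
    [helper.make_tensor_value_info("x", TensorProto.FLOAT, [3])],
    [helper.make_tensor_value_info("y", TensorProto.FLOAT8E4M3FN, [3])],
)
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 23)])

x = np.array([1.5, 1e6, np.inf], dtype=np.float32)  # 1e6 and inf are out of range
print(ReferenceEvaluator(model).run(None, {"x": x})[0])
```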
+ +### **CastLike-23** + + The operator casts the elements of a given input tensor (the first input) to + the same data type as the elements of the second input tensor. + See documentation of the Cast operator for further details. + +#### Version + +This version of the operator has been available since version 23 of the default ONNX operator set. + +#### Attributes + +
+
saturate : int (default is 1)
+
The parameter defines how the conversion behaves if an input value is out of range of the destination type. It only applies to float 8 conversions (float8e4m3fn, float8e4m3fnuz, float8e5m2, float8e5m2fnuz). It is true by default. Please refer to the Cast operator description for further details.
+
+ +#### Inputs + +
+
input (differentiable) : T1
+
Input tensor to be cast.
+
target_type (non-differentiable) : T2
+
The (first) input tensor will be cast to produce a tensor of the same type as this (second input) tensor.
+
+ +#### Outputs + +
+
output (differentiable) : T2
+
Output tensor produced by casting the first input tensor to have the same type as the second input tensor.
+
+ +#### Type Constraints + +
+
T1 : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4), tensor(float4e2m1)
+
Constrain input types. Casting from complex is not supported.
+
T2 : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4), tensor(float4e2m1)
+
Constrain output types. Casting to complex is not supported.
+
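As a usage note (editorial, not from the patch): CastLike behaves like Cast with the target element type taken from a second tensor, which is convenient when the concrete dtype is not known at graph-construction time. A sketch of the equivalence:

```python
from onnx import TensorProto, helper

# CastLike: the element type of 'target_type' decides the output type.
cast_like = helper.make_node(
    "CastLike", inputs=["input", "target_type"], outputs=["output"]
)

# Once target_type is known to hold float16, this explicit Cast is equivalent.
explicit_cast = helper.make_node(
    "Cast", inputs=["input"], outputs=["output"], to=TensorProto.FLOAT16
)
```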
+ +### **DequantizeLinear-23** + + The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the + full-precision tensor. The dequantization formula is `y = (x - x_zero_point) * x_scale`. `x_scale` and `x_zero_point` + must have the same shape, determining the quantization's granularity: a scalar for per-tensor/per-layer quantization, + a 1-D tensor for per-axis quantization, or have a rank identical to the input for blocked quantization. + See QuantizeLinear for details on quantization granularity. + + `x_zero_point` and `x` must have the same type. `x` and `y` must have the same shape. In the case of dequantizing + `int32`, there's no zero point (zero point is supposed to be 0). + `zero-point` is usually not used in the case of float8 types quantization, but the dequantization formula remains the same + for consistency, and `x_scale` still determines the output type. + +#### Version + +This version of the operator has been available since version 23 of the default ONNX operator set. + +#### Attributes + +
+
axis : int (default is 1)
+
(Optional) The axis of the dequantizing dimension of the input tensor. Used for per-axis and blocked quantization. Negative value means counting dimensions from the back. Accepted range is `[-r, r-1]` where `r = rank(input)`.
+
block_size : int (default is 0)
+
(Optional) The size of the quantization block (number of times every scale is replicated). Used only for blocked quantization. The block size is a positive integer. Given `x` shape `(D0, ..., Di, ..., Dn)`, `x_scale` shape `(S0, ..., Si, ..., Sn)` and `axis=i`, the accepted range is `[ceil(Di/Si), ceil(Di/(Si-1))-1]`.
+
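To make the accepted `block_size` range above concrete, here is a small editorial check (illustrative only): with `Di = 10` elements along the blocked axis and `Si = 4` scale blocks, the only admissible block size is 3, and it indeed reproduces `Si = ceil(Di/B)`.

```python
import math

Di, Si = 10, 4                      # dim length and number of scale blocks
lo = math.ceil(Di / Si)             # 3
hi = math.ceil(Di / (Si - 1)) - 1   # 3
assert all(math.ceil(Di / B) == Si for B in range(lo, hi + 1))
```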
+ +#### Inputs (2 - 3) + +
+
x : T1
+
N-D quantized input tensor to be dequantized.
+
x_scale : T2
+
Scale for input `x`. For per-tensor/layer dequantization the scale is a scalar, for per-axis dequantization it is a 1-D tensor, and for blocked dequantization it has the same shape as the input, except for the one dimension in which blocking is performed.
+
x_zero_point (optional) : T1
+
Zero point for input `x`. Shape must match `x_scale`. It's optional; the zero point is 0 when not specified.
+
+ +#### Outputs + +
+
y : T2
+
N-D full-precision output tensor. It has the same shape as input `x`.
+
+ +#### Type Constraints + +
+
T1 : tensor(int8), tensor(uint8), tensor(int16), tensor(uint16), tensor(int32), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4), tensor(float4e2m1)
+
The type of the inputs 'x_zero_point' and 'x'.
+
T2 : tensor(float), tensor(float16), tensor(bfloat16)
+
'x_scale' determines the output type.
+
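The per-axis arithmetic above can be written out in plain NumPy. This is an editorial sketch of the formula, not a reference implementation; the names mirror the operator's inputs, with `axis=0`:

```python
import numpy as np

x = np.array([[0, 3], [128, 255]], dtype=np.uint8)   # quantized input
x_scale = np.array([2.0, 0.5], dtype=np.float32)     # 1-D, length == x.shape[0]
x_zero_point = np.array([0, 128], dtype=np.uint8)    # same shape as x_scale

# y = (x - x_zero_point) * x_scale, scale/zero point broadcast along axis 0
y = (x.astype(np.float32) - x_zero_point[:, None]) * x_scale[:, None]
print(y)  # [[ 0.   6. ]  [ 0.  63.5]]
```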
+ +### **QuantizeLinear-23** + + The linear quantization operator consumes a high-precision tensor, a scale, and a zero point to compute the + low-precision/quantized tensor. The scale factor and zero point must have the same shape, determining the quantization + granularity. The quantization formula is `y = saturate((x / y_scale) + y_zero_point)`. + + Saturation is done according to: + - uint16: [0, 65535] + - int16: [-32768, 32767] + - uint8: [0, 255] + - int8: [-128, 127] + - uint4: [0, 15] + - int4: [-8, 7] + + For `(x / y_scale)`, it rounds to the nearest even. Refer to https://en.wikipedia.org/wiki/Rounding for details. + + `y_zero_point` and `y` must have the same type. `y_zero_point` is usually not used for quantization to float8 types, but the quantization + formula remains the same for consistency, and the type of the attribute `y_zero_point` still determines the quantization type. + + There are three supported quantization granularities, determined by the shape of `y_scale`. + In all cases, `y_zero_point` must have the same shape as `y_scale`. + - Per-tensor (per-layer) quantization: `y_scale` is a scalar. + - Per-axis quantization: The scale must be a 1-D tensor, with the length of the quantization axis. For an input shape + `(D0, ..., Di, ..., Dn)` and `axis=i`, `y_scale` is a 1-D tensor of length `Di`. + - Blocked quantization: The scale's shape is identical to the input's shape, except for one dimension, in which + blocking is performed. Given `x` shape `(D0, ..., Di, ..., Dn)`, `axis=i`, and block size `B`: `y_scale` shape is + `(D0, ..., ceil(Di/B), ..., Dn)`. + +#### Version + +This version of the operator has been available since version 23 of the default ONNX operator set. + +#### Attributes + +
+
axis : int (default is 1)
+
(Optional) The axis of the quantizing dimension of the input tensor. Used only for per-axis and blocked quantization. Negative value means counting dimensions from the back. Accepted range is `[-r, r-1]` where `r = rank(input)`. When the rank of the input is 1, per-tensor quantization is applied, rendering the axis unnecessary in this scenario.
+
block_size : int (default is 0)
+
(Optional) The size of the quantization block (number of times every scale is replicated). Used only for blocked quantization. The block size is a positive integer. Given `x` shape `(D0, ..., Di, ..., Dn)`, `y_scale` shape `(S0, ..., Si, ..., Sn)` and `axis=i`, the accepted range is `[ceil(Di/Si), ceil(Di/(Si-1))-1]`.
+
output_dtype : int (default is 0)
+
(Optional) The output data type. If not supplied, the output data type is inferred from the `y_zero_point` data type (`T2`). If neither `output_dtype` nor `y_zero_point` is supplied, the output data type is uint8. If both `output_dtype` and `y_zero_point` are specified, `output_dtype` must match the type of `y_zero_point` (`T2`).
+
saturate : int (default is 1)
+
The parameter defines how the conversion behaves if an input value is out of range of the destination type. It only applies to float 8 quantization (float8e4m3fn, float8e4m3fnuz, float8e5m2, float8e5m2fnuz). It is true by default. All cases are fully described in the two tables in the Cast operator description.
+
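The precedence between `output_dtype` and `y_zero_point` described above can be summarized by this tiny editorial helper (hypothetical, not an onnx API):

```python
from onnx import TensorProto

def resolve_quant_output_dtype(output_dtype=None, zero_point_dtype=None):
    """Hypothetical helper mirroring QuantizeLinear's output-type rules."""
    if output_dtype is not None and zero_point_dtype is not None:
        # If both are given, they must agree.
        assert output_dtype == zero_point_dtype
    return output_dtype or zero_point_dtype or TensorProto.UINT8
```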
+ +#### Inputs (2 - 3) + +
+
x : T1
+
N-D full-precision input tensor to be quantized.
+
y_scale : T1
+
Scale for doing quantization to get `y`. For per-tensor/layer quantization the scale is a scalar, for per-axis quantization it is a 1-D tensor, and for blocked quantization it has the same shape as the input, except for the one dimension in which blocking is performed.
+
y_zero_point (optional) : T2
+
Zero point for doing quantization to get `y`. Shape must match `y_scale`. Default is uint8 with a zero point of 0 if it's not specified.
+
+ +#### Outputs + +
+
y : T2
+
N-D quantized output tensor. It has the same shape as input `x`.
+
+ +#### Type Constraints + +
+
T1 : tensor(float), tensor(float16), tensor(bfloat16), tensor(int32)
+
The type of the input 'x'.
+
T2 : tensor(int8), tensor(uint8), tensor(int16), tensor(uint16), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4), tensor(float4e2m1)
+
The type of the input `y_zero_point` and the output `y`.
+
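To tie the quantization formula and the rounding rule together, here is an editorial NumPy sketch of per-tensor uint8 quantization; `np.rint` rounds halfway cases to even, matching the rule stated above (illustrative only):

```python
import numpy as np

x = np.array([0.0, 0.5, 1.0, 300.0, -4.0], dtype=np.float32)
y_scale = np.float32(2.0)
y_zero_point = np.uint8(128)

# y = saturate(round_half_to_even(x / y_scale) + y_zero_point), in [0, 255]
y = np.clip(np.rint(x / y_scale) + y_zero_point, 0, 255).astype(np.uint8)
print(y)  # [128 128 128 255 126]  (1.0/2 = 0.5 rounds to 0, the even neighbor)
```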
+ # ai.onnx.preview.training ## Version 1 of the 'ai.onnx.preview.training' operator set ### **ai.onnx.preview.training.Adagrad-1** diff --git a/docs/Operators.md b/docs/Operators.md index 32e063feee0..6c1ce4d088c 100644 --- a/docs/Operators.md +++ b/docs/Operators.md @@ -29,7 +29,7 @@ For an operator input/output's differentiability, it can be differentiable, |BitwiseNot|18| |BitwiseOr|18| |BitwiseXor|18| -|Cast|21, 19, 13, 9, 6, 1| +|Cast|23, 21, 19, 13, 9, 6, 1| |Ceil|13, 6, 1| |Col2Im|18| |Compress|11, 9| @@ -46,7 +46,7 @@ For an operator input/output's differentiability, it can be differentiable, |DFT|20, 17| |DeformConv|22, 19| |DepthToSpace|13, 11, 1| -|DequantizeLinear|21, 19, 13, 10| +|DequantizeLinear|23, 21, 19, 13, 10| |Det|22, 11| |Div|14, 13, 7, 6, 1| |Dropout|22, 13, 12, 10, 7, 6, 1| @@ -107,7 +107,7 @@ For an operator input/output's differentiability, it can be differentiable, |Pow|15, 13, 12, 7, 1| |QLinearConv|10| |QLinearMatMul|21, 10| -|QuantizeLinear|21, 19, 13, 10| +|QuantizeLinear|23, 21, 19, 13, 10| |RNN|22, 14, 7, 1| |RandomNormal|22, 1| |RandomNormalLike|22, 1| @@ -169,7 +169,7 @@ For an operator input/output's differentiability, it can be differentiable, |AffineGrid|20|20| |Bernoulli|22, 15|22| |BlackmanWindow|17|17| -|CastLike|21, 19, 15|21| +|CastLike|23, 21, 19, 15|23| |Celu|12|12| |CenterCropPad|18|18| |Clip|13, 12, 11, 6, 1|13| @@ -3400,9 +3400,9 @@ expect( #### Version -This version of the operator has been available since version 21 of the default ONNX operator set. +This version of the operator has been available since version 23 of the default ONNX operator set. -Other versions of this operator: 1, 6, 9, 13, 19 +Other versions of this operator: 1, 6, 9, 13, 19, 21 #### Attributes @@ -3430,9 +3430,9 @@ Other versions of this operator: 1, 15, 19 +Other versions of this operator: 15, 19, 21 #### Attributes @@ -3898,9 +3952,9 @@ Other versions of this operator: 15, -
T1 : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4)
+
T1 : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4), tensor(float4e2m1)
Constrain input types. Casting from complex is not supported.
-
T2 : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4)
+
T2 : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4), tensor(float4e2m1)
Constrain output types. Casting to complex is not supported.
@@ -7514,9 +7568,9 @@ expect(node, inputs=[x], outputs=[y], name="test_depthtospace_example") #### Version -This version of the operator has been available since version 21 of the default ONNX operator set. +This version of the operator has been available since version 23 of the default ONNX operator set. -Other versions of this operator:
10, 13, 19 +Other versions of this operator: 10, 13, 19, 21 #### Attributes @@ -7548,7 +7602,7 @@ Other versions of this operator: 10 -
T1 : tensor(int8), tensor(uint8), tensor(int16), tensor(uint16), tensor(int32), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4)
+
T1 : tensor(int8), tensor(uint8), tensor(int16), tensor(uint16), tensor(int32), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4), tensor(float4e2m1)
The type of the inputs 'x_zero_point' and 'x'.
T2 : tensor(float), tensor(float16), tensor(bfloat16)
'x_scale' determines the output type.
@@ -7808,6 +7862,34 @@ expect( +
+float4e2m1 + +```python +node = onnx.helper.make_node( + "DequantizeLinear", + inputs=["x", "x_scale", "x_zero_point"], + outputs=["y"], + axis=0, +) + +# scalar zero point and scale +x = make_tensor("x", TensorProto.FLOAT4E2M1, [5], [0, 1, -1, 1.5, -4]) +x_scale = np.float32(2) +x_zero_point = make_tensor("x_zero_point", TensorProto.FLOAT4E2M1, (1,), [0]) +y = np.array([0, 2, -2, 3, -8], dtype=np.float32) + +expect( + node, + inputs=[x, x_scale, x_zero_point], + outputs=[y], + name="test_dequantizelinear_float4e2m1", +) +``` + +
+ +
int16 @@ -20466,9 +20548,9 @@ for quant_type_name in ["uint8", "int8"]: #### Version -This version of the operator has been available since version 21 of the default ONNX operator set. +This version of the operator has been available since version 23 of the default ONNX operator set. -Other versions of this operator: 10, 13, 19 +Other versions of this operator: 10, 13, 19, 21 #### Attributes @@ -20506,7 +20588,7 @@ Other versions of this operator: 10
T1 : tensor(float), tensor(float16), tensor(bfloat16), tensor(int32)
The type of the input 'x'.
-
T2 : tensor(int8), tensor(uint8), tensor(int16), tensor(uint16), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4)
+
T2 : tensor(int8), tensor(uint8), tensor(int16), tensor(uint16), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4), tensor(float4e2m1)
The type of the input `y_zero_point` and the output `y`.
@@ -20737,6 +20819,50 @@ expect(
+
+float4e2m1 + +```python +node = onnx.helper.make_node( + "QuantizeLinear", + inputs=["x", "y_scale", "y_zero_point"], + outputs=["y"], + axis=0, +) + +x = np.array( + [ + [0.0, 2.5, 4.8, 8.6], + [-30, -20, 6, 9], + [-0.0, -2.5, -4.8, -8.6], + ] +).astype(np.float32) + +y_scale = np.asarray([2.0, 3.0, 4.0], dtype=np.float32) +y_zero_point = make_tensor( + "y_zero_point", + TensorProto.FLOAT4E2M1, + y_scale.shape, + np.zeros_like(y_scale), +) +y = make_tensor( + "y", + TensorProto.FLOAT4E2M1, + x.shape, + [0, 1, 2, 4, -6, -6, 2, 3, 0, -0.5, -1, -2], +) + +expect( + node, + inputs=[x, y_scale, y_zero_point], + outputs=[y], + name="test_quantizelinear_float4e2m1", +) +``` + +
+ +
int16 diff --git a/docs/TestCoverage.md b/docs/TestCoverage.md index a6cf61752f1..a974fac2498 100644 --- a/docs/TestCoverage.md +++ b/docs/TestCoverage.md @@ -2350,6 +2350,10 @@ test_cases = [ ("INT4", "FLOAT"), ("INT4", "FLOAT16"), ("INT4", "INT8"), + ("FLOAT4E2M1", "FLOAT"), + ("FLOAT4E2M1", "FLOAT16"), + ("FLOAT", "FLOAT4E2M1"), + ("FLOAT16", "FLOAT4E2M1"), ] vect_float32_to_float8e4m3 = np.vectorize(float32_to_float8e4m3) @@ -2566,7 +2570,57 @@ for from_type, to_type in test_cases: output_type_proto = onnx.helper.make_tensor_type_proto( getattr(TensorProto, to_type), input_shape ) + elif from_type == "FLOAT4E2M1" or to_type == "FLOAT4E2M1": + np_fp32 = np.array( + [ + "0.48", + "0.25", + "1.05", + "-3.5", + "-8", + "9", + "1000000", + "1e-7", + "NaN", + "INF", + "+INF", + "-INF", + "-4", + "0.01", + "-0.0", + ], + dtype=np.float32, + ) + input_shape = (3, 5) + if from_type == "FLOAT": + input_values = np_fp32 + input = make_tensor( + "x", TensorProto.FLOAT, input_shape, input_values.tolist() + ) + elif from_type == "FLOAT16": + input_values = np_fp32.astype(np.float16).astype(np.float32) + input = make_tensor( + "x", TensorProto.FLOAT16, input_shape, input_values.tolist() + ) + elif from_type == "FLOAT4E2M1": + input = make_tensor( + "x", TensorProto.FLOAT4E2M1, input_shape, np_fp32.tolist() + ) + else: + raise ValueError( + f"Conversion from {from_type} to {to_type} is not tested." + ) + if to_type not in ("FLOAT", "FLOAT16", "FLOAT4E2M1"): + raise ValueError( + f"Conversion from {from_type} to {to_type} is not tested." + ) + expected = unpacked_float4e2m1_to_float32( + subbyte.float32_to_float4e2m1_unpacked(np_fp32) + ) + output = make_tensor( + "y", getattr(TensorProto, to_type), input_shape, expected.tolist() + ) elif from_type != "STRING": input = np.random.random_sample(shape).astype( helper.tensor_dtype_to_np_dtype(getattr(TensorProto, from_type)) @@ -5317,7 +5371,7 @@ expect(node, inputs=[x], outputs=[y], name="test_depthtospace_example") ### DequantizeLinear -There are 11 test cases, listed as following: +There are 12 test cases, listed as following:
axis @@ -5554,6 +5608,32 @@ expect( ) ``` +
+
+float4e2m1 + +```python +node = onnx.helper.make_node( + "DequantizeLinear", + inputs=["x", "x_scale", "x_zero_point"], + outputs=["y"], + axis=0, +) + +# scalar zero point and scale +x = make_tensor("x", TensorProto.FLOAT4E2M1, [5], [0, 1, -1, 1.5, -4]) +x_scale = np.float32(2) +x_zero_point = make_tensor("x_zero_point", TensorProto.FLOAT4E2M1, (1,), [0]) +y = np.array([0, 2, -2, 3, -8], dtype=np.float32) + +expect( + node, + inputs=[x, x_scale, x_zero_point], + outputs=[y], + name="test_dequantizelinear_float4e2m1", +) +``` +
int16 @@ -13937,7 +14017,7 @@ for quant_type_name in ["uint8", "int8"]: ### QuantizeLinear -There are 10 test cases, listed as following: +There are 11 test cases, listed as following:
axis @@ -14151,6 +14231,48 @@ expect( ) ``` +
+
+float4e2m1 + +```python +node = onnx.helper.make_node( + "QuantizeLinear", + inputs=["x", "y_scale", "y_zero_point"], + outputs=["y"], + axis=0, +) + +x = np.array( + [ + [0.0, 2.5, 4.8, 8.6], + [-30, -20, 6, 9], + [-0.0, -2.5, -4.8, -8.6], + ] +).astype(np.float32) + +y_scale = np.asarray([2.0, 3.0, 4.0], dtype=np.float32) +y_zero_point = make_tensor( + "y_zero_point", + TensorProto.FLOAT4E2M1, + y_scale.shape, + np.zeros_like(y_scale), +) +y = make_tensor( + "y", + TensorProto.FLOAT4E2M1, + x.shape, + [0, 1, 2, 4, -6, -6, 2, 3, 0, -0.5, -1, -2], +) + +expect( + node, + inputs=[x, y_scale, y_zero_point], + outputs=[y], + name="test_quantizelinear_float4e2m1", +) +``` +
int16 diff --git a/onnx/backend/test/case/node/cast.py b/onnx/backend/test/case/node/cast.py index efcc7ddf944..9696373920d 100644 --- a/onnx/backend/test/case/node/cast.py +++ b/onnx/backend/test/case/node/cast.py @@ -18,7 +18,11 @@ make_tensor, tensor_dtype_to_field, ) -from onnx.numpy_helper import float8e4m3_to_float32, float8e5m2_to_float32 +from onnx.numpy_helper import ( + float8e4m3_to_float32, + float8e5m2_to_float32, + unpacked_float4e2m1_to_float32, +) class Cast(Base): @@ -62,6 +66,10 @@ def export() -> None: ("INT4", "FLOAT"), ("INT4", "FLOAT16"), ("INT4", "INT8"), + ("FLOAT4E2M1", "FLOAT"), + ("FLOAT4E2M1", "FLOAT16"), + ("FLOAT", "FLOAT4E2M1"), + ("FLOAT16", "FLOAT4E2M1"), ] vect_float32_to_float8e4m3 = np.vectorize(float32_to_float8e4m3) @@ -278,7 +286,57 @@ def export() -> None: output_type_proto = onnx.helper.make_tensor_type_proto( getattr(TensorProto, to_type), input_shape ) + elif from_type == "FLOAT4E2M1" or to_type == "FLOAT4E2M1": + np_fp32 = np.array( + [ + "0.48", + "0.25", + "1.05", + "-3.5", + "-8", + "9", + "1000000", + "1e-7", + "NaN", + "INF", + "+INF", + "-INF", + "-4", + "0.01", + "-0.0", + ], + dtype=np.float32, + ) + input_shape = (3, 5) + if from_type == "FLOAT": + input_values = np_fp32 + input = make_tensor( + "x", TensorProto.FLOAT, input_shape, input_values.tolist() + ) + elif from_type == "FLOAT16": + input_values = np_fp32.astype(np.float16).astype(np.float32) + input = make_tensor( + "x", TensorProto.FLOAT16, input_shape, input_values.tolist() + ) + elif from_type == "FLOAT4E2M1": + input = make_tensor( + "x", TensorProto.FLOAT4E2M1, input_shape, np_fp32.tolist() + ) + else: + raise ValueError( + f"Conversion from {from_type} to {to_type} is not tested." + ) + if to_type not in ("FLOAT", "FLOAT16", "FLOAT4E2M1"): + raise ValueError( + f"Conversion from {from_type} to {to_type} is not tested." 
+ ) + expected = unpacked_float4e2m1_to_float32( + subbyte.float32_to_float4e2m1_unpacked(np_fp32) + ) + output = make_tensor( + "y", getattr(TensorProto, to_type), input_shape, expected.tolist() + ) elif from_type != "STRING": input = np.random.random_sample(shape).astype( helper.tensor_dtype_to_np_dtype(getattr(TensorProto, from_type)) diff --git a/onnx/backend/test/case/node/dequantizelinear.py b/onnx/backend/test/case/node/dequantizelinear.py index cdaf8bdf5b8..6d917449bb6 100644 --- a/onnx/backend/test/case/node/dequantizelinear.py +++ b/onnx/backend/test/case/node/dequantizelinear.py @@ -235,6 +235,28 @@ def export_int4() -> None: name="test_dequantizelinear_int4", ) + @staticmethod + def export_float4e2m1() -> None: + node = onnx.helper.make_node( + "DequantizeLinear", + inputs=["x", "x_scale", "x_zero_point"], + outputs=["y"], + axis=0, + ) + + # scalar zero point and scale + x = make_tensor("x", TensorProto.FLOAT4E2M1, [5], [0, 1, -1, 1.5, -4]) + x_scale = np.float32(2) + x_zero_point = make_tensor("x_zero_point", TensorProto.FLOAT4E2M1, (1,), [0]) + y = np.array([0, 2, -2, 3, -8], dtype=np.float32) + + expect( + node, + inputs=[x, x_scale, x_zero_point], + outputs=[y], + name="test_dequantizelinear_float4e2m1", + ) + @staticmethod def export_blocked() -> None: node = onnx.helper.make_node( diff --git a/onnx/backend/test/case/node/quantizelinear.py b/onnx/backend/test/case/node/quantizelinear.py index fbb2d0693cf..79cf1963c94 100644 --- a/onnx/backend/test/case/node/quantizelinear.py +++ b/onnx/backend/test/case/node/quantizelinear.py @@ -276,6 +276,44 @@ def export_int4() -> None: name="test_quantizelinear_int4", ) + @staticmethod + def export_float4e2m1() -> None: + node = onnx.helper.make_node( + "QuantizeLinear", + inputs=["x", "y_scale", "y_zero_point"], + outputs=["y"], + axis=0, + ) + + x = np.array( + [ + [0.0, 2.5, 4.8, 8.6], + [-30, -20, 6, 9], + [-0.0, -2.5, -4.8, -8.6], + ] + ).astype(np.float32) + + y_scale = np.asarray([2.0, 3.0, 4.0], dtype=np.float32) + y_zero_point = make_tensor( + "y_zero_point", + TensorProto.FLOAT4E2M1, + y_scale.shape, + np.zeros_like(y_scale), + ) + y = make_tensor( + "y", + TensorProto.FLOAT4E2M1, + x.shape, + [0, 1, 2, 4, -6, -6, 2, 3, 0, -0.5, -1, -2], + ) + + expect( + node, + inputs=[x, y_scale, y_zero_point], + outputs=[y], + name="test_quantizelinear_float4e2m1", + ) + @staticmethod def export_blocked_asymmetric() -> None: node = onnx.helper.make_node( diff --git a/onnx/backend/test/data/node/test_cast_BFLOAT16_to_FLOAT/model.onnx b/onnx/backend/test/data/node/test_cast_BFLOAT16_to_FLOAT/model.onnx index 7402af390d4..caebcbd248a 100644 Binary files a/onnx/backend/test/data/node/test_cast_BFLOAT16_to_FLOAT/model.onnx and b/onnx/backend/test/data/node/test_cast_BFLOAT16_to_FLOAT/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_DOUBLE_to_FLOAT/model.onnx b/onnx/backend/test/data/node/test_cast_DOUBLE_to_FLOAT/model.onnx index 3594e20970a..cda31f82bb5 100644 Binary files a/onnx/backend/test/data/node/test_cast_DOUBLE_to_FLOAT/model.onnx and b/onnx/backend/test/data/node/test_cast_DOUBLE_to_FLOAT/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_DOUBLE_to_FLOAT16/model.onnx b/onnx/backend/test/data/node/test_cast_DOUBLE_to_FLOAT16/model.onnx index db456b92e86..a6b9d6fb14c 100644 Binary files a/onnx/backend/test/data/node/test_cast_DOUBLE_to_FLOAT16/model.onnx and b/onnx/backend/test/data/node/test_cast_DOUBLE_to_FLOAT16/model.onnx differ diff --git 
a/onnx/backend/test/data/node/test_cast_FLOAT16_to_DOUBLE/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT16_to_DOUBLE/model.onnx index 358701489e2..f68f402c54f 100644 Binary files a/onnx/backend/test/data/node/test_cast_FLOAT16_to_DOUBLE/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT16_to_DOUBLE/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT/model.onnx index cebf493bff6..2da51dfbb59 100644 Binary files a/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT4E2M1/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT4E2M1/model.onnx new file mode 100644 index 00000000000..27d5957e2f9 Binary files /dev/null and b/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT4E2M1/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT4E2M1/test_data_set_0/input_0.pb b/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT4E2M1/test_data_set_0/input_0.pb new file mode 100644 index 00000000000..1a384c7934e --- /dev/null +++ b/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT4E2M1/test_data_set_0/input_0.pb @@ -0,0 +1,2 @@ + +*'ohxBBx \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT4E2M1/test_data_set_0/output_0.pb b/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT4E2M1/test_data_set_0/output_0.pb new file mode 100644 index 00000000000..8b4a410ccfe --- /dev/null +++ b/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT4E2M1/test_data_set_0/output_0.pb @@ -0,0 +1,2 @@ +* +wBy \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT8E4M3FN/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT8E4M3FN/model.onnx index 79c77d5e7a4..fbe9df364c7 100644 Binary files a/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT8E4M3FN/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT8E4M3FN/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT8E4M3FNUZ/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT8E4M3FNUZ/model.onnx index 4374cc235be..a964306416a 100644 Binary files a/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT8E4M3FNUZ/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT8E4M3FNUZ/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT8E5M2/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT8E5M2/model.onnx index 9c184eed0a7..302b275fa33 100644 Binary files a/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT8E5M2/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT8E5M2/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT8E5M2FNUZ/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT8E5M2FNUZ/model.onnx index 64a4c1cf671..ff45041d2e7 100644 Binary files a/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT8E5M2FNUZ/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT16_to_FLOAT8E5M2FNUZ/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT16_to_INT4/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT16_to_INT4/model.onnx index 5b007d53f17..8df7b40d83d 100644 Binary files 
a/onnx/backend/test/data/node/test_cast_FLOAT16_to_INT4/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT16_to_INT4/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT16_to_UINT4/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT16_to_UINT4/model.onnx index ea80055f7dd..8e03f69793c 100644 Binary files a/onnx/backend/test/data/node/test_cast_FLOAT16_to_UINT4/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT16_to_UINT4/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT4E2M1_to_FLOAT/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT4E2M1_to_FLOAT/model.onnx new file mode 100644 index 00000000000..4bdbba640e5 Binary files /dev/null and b/onnx/backend/test/data/node/test_cast_FLOAT4E2M1_to_FLOAT/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT4E2M1_to_FLOAT/test_data_set_0/input_0.pb b/onnx/backend/test/data/node/test_cast_FLOAT4E2M1_to_FLOAT/test_data_set_0/input_0.pb new file mode 100644 index 00000000000..e493e52316a --- /dev/null +++ b/onnx/backend/test/data/node/test_cast_FLOAT4E2M1_to_FLOAT/test_data_set_0/input_0.pb @@ -0,0 +1,2 @@ +* +wBx \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_cast_FLOAT4E2M1_to_FLOAT/test_data_set_0/output_0.pb b/onnx/backend/test/data/node/test_cast_FLOAT4E2M1_to_FLOAT/test_data_set_0/output_0.pb new file mode 100644 index 00000000000..25682984fd8 Binary files /dev/null and b/onnx/backend/test/data/node/test_cast_FLOAT4E2M1_to_FLOAT/test_data_set_0/output_0.pb differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT4E2M1_to_FLOAT16/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT4E2M1_to_FLOAT16/model.onnx new file mode 100644 index 00000000000..8e870de7662 Binary files /dev/null and b/onnx/backend/test/data/node/test_cast_FLOAT4E2M1_to_FLOAT16/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT4E2M1_to_FLOAT16/test_data_set_0/input_0.pb b/onnx/backend/test/data/node/test_cast_FLOAT4E2M1_to_FLOAT16/test_data_set_0/input_0.pb new file mode 100644 index 00000000000..e493e52316a --- /dev/null +++ b/onnx/backend/test/data/node/test_cast_FLOAT4E2M1_to_FLOAT16/test_data_set_0/input_0.pb @@ -0,0 +1,2 @@ +* +wBx \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_cast_FLOAT4E2M1_to_FLOAT16/test_data_set_0/output_0.pb b/onnx/backend/test/data/node/test_cast_FLOAT4E2M1_to_FLOAT16/test_data_set_0/output_0.pb new file mode 100644 index 00000000000..333fbe86675 Binary files /dev/null and b/onnx/backend/test/data/node/test_cast_FLOAT4E2M1_to_FLOAT16/test_data_set_0/output_0.pb differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT8E4M3FNUZ_to_FLOAT/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT8E4M3FNUZ_to_FLOAT/model.onnx index afc96a8f4fc..a2b1b31557d 100644 Binary files a/onnx/backend/test/data/node/test_cast_FLOAT8E4M3FNUZ_to_FLOAT/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT8E4M3FNUZ_to_FLOAT/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT8E4M3FNUZ_to_FLOAT16/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT8E4M3FNUZ_to_FLOAT16/model.onnx index 13d37732025..128b8b999d7 100644 Binary files a/onnx/backend/test/data/node/test_cast_FLOAT8E4M3FNUZ_to_FLOAT16/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT8E4M3FNUZ_to_FLOAT16/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT8E4M3FN_to_FLOAT/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT8E4M3FN_to_FLOAT/model.onnx index 
4e1c7b923b6..c043fc223b7 100644 Binary files a/onnx/backend/test/data/node/test_cast_FLOAT8E4M3FN_to_FLOAT/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT8E4M3FN_to_FLOAT/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT8E4M3FN_to_FLOAT16/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT8E4M3FN_to_FLOAT16/model.onnx index 4a465f2af59..c2eae564154 100644 Binary files a/onnx/backend/test/data/node/test_cast_FLOAT8E4M3FN_to_FLOAT16/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT8E4M3FN_to_FLOAT16/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT8E5M2FNUZ_to_FLOAT/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT8E5M2FNUZ_to_FLOAT/model.onnx index ee198dad01e..5103c888145 100644 Binary files a/onnx/backend/test/data/node/test_cast_FLOAT8E5M2FNUZ_to_FLOAT/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT8E5M2FNUZ_to_FLOAT/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT8E5M2FNUZ_to_FLOAT16/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT8E5M2FNUZ_to_FLOAT16/model.onnx index 302d3375996..e2120722408 100644 Binary files a/onnx/backend/test/data/node/test_cast_FLOAT8E5M2FNUZ_to_FLOAT16/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT8E5M2FNUZ_to_FLOAT16/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT8E5M2_to_FLOAT/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT8E5M2_to_FLOAT/model.onnx index a3e8e4ac337..2ef3c1b6853 100644 Binary files a/onnx/backend/test/data/node/test_cast_FLOAT8E5M2_to_FLOAT/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT8E5M2_to_FLOAT/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT8E5M2_to_FLOAT16/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT8E5M2_to_FLOAT16/model.onnx index f187fc0ed9d..e4ef5b789a7 100644 Binary files a/onnx/backend/test/data/node/test_cast_FLOAT8E5M2_to_FLOAT16/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT8E5M2_to_FLOAT16/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT_to_BFLOAT16/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT_to_BFLOAT16/model.onnx index 811b2d84113..f78f2462dab 100644 Binary files a/onnx/backend/test/data/node/test_cast_FLOAT_to_BFLOAT16/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT_to_BFLOAT16/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT_to_DOUBLE/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT_to_DOUBLE/model.onnx index dc7997cddd8..6d57123bfab 100644 Binary files a/onnx/backend/test/data/node/test_cast_FLOAT_to_DOUBLE/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT_to_DOUBLE/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT16/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT16/model.onnx index a80e603f0d1..bc0a082bfc0 100644 Binary files a/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT16/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT16/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT4E2M1/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT4E2M1/model.onnx new file mode 100644 index 00000000000..e3a3cd6cbaa Binary files /dev/null and b/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT4E2M1/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT4E2M1/test_data_set_0/input_0.pb 
b/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT4E2M1/test_data_set_0/input_0.pb new file mode 100644 index 00000000000..c30e7f3c452 Binary files /dev/null and b/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT4E2M1/test_data_set_0/input_0.pb differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT4E2M1/test_data_set_0/output_0.pb b/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT4E2M1/test_data_set_0/output_0.pb new file mode 100644 index 00000000000..8b4a410ccfe --- /dev/null +++ b/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT4E2M1/test_data_set_0/output_0.pb @@ -0,0 +1,2 @@ +* +wBy \ No newline at end of file diff --git a/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT8E4M3FN/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT8E4M3FN/model.onnx index 6612d1b0206..c85c7aad64c 100644 Binary files a/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT8E4M3FN/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT8E4M3FN/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT8E4M3FNUZ/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT8E4M3FNUZ/model.onnx index 294293e6154..caf8f5d34b7 100644 Binary files a/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT8E4M3FNUZ/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT8E4M3FNUZ/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT8E5M2/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT8E5M2/model.onnx index f92192a72be..19c3266cc41 100644 Binary files a/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT8E5M2/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT8E5M2/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT8E5M2FNUZ/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT8E5M2FNUZ/model.onnx index 1c7e8748bab..01f5ed6189c 100644 Binary files a/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT8E5M2FNUZ/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT_to_FLOAT8E5M2FNUZ/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT_to_INT4/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT_to_INT4/model.onnx index 9798921001e..0e2100cc86e 100644 Binary files a/onnx/backend/test/data/node/test_cast_FLOAT_to_INT4/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT_to_INT4/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT_to_STRING/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT_to_STRING/model.onnx index 04fe3fb35e3..0f7727e7bdc 100644 Binary files a/onnx/backend/test/data/node/test_cast_FLOAT_to_STRING/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT_to_STRING/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_FLOAT_to_UINT4/model.onnx b/onnx/backend/test/data/node/test_cast_FLOAT_to_UINT4/model.onnx index 432ba0a4bfc..d07449f667f 100644 Binary files a/onnx/backend/test/data/node/test_cast_FLOAT_to_UINT4/model.onnx and b/onnx/backend/test/data/node/test_cast_FLOAT_to_UINT4/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_INT4_to_FLOAT/model.onnx b/onnx/backend/test/data/node/test_cast_INT4_to_FLOAT/model.onnx index 8449c1737f0..e08ed20c2e3 100644 Binary files a/onnx/backend/test/data/node/test_cast_INT4_to_FLOAT/model.onnx and b/onnx/backend/test/data/node/test_cast_INT4_to_FLOAT/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_INT4_to_FLOAT16/model.onnx 
b/onnx/backend/test/data/node/test_cast_INT4_to_FLOAT16/model.onnx index f7f581c3d9a..542b8e90e2c 100644 Binary files a/onnx/backend/test/data/node/test_cast_INT4_to_FLOAT16/model.onnx and b/onnx/backend/test/data/node/test_cast_INT4_to_FLOAT16/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_INT4_to_INT8/model.onnx b/onnx/backend/test/data/node/test_cast_INT4_to_INT8/model.onnx index ecfafda26b3..0ce8722b60b 100644 Binary files a/onnx/backend/test/data/node/test_cast_INT4_to_INT8/model.onnx and b/onnx/backend/test/data/node/test_cast_INT4_to_INT8/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_STRING_to_FLOAT/model.onnx b/onnx/backend/test/data/node/test_cast_STRING_to_FLOAT/model.onnx index bc088fce757..ca49d476a9c 100644 Binary files a/onnx/backend/test/data/node/test_cast_STRING_to_FLOAT/model.onnx and b/onnx/backend/test/data/node/test_cast_STRING_to_FLOAT/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_UINT4_to_FLOAT/model.onnx b/onnx/backend/test/data/node/test_cast_UINT4_to_FLOAT/model.onnx index 991b21832ea..6ec793b8a4f 100644 Binary files a/onnx/backend/test/data/node/test_cast_UINT4_to_FLOAT/model.onnx and b/onnx/backend/test/data/node/test_cast_UINT4_to_FLOAT/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_UINT4_to_FLOAT16/model.onnx b/onnx/backend/test/data/node/test_cast_UINT4_to_FLOAT16/model.onnx index 9edb9126318..dd4f409b340 100644 Binary files a/onnx/backend/test/data/node/test_cast_UINT4_to_FLOAT16/model.onnx and b/onnx/backend/test/data/node/test_cast_UINT4_to_FLOAT16/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_UINT4_to_UINT8/model.onnx b/onnx/backend/test/data/node/test_cast_UINT4_to_UINT8/model.onnx index ff3edd591ce..f2d61ab673c 100644 Binary files a/onnx/backend/test/data/node/test_cast_UINT4_to_UINT8/model.onnx and b/onnx/backend/test/data/node/test_cast_UINT4_to_UINT8/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT16_to_FLOAT8E4M3FN/model.onnx b/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT16_to_FLOAT8E4M3FN/model.onnx index 5f6fa0874c3..ecf92ce8691 100644 Binary files a/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT16_to_FLOAT8E4M3FN/model.onnx and b/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT16_to_FLOAT8E4M3FN/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT16_to_FLOAT8E4M3FNUZ/model.onnx b/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT16_to_FLOAT8E4M3FNUZ/model.onnx index 6bddc0fce58..64b97499e91 100644 Binary files a/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT16_to_FLOAT8E4M3FNUZ/model.onnx and b/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT16_to_FLOAT8E4M3FNUZ/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT16_to_FLOAT8E5M2/model.onnx b/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT16_to_FLOAT8E5M2/model.onnx index 2de339e0b9f..48727e0ed1d 100644 Binary files a/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT16_to_FLOAT8E5M2/model.onnx and b/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT16_to_FLOAT8E5M2/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT16_to_FLOAT8E5M2FNUZ/model.onnx b/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT16_to_FLOAT8E5M2FNUZ/model.onnx index 6784a0cea84..b0d93efc40b 100644 Binary files a/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT16_to_FLOAT8E5M2FNUZ/model.onnx and 
b/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT16_to_FLOAT8E5M2FNUZ/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT_to_FLOAT8E4M3FN/model.onnx b/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT_to_FLOAT8E4M3FN/model.onnx index 85d6893319e..f3e57ee5189 100644 Binary files a/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT_to_FLOAT8E4M3FN/model.onnx and b/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT_to_FLOAT8E4M3FN/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT_to_FLOAT8E4M3FNUZ/model.onnx b/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT_to_FLOAT8E4M3FNUZ/model.onnx index 6a9040e8dd8..637e59627fc 100644 Binary files a/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT_to_FLOAT8E4M3FNUZ/model.onnx and b/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT_to_FLOAT8E4M3FNUZ/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT_to_FLOAT8E5M2/model.onnx b/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT_to_FLOAT8E5M2/model.onnx index bbb523b53c9..d5338daa8ee 100644 Binary files a/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT_to_FLOAT8E5M2/model.onnx and b/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT_to_FLOAT8E5M2/model.onnx differ diff --git a/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT_to_FLOAT8E5M2FNUZ/model.onnx b/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT_to_FLOAT8E5M2FNUZ/model.onnx index 2949e4fecf7..049585694ad 100644 Binary files a/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT_to_FLOAT8E5M2FNUZ/model.onnx and b/onnx/backend/test/data/node/test_cast_no_saturate_FLOAT_to_FLOAT8E5M2FNUZ/model.onnx differ diff --git a/onnx/backend/test/data/node/test_castlike_BFLOAT16_to_FLOAT/model.onnx b/onnx/backend/test/data/node/test_castlike_BFLOAT16_to_FLOAT/model.onnx index e317ab8b07c..5db6e3d281d 100644 Binary files a/onnx/backend/test/data/node/test_castlike_BFLOAT16_to_FLOAT/model.onnx and b/onnx/backend/test/data/node/test_castlike_BFLOAT16_to_FLOAT/model.onnx differ diff --git a/onnx/backend/test/data/node/test_castlike_BFLOAT16_to_FLOAT_expanded/model.onnx b/onnx/backend/test/data/node/test_castlike_BFLOAT16_to_FLOAT_expanded/model.onnx index ba46944dda9..33389f328d9 100644 Binary files a/onnx/backend/test/data/node/test_castlike_BFLOAT16_to_FLOAT_expanded/model.onnx and b/onnx/backend/test/data/node/test_castlike_BFLOAT16_to_FLOAT_expanded/model.onnx differ diff --git a/onnx/backend/test/data/node/test_castlike_DOUBLE_to_FLOAT/model.onnx b/onnx/backend/test/data/node/test_castlike_DOUBLE_to_FLOAT/model.onnx index c22a921f46f..013bdc6abf3 100644 Binary files a/onnx/backend/test/data/node/test_castlike_DOUBLE_to_FLOAT/model.onnx and b/onnx/backend/test/data/node/test_castlike_DOUBLE_to_FLOAT/model.onnx differ diff --git a/onnx/backend/test/data/node/test_castlike_DOUBLE_to_FLOAT16/model.onnx b/onnx/backend/test/data/node/test_castlike_DOUBLE_to_FLOAT16/model.onnx index 54497577cbe..205e430fe95 100644 Binary files a/onnx/backend/test/data/node/test_castlike_DOUBLE_to_FLOAT16/model.onnx and b/onnx/backend/test/data/node/test_castlike_DOUBLE_to_FLOAT16/model.onnx differ diff --git a/onnx/backend/test/data/node/test_castlike_DOUBLE_to_FLOAT16_expanded/model.onnx b/onnx/backend/test/data/node/test_castlike_DOUBLE_to_FLOAT16_expanded/model.onnx index 402f6db2b1a..124adb65d18 100644 Binary files a/onnx/backend/test/data/node/test_castlike_DOUBLE_to_FLOAT16_expanded/model.onnx 
and b/onnx/backend/test/data/node/test_castlike_DOUBLE_to_FLOAT16_expanded/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_DOUBLE_to_FLOAT_expanded/model.onnx b/onnx/backend/test/data/node/test_castlike_DOUBLE_to_FLOAT_expanded/model.onnx
index 59650e11234..5ff61d77bb4 100644
Binary files a/onnx/backend/test/data/node/test_castlike_DOUBLE_to_FLOAT_expanded/model.onnx and b/onnx/backend/test/data/node/test_castlike_DOUBLE_to_FLOAT_expanded/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT16_to_DOUBLE/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT16_to_DOUBLE/model.onnx
index d58627cf3b4..22cd04e8bc3 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT16_to_DOUBLE/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT16_to_DOUBLE/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT16_to_DOUBLE_expanded/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT16_to_DOUBLE_expanded/model.onnx
index a734e2346e6..97062b11ab6 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT16_to_DOUBLE_expanded/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT16_to_DOUBLE_expanded/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT16_to_FLOAT/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT16_to_FLOAT/model.onnx
index 88caac7e985..1f9cefac0fb 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT16_to_FLOAT/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT16_to_FLOAT/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT16_to_FLOAT_expanded/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT16_to_FLOAT_expanded/model.onnx
index b32abb1e770..43a0dfc6e33 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT16_to_FLOAT_expanded/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT16_to_FLOAT_expanded/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT8E4M3FNUZ_to_FLOAT/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT8E4M3FNUZ_to_FLOAT/model.onnx
index ce19c577d5c..5b6c1d318fc 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT8E4M3FNUZ_to_FLOAT/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT8E4M3FNUZ_to_FLOAT/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT8E4M3FNUZ_to_FLOAT_expanded/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT8E4M3FNUZ_to_FLOAT_expanded/model.onnx
index 41340017e82..84f29cb3c3a 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT8E4M3FNUZ_to_FLOAT_expanded/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT8E4M3FNUZ_to_FLOAT_expanded/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT8E4M3FN_to_FLOAT/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT8E4M3FN_to_FLOAT/model.onnx
index e7e454dcda0..4b7ce8654da 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT8E4M3FN_to_FLOAT/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT8E4M3FN_to_FLOAT/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT8E4M3FN_to_FLOAT_expanded/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT8E4M3FN_to_FLOAT_expanded/model.onnx
index df41bc33ed5..0168f57985d 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT8E4M3FN_to_FLOAT_expanded/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT8E4M3FN_to_FLOAT_expanded/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT8E5M2FNUZ_to_FLOAT/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT8E5M2FNUZ_to_FLOAT/model.onnx
index 8d28ca1ad49..2b4af760968 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT8E5M2FNUZ_to_FLOAT/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT8E5M2FNUZ_to_FLOAT/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT8E5M2FNUZ_to_FLOAT_expanded/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT8E5M2FNUZ_to_FLOAT_expanded/model.onnx
index e65bcf05676..c5cc28cfafa 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT8E5M2FNUZ_to_FLOAT_expanded/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT8E5M2FNUZ_to_FLOAT_expanded/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT8E5M2_to_FLOAT/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT8E5M2_to_FLOAT/model.onnx
index 5d3e9dc5a9b..bae9ae46c58 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT8E5M2_to_FLOAT/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT8E5M2_to_FLOAT/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT8E5M2_to_FLOAT_expanded/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT8E5M2_to_FLOAT_expanded/model.onnx
index 2c267d260cc..a93e3a08388 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT8E5M2_to_FLOAT_expanded/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT8E5M2_to_FLOAT_expanded/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT_to_BFLOAT16/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT_to_BFLOAT16/model.onnx
index ab692876e69..5055948da64 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT_to_BFLOAT16/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT_to_BFLOAT16/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT_to_BFLOAT16_expanded/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT_to_BFLOAT16_expanded/model.onnx
index 38fcc29ec90..25536e1eded 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT_to_BFLOAT16_expanded/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT_to_BFLOAT16_expanded/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT_to_DOUBLE/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT_to_DOUBLE/model.onnx
index 5621a9fd4f4..40051c68b35 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT_to_DOUBLE/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT_to_DOUBLE/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT_to_DOUBLE_expanded/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT_to_DOUBLE_expanded/model.onnx
index c87485cf41d..6cd67b2b959 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT_to_DOUBLE_expanded/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT_to_DOUBLE_expanded/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT16/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT16/model.onnx
index eac88033128..45a20c797a2 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT16/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT16/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT16_expanded/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT16_expanded/model.onnx
index 2d810d9dce2..e2ebfe340ad 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT16_expanded/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT16_expanded/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E4M3FN/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E4M3FN/model.onnx
index 63ceb525138..496d8c0aca0 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E4M3FN/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E4M3FN/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E4M3FNUZ/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E4M3FNUZ/model.onnx
index 931961e2a98..26c14094ff0 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E4M3FNUZ/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E4M3FNUZ/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E4M3FNUZ_expanded/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E4M3FNUZ_expanded/model.onnx
index 7cbdb218282..b1a09400222 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E4M3FNUZ_expanded/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E4M3FNUZ_expanded/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E4M3FN_expanded/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E4M3FN_expanded/model.onnx
index 1479a635e37..d6bc622a834 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E4M3FN_expanded/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E4M3FN_expanded/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E5M2/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E5M2/model.onnx
index f133decc26c..e8ad6f9b7c1 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E5M2/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E5M2/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E5M2FNUZ/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E5M2FNUZ/model.onnx
index 2b08ec039ec..967988ea83f 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E5M2FNUZ/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E5M2FNUZ/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E5M2FNUZ_expanded/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E5M2FNUZ_expanded/model.onnx
index 21caa48f0f7..5c42c2543ac 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E5M2FNUZ_expanded/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E5M2FNUZ_expanded/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E5M2_expanded/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E5M2_expanded/model.onnx
index ac95dc3eda3..7d95ccc230a 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E5M2_expanded/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT_to_FLOAT8E5M2_expanded/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT_to_STRING/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT_to_STRING/model.onnx
index 4d1f6be991d..4df795fc36c 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT_to_STRING/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT_to_STRING/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_FLOAT_to_STRING_expanded/model.onnx b/onnx/backend/test/data/node/test_castlike_FLOAT_to_STRING_expanded/model.onnx
index 7a0ae59b49e..8974ca55a77 100644
Binary files a/onnx/backend/test/data/node/test_castlike_FLOAT_to_STRING_expanded/model.onnx and b/onnx/backend/test/data/node/test_castlike_FLOAT_to_STRING_expanded/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_STRING_to_FLOAT/model.onnx b/onnx/backend/test/data/node/test_castlike_STRING_to_FLOAT/model.onnx
index d574b950c05..f3b7f6505c7 100644
Binary files a/onnx/backend/test/data/node/test_castlike_STRING_to_FLOAT/model.onnx and b/onnx/backend/test/data/node/test_castlike_STRING_to_FLOAT/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_castlike_STRING_to_FLOAT_expanded/model.onnx b/onnx/backend/test/data/node/test_castlike_STRING_to_FLOAT_expanded/model.onnx
index a2d99d3a8cd..dedd9577858 100644
Binary files a/onnx/backend/test/data/node/test_castlike_STRING_to_FLOAT_expanded/model.onnx and b/onnx/backend/test/data/node/test_castlike_STRING_to_FLOAT_expanded/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_dequantizelinear/model.onnx b/onnx/backend/test/data/node/test_dequantizelinear/model.onnx
index 290c586624e..2e4ee2da8b2 100644
Binary files a/onnx/backend/test/data/node/test_dequantizelinear/model.onnx and b/onnx/backend/test/data/node/test_dequantizelinear/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_dequantizelinear_axis/model.onnx b/onnx/backend/test/data/node/test_dequantizelinear_axis/model.onnx
index d595d79e2e8..228da4dfc39 100644
Binary files a/onnx/backend/test/data/node/test_dequantizelinear_axis/model.onnx and b/onnx/backend/test/data/node/test_dequantizelinear_axis/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_dequantizelinear_blocked/model.onnx b/onnx/backend/test/data/node/test_dequantizelinear_blocked/model.onnx
index a92e8bf607e..b334509aacf 100644
Binary files a/onnx/backend/test/data/node/test_dequantizelinear_blocked/model.onnx and b/onnx/backend/test/data/node/test_dequantizelinear_blocked/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_dequantizelinear_e4m3fn/model.onnx b/onnx/backend/test/data/node/test_dequantizelinear_e4m3fn/model.onnx
index 128b8d1747f..0a01ef513de 100644
Binary files a/onnx/backend/test/data/node/test_dequantizelinear_e4m3fn/model.onnx and b/onnx/backend/test/data/node/test_dequantizelinear_e4m3fn/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_dequantizelinear_e4m3fn_float16/model.onnx b/onnx/backend/test/data/node/test_dequantizelinear_e4m3fn_float16/model.onnx
index 0b6ce1fcb8e..67432330599 100644
Binary files a/onnx/backend/test/data/node/test_dequantizelinear_e4m3fn_float16/model.onnx and b/onnx/backend/test/data/node/test_dequantizelinear_e4m3fn_float16/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_dequantizelinear_e4m3fn_zero_point/model.onnx b/onnx/backend/test/data/node/test_dequantizelinear_e4m3fn_zero_point/model.onnx
index e92fa6e0e6a..88569cfdf59 100644
Binary files a/onnx/backend/test/data/node/test_dequantizelinear_e4m3fn_zero_point/model.onnx and b/onnx/backend/test/data/node/test_dequantizelinear_e4m3fn_zero_point/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_dequantizelinear_e5m2/model.onnx b/onnx/backend/test/data/node/test_dequantizelinear_e5m2/model.onnx
index b81cfaf5301..86a5b045edf 100644
Binary files a/onnx/backend/test/data/node/test_dequantizelinear_e5m2/model.onnx and b/onnx/backend/test/data/node/test_dequantizelinear_e5m2/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_dequantizelinear_float4e2m1/model.onnx b/onnx/backend/test/data/node/test_dequantizelinear_float4e2m1/model.onnx
new file mode 100644
index 00000000000..9957cc0c524
Binary files /dev/null and b/onnx/backend/test/data/node/test_dequantizelinear_float4e2m1/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_dequantizelinear_float4e2m1/test_data_set_0/input_0.pb b/onnx/backend/test/data/node/test_dequantizelinear_float4e2m1/test_data_set_0/input_0.pb
new file mode 100644
index 00000000000..41e46fec0bc
--- /dev/null
+++ b/onnx/backend/test/data/node/test_dequantizelinear_float4e2m1/test_data_set_0/input_0.pb
@@ -0,0 +1 @@
+* :Bx
\ No newline at end of file
diff --git a/onnx/backend/test/data/node/test_dequantizelinear_float4e2m1/test_data_set_0/input_1.pb b/onnx/backend/test/data/node/test_dequantizelinear_float4e2m1/test_data_set_0/input_1.pb
new file mode 100644
index 00000000000..d0d64800429
Binary files /dev/null and b/onnx/backend/test/data/node/test_dequantizelinear_float4e2m1/test_data_set_0/input_1.pb differ
diff --git a/onnx/backend/test/data/node/test_dequantizelinear_float4e2m1/test_data_set_0/input_2.pb b/onnx/backend/test/data/node/test_dequantizelinear_float4e2m1/test_data_set_0/input_2.pb
new file mode 100644
index 00000000000..e5a43e2f5c9
Binary files /dev/null and b/onnx/backend/test/data/node/test_dequantizelinear_float4e2m1/test_data_set_0/input_2.pb differ
diff --git a/onnx/backend/test/data/node/test_dequantizelinear_float4e2m1/test_data_set_0/output_0.pb b/onnx/backend/test/data/node/test_dequantizelinear_float4e2m1/test_data_set_0/output_0.pb
new file mode 100644
index 00000000000..df2e8bac210
Binary files /dev/null and b/onnx/backend/test/data/node/test_dequantizelinear_float4e2m1/test_data_set_0/output_0.pb differ
diff --git a/onnx/backend/test/data/node/test_dequantizelinear_int16/model.onnx b/onnx/backend/test/data/node/test_dequantizelinear_int16/model.onnx
index bc40cc7d0d6..3c2235fb0bc 100644
Binary files a/onnx/backend/test/data/node/test_dequantizelinear_int16/model.onnx and b/onnx/backend/test/data/node/test_dequantizelinear_int16/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_dequantizelinear_int4/model.onnx b/onnx/backend/test/data/node/test_dequantizelinear_int4/model.onnx
index 353ffe9f3fe..c2cc2215b99 100644
Binary files a/onnx/backend/test/data/node/test_dequantizelinear_int4/model.onnx and b/onnx/backend/test/data/node/test_dequantizelinear_int4/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_dequantizelinear_uint16/model.onnx b/onnx/backend/test/data/node/test_dequantizelinear_uint16/model.onnx
index f42ab9f657c..3c52ad78d10 100644
Binary files a/onnx/backend/test/data/node/test_dequantizelinear_uint16/model.onnx and b/onnx/backend/test/data/node/test_dequantizelinear_uint16/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_dequantizelinear_uint4/model.onnx b/onnx/backend/test/data/node/test_dequantizelinear_uint4/model.onnx
index 658cc4ecdd9..254a1c124b0 100644
Binary files a/onnx/backend/test/data/node/test_dequantizelinear_uint4/model.onnx and b/onnx/backend/test/data/node/test_dequantizelinear_uint4/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_quantizelinear/model.onnx b/onnx/backend/test/data/node/test_quantizelinear/model.onnx
index 5c25d26c940..ae860ca0f90 100644
Binary files a/onnx/backend/test/data/node/test_quantizelinear/model.onnx and b/onnx/backend/test/data/node/test_quantizelinear/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_quantizelinear_axis/model.onnx b/onnx/backend/test/data/node/test_quantizelinear_axis/model.onnx
index cba37b4c5b7..fd0419cff90 100644
Binary files a/onnx/backend/test/data/node/test_quantizelinear_axis/model.onnx and b/onnx/backend/test/data/node/test_quantizelinear_axis/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_quantizelinear_blocked_asymmetric/model.onnx b/onnx/backend/test/data/node/test_quantizelinear_blocked_asymmetric/model.onnx
index 8adafd0dcd2..287437d3972 100644
Binary files a/onnx/backend/test/data/node/test_quantizelinear_blocked_asymmetric/model.onnx and b/onnx/backend/test/data/node/test_quantizelinear_blocked_asymmetric/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_quantizelinear_blocked_symmetric/model.onnx b/onnx/backend/test/data/node/test_quantizelinear_blocked_symmetric/model.onnx
index 4afdf6cacfe..2121744ef2e 100644
Binary files a/onnx/backend/test/data/node/test_quantizelinear_blocked_symmetric/model.onnx and b/onnx/backend/test/data/node/test_quantizelinear_blocked_symmetric/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_quantizelinear_e4m3fn/model.onnx b/onnx/backend/test/data/node/test_quantizelinear_e4m3fn/model.onnx
index 448b6fea940..066c5b94b8a 100644
Binary files a/onnx/backend/test/data/node/test_quantizelinear_e4m3fn/model.onnx and b/onnx/backend/test/data/node/test_quantizelinear_e4m3fn/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_quantizelinear_e5m2/model.onnx b/onnx/backend/test/data/node/test_quantizelinear_e5m2/model.onnx
index 670ec3343e4..3b82f6a1299 100644
Binary files a/onnx/backend/test/data/node/test_quantizelinear_e5m2/model.onnx and b/onnx/backend/test/data/node/test_quantizelinear_e5m2/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_quantizelinear_float4e2m1/model.onnx b/onnx/backend/test/data/node/test_quantizelinear_float4e2m1/model.onnx
new file mode 100644
index 00000000000..dfecbdbf44f
Binary files /dev/null and b/onnx/backend/test/data/node/test_quantizelinear_float4e2m1/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_quantizelinear_float4e2m1/test_data_set_0/input_0.pb b/onnx/backend/test/data/node/test_quantizelinear_float4e2m1/test_data_set_0/input_0.pb
new file mode 100644
index 00000000000..29e2d756015
Binary files /dev/null and b/onnx/backend/test/data/node/test_quantizelinear_float4e2m1/test_data_set_0/input_0.pb differ
diff --git a/onnx/backend/test/data/node/test_quantizelinear_float4e2m1/test_data_set_0/input_1.pb b/onnx/backend/test/data/node/test_quantizelinear_float4e2m1/test_data_set_0/input_1.pb
new file mode 100644
index 00000000000..c4cda36fa82
Binary files /dev/null and b/onnx/backend/test/data/node/test_quantizelinear_float4e2m1/test_data_set_0/input_1.pb differ
diff --git a/onnx/backend/test/data/node/test_quantizelinear_float4e2m1/test_data_set_0/input_2.pb b/onnx/backend/test/data/node/test_quantizelinear_float4e2m1/test_data_set_0/input_2.pb
new file mode 100644
index 00000000000..dca288f7b12
Binary files /dev/null and b/onnx/backend/test/data/node/test_quantizelinear_float4e2m1/test_data_set_0/input_2.pb differ
diff --git a/onnx/backend/test/data/node/test_quantizelinear_float4e2m1/test_data_set_0/output_0.pb b/onnx/backend/test/data/node/test_quantizelinear_float4e2m1/test_data_set_0/output_0.pb
new file mode 100644
index 00000000000..ad7efc327c2
--- /dev/null
+++ b/onnx/backend/test/data/node/test_quantizelinear_float4e2m1/test_data_set_0/output_0.pb
@@ -0,0 +1 @@
+* dTBy
\ No newline at end of file
diff --git a/onnx/backend/test/data/node/test_quantizelinear_int16/model.onnx b/onnx/backend/test/data/node/test_quantizelinear_int16/model.onnx
index 640b17df488..4ed02fdfa7e 100644
Binary files a/onnx/backend/test/data/node/test_quantizelinear_int16/model.onnx and b/onnx/backend/test/data/node/test_quantizelinear_int16/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_quantizelinear_int4/model.onnx b/onnx/backend/test/data/node/test_quantizelinear_int4/model.onnx
index e55587153f5..e80a1915509 100644
Binary files a/onnx/backend/test/data/node/test_quantizelinear_int4/model.onnx and b/onnx/backend/test/data/node/test_quantizelinear_int4/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_quantizelinear_uint16/model.onnx b/onnx/backend/test/data/node/test_quantizelinear_uint16/model.onnx
index 2dd43b2f547..9c78767c029 100644
Binary files a/onnx/backend/test/data/node/test_quantizelinear_uint16/model.onnx and b/onnx/backend/test/data/node/test_quantizelinear_uint16/model.onnx differ
diff --git a/onnx/backend/test/data/node/test_quantizelinear_uint4/model.onnx b/onnx/backend/test/data/node/test_quantizelinear_uint4/model.onnx
index a2c0062f272..045c5f32f71 100644
Binary files a/onnx/backend/test/data/node/test_quantizelinear_uint4/model.onnx and b/onnx/backend/test/data/node/test_quantizelinear_uint4/model.onnx differ
diff --git a/onnx/defs/operator_sets.h b/onnx/defs/operator_sets.h
index de19cf1f32f..34aaf81598d 100644
--- a/onnx/defs/operator_sets.h
+++ b/onnx/defs/operator_sets.h
@@ -1291,11 +1291,18 @@ class OpSet_Onnx_ver22 {
 };
 
 // Iterate over schema from ai.onnx version 23
+class ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(Onnx, 23, Cast);
+class ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(Onnx, 23, CastLike);
+class ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(Onnx, 23, DequantizeLinear);
+class ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(Onnx, 23, QuantizeLinear);
+
 class OpSet_Onnx_ver23 {
  public:
   static void ForEachSchema(std::function<void(OpSchema&&)> fn) {
-    // TODO: Remove after introducing the first schema to opset 23
-    (void)fn;
+    fn(GetOpSchema<ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(Onnx, 23, Cast)>());
+    fn(GetOpSchema<ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(Onnx, 23, CastLike)>());
+    fn(GetOpSchema<ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(Onnx, 23, DequantizeLinear)>());
+    fn(GetOpSchema<ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(Onnx, 23, QuantizeLinear)>());
   }
 };
 
diff --git a/onnx/defs/quantization/defs.cc b/onnx/defs/quantization/defs.cc
index a68e076b3dd..e5938e7a4b2 100644
--- a/onnx/defs/quantization/defs.cc
+++ b/onnx/defs/quantization/defs.cc
@@ -37,7 +37,7 @@ In all cases, `y_zero_point` must have the same shape as `y_scale`.
 
 ONNX_OPERATOR_SET_SCHEMA(
     QuantizeLinear,
-    21,
+    23,
     OpSchema()
         .Input(0, "x", "N-D full precision Input tensor to be quantized.", "T1")
         .Input(
@@ -101,7 +101,8 @@ ONNX_OPERATOR_SET_SCHEMA(
             "tensor(float8e5m2)",
             "tensor(float8e5m2fnuz)",
             "tensor(uint4)",
-            "tensor(int4)"},
+            "tensor(int4)",
+            "tensor(float4e2m1)"},
             "The type of the input `y_zero_point` and the output `y`.")
         .SetDoc(QuantizeLinear_ver21_doc)
         .TypeAndShapeInferenceFunction([](ONNX_NAMESPACE::InferenceContext& ctx) {
@@ -147,7 +148,7 @@ for consistency, and `x_scale` still determines the output type.
 
 ONNX_OPERATOR_SET_SCHEMA(
     DequantizeLinear,
-    21,
+    23,
     OpSchema()
         .Input(0, "x", "N-D quantized input tensor to be de-quantized.", "T1")
         .Input(
@@ -192,7 +193,8 @@ ONNX_OPERATOR_SET_SCHEMA(
             "tensor(float8e5m2)",
             "tensor(float8e5m2fnuz)",
             "tensor(uint4)",
-            "tensor(int4)"},
+            "tensor(int4)",
+            "tensor(float4e2m1)"},
             "The type of the inputs 'x_zero_point' and 'x'.")
         .TypeConstraint(
             "T2",
diff --git a/onnx/defs/quantization/old.cc b/onnx/defs/quantization/old.cc
index 12be8ad03cb..a20c2da3eb8 100644
--- a/onnx/defs/quantization/old.cc
+++ b/onnx/defs/quantization/old.cc
@@ -7,6 +7,202 @@
 namespace ONNX_NAMESPACE {
 
+static const char* QuantizeLinear_ver21_doc = R"DOC(
+The linear quantization operator consumes a high-precision tensor, a scale, and a zero point to compute the
+low-precision/quantized tensor. The scale factor and zero point must have the same shape, determining the quantization
+granularity. The quantization formula is `y = saturate((x / y_scale) + y_zero_point)`.
+Saturation is done according to:
+- uint16: [0, 65535]
+- int16: [-32768, 32767]
+- uint8: [0, 255]
+- int8: [-128, 127]
+- uint4: [0, 15]
+- int4: [-8, 7]
+For `(x / y_scale)`, it rounds to the nearest even. Refer to https://en.wikipedia.org/wiki/Rounding for details.
+`y_zero_point` and `y` must have the same type. `y_zero_point` is usually not used for quantization to float8 types, but the quantization
+formula remains the same for consistency, and the type of the attribute `y_zero_point` still determines the quantization type.
+There are three supported quantization granularities, determined by the shape of `y_scale`.
+In all cases, `y_zero_point` must have the same shape as `y_scale`.
+- Per-tensor (per-layer) quantization: `y_scale` is a scalar.
+- Per-axis quantization: The scale must be a 1-D tensor, with the length of the quantization axis. For an input shape
+  `(D0, ..., Di, ..., Dn)` and `axis=i`, `y_scale` is a 1-D tensor of length `Di`.
+- Blocked quantization: The scale's shape is identical to the input's shape, except for one dimension, in which
+  blocking is performed. Given `x` shape `(D0, ..., Di, ..., Dn)`, `axis=i`, and block size `B`: `y_scale` shape is
+  `(D0, ..., ceil(Di/B), ..., Dn)`.
+)DOC";
+
+ONNX_OPERATOR_SET_SCHEMA(
+    QuantizeLinear,
+    21,
+    OpSchema()
+        .Input(0, "x", "N-D full precision Input tensor to be quantized.", "T1")
+        .Input(
+            1,
+            "y_scale",
+            "Scale for doing quantization to get `y`. For per-tensor/layer quantization the scale is a scalar, for "
+            "per-axis quantization it is a 1-D Tensor and for blocked quantization it has the same shape as the "
+            "input, except for one dimension in which blocking is performed.",
+            "T1")
+        .Input(
+            2,
+            "y_zero_point",
+            "Zero point for doing quantization to get `y`. Shape must match `y_scale`. "
+            "Default is uint8 with zero point of 0 if it's not specified.",
+            "T2",
+            OpSchema::Optional)
+        .Output(0, "y", "N-D quantized output tensor. It has same shape as input `x`.", "T2")
+        .Attr(
+            "axis",
+            "(Optional) The axis of the dequantizing dimension of the input tensor. Used only for per-axis and blocked "
+            "quantization. Negative value means counting dimensions from the back. Accepted range is `[-r, r-1]` "
+            "where `r = rank(input)`. When the rank of the input is 1, per-tensor quantization is applied, "
+            "rendering the axis unnecessary in this scenario.",
+            AttributeProto::INT,
+            static_cast<int64_t>(1))
+        .Attr(
+            "saturate",
+            "The parameter defines how the conversion behaves if an input value is out of "
+            "range of the destination type. It only applies for float 8 quantization "
+            "(float8e4m3fn, float8e4m3fnuz, float8e5m2, float8e5m2fnuz). It is true by default. "
+            "All cases are fully described in two tables inserted in the operator description.",
+            AttributeProto::INT,
+            static_cast<int64_t>(1))
+        .Attr(
+            "block_size",
+            "(Optional) The size of the quantization block (number of times every scale is replicated). Used only for "
+            "blocked quantization. The block size is a positive integer. Given `x` shape `(D0, ..., Di, ..., Dn)`, "
+            "`y_scale` shape `(S0, ... Si, ...Sn)` and `axis=i`, the accepted range is "
+            "`[ceil(Di/Si), ceil(Di/(Si-1))-1]`",
+            AttributeProto::INT,
+            static_cast<int64_t>(0))
+        .Attr(
+            "output_dtype",
+            "(Optional) The output data type. If not supplied, the output data type is inferred from `y_zero_point` data type (`T2`). "
+            "If neither `output_dtype` nor `y_zero_point` are supplied, output data type is uint8. "
+            "If both `output_dtype` and `y_zero_point` are specified, `output_dtype` must be `T2`.",
+            AttributeProto::INT,
+            static_cast<int64_t>(0))
+        .TypeConstraint(
+            "T1",
+            {"tensor(float)", "tensor(float16)", "tensor(bfloat16)", "tensor(int32)"},
+            "The type of the input 'x'.")
+        .TypeConstraint(
+            "T2",
+            {"tensor(int8)",
+             "tensor(uint8)",
+             "tensor(int16)",
+             "tensor(uint16)",
+             "tensor(float8e4m3fn)",
+             "tensor(float8e4m3fnuz)",
+             "tensor(float8e5m2)",
+             "tensor(float8e5m2fnuz)",
+             "tensor(uint4)",
+             "tensor(int4)"},
+            "The type of the input `y_zero_point` and the output `y`.")
+        .SetDoc(QuantizeLinear_ver21_doc)
+        .TypeAndShapeInferenceFunction([](ONNX_NAMESPACE::InferenceContext& ctx) {
+          auto const zp_type = ctx.hasInput(2) ? ctx.getInputType(2) : nullptr;
+          auto const output_dtype =
+              static_cast<TensorProto_DataType>(getAttribute(ctx, "output_dtype", TensorProto::UNDEFINED));
+          if (zp_type != nullptr) {
+            auto const zp_elem_type = static_cast<TensorProto_DataType>(getTensorElementType(*zp_type));
+            if (output_dtype != TensorProto::UNDEFINED && output_dtype != zp_elem_type) {
+              fail_type_inference(
+                  "output_dtype ",
+                  TensorProto_DataType_Name(output_dtype),
+                  " does not match y_zero_point type ",
+                  TensorProto_DataType_Name(zp_elem_type),
+                  ".");
+            }
+            propagateElemTypeFromInputToOutput(ctx, 2, 0);
+          } else if (output_dtype != TensorProto::UNDEFINED) {
+            propagateElemTypeFromAttributeToOutput(ctx, "output_dtype", 0);
+          } else {
+            updateOutputElemType(ctx, 0, TensorProto::UINT8);
+          }
+          if (!hasInputShape(ctx, 0)) {
+            return;
+          }
+
+          auto& input_shape = getInputShape(ctx, 0);
+          updateOutputShape(ctx, 0, input_shape);
+        }));
+
+static const char* DequantizeLinear_ver21_doc = R"DOC(
+The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the
+full-precision tensor. The dequantization formula is `y = (x - x_zero_point) * x_scale`. `x_scale` and `x_zero_point`
+must have the same shape, determining the quantization's granularity: a scalar for per-tensor/per-layer quantization,
+a 1-D tensor for per-axis quantization, or have a rank identical to the input for blocked quantization.
+See QuantizeLinear for details on quantization granularity.
+`x_zero_point` and `x` must have the same type. `x` and `y` must have the same shape. In the case of dequantizing
+`int32`, there's no zero point (zero point is supposed to be 0).
+`zero-point` is usually not used in the case of float8 types quantization, but the dequantization formula remains the same
+for consistency, and `x_scale` still determines the output type.
+)DOC";
+
+ONNX_OPERATOR_SET_SCHEMA(
+    DequantizeLinear,
+    21,
+    OpSchema()
+        .Input(0, "x", "N-D quantized input tensor to be de-quantized.", "T1")
+        .Input(
+            1,
+            "x_scale",
+            "Scale for input `x`. For per-tensor/layer dequantization the scale is a scalar, for "
+            "per-axis dequantization it is a 1-D Tensor and for blocked dequantization it has the same shape as "
+            "the input, except for one dimension in which blocking is performed.",
+            "T2")
+        .Input(
+            2,
+            "x_zero_point",
+            "Zero point for input `x`. Shape must match x_scale. "
+            "It's optional. Zero point is 0 when it's not specified.",
+            "T1",
+            OpSchema::Optional)
+        .Output(0, "y", "N-D full precision output tensor. It has same shape as input `x`.", "T2")
+        .Attr(
+            "axis",
+            "(Optional) The axis of the dequantizing dimension of the input tensor. Used for per-axis and blocked "
+            "quantization. Negative value means counting dimensions from the back. Accepted range is `[-r, r-1]` "
+            "where `r = rank(input)`.",
+            AttributeProto::INT,
+            static_cast<int64_t>(1))
+        .Attr(
+            "block_size",
+            "(Optional) The size of the quantization block (number of times every scale is replicated). Used only for "
+            "blocked quantization. The block size is a positive integer. Given `x` shape `(D0, ..., Di, ..., Dn)`, "
+            "`y_scale` shape `(S0, ... Si, ...Sn)` and `axis=i`, the accepted range is "
+            "`[ceil(Di/Si), ceil(Di/(Si-1))-1]`",
+            AttributeProto::INT,
+            static_cast<int64_t>(0))
+        .TypeConstraint(
+            "T1",
+            {"tensor(int8)",
+             "tensor(uint8)",
+             "tensor(int16)",
+             "tensor(uint16)",
+             "tensor(int32)",
+             "tensor(float8e4m3fn)",
+             "tensor(float8e4m3fnuz)",
+             "tensor(float8e5m2)",
+             "tensor(float8e5m2fnuz)",
+             "tensor(uint4)",
+             "tensor(int4)"},
+            "The type of the inputs 'x_zero_point' and 'x'.")
+        .TypeConstraint(
+            "T2",
+            {"tensor(float)", "tensor(float16)", "tensor(bfloat16)"},
+            "'x_scale' determines the output type.")
+        .SetDoc(DequantizeLinear_ver21_doc)
+        .TypeAndShapeInferenceFunction([](ONNX_NAMESPACE::InferenceContext& ctx) {
+          propagateElemTypeFromInputToOutput(ctx, 1, 0);
+          if (!hasInputShape(ctx, 0)) {
+            return;
+          }
+          auto& input_shape = getInputShape(ctx, 0);
+          updateOutputShape(ctx, 0, input_shape);
+        }));
+
 static const char* QuantizeLinear_ver19_doc = R"DOC(
 The linear quantization operator. It consumes a high precision tensor, a scale, and a zero point to compute the low precision / quantized tensor.
 The scale factor and zero point must have same shape, and can be either a scalar for per-tensor / per layer quantization, or a 1-D tensor for per-axis quantization.
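The `QuantizeLinear_ver21_doc` and `DequantizeLinear_ver21_doc` strings preserved above pin down the scalar formulas exactly, so they can be sanity-checked outside the ONNX runtime. Below is a minimal NumPy sketch of the per-tensor int8 case (an editorial illustration, not part of this patch; the helper names are ours, not the reference implementation's). `np.rint` supplies the round-half-to-even behavior the doc requires:

```python
# Per-tensor int8 QuantizeLinear/DequantizeLinear, following the formulas
# documented above. Hypothetical helper names, for illustration only.
import numpy as np

def quantize_linear_int8(x, y_scale, y_zero_point=0):
    # y = saturate(round(x / y_scale) + y_zero_point); np.rint rounds
    # halfway cases to the nearest even integer, as the doc specifies.
    y = np.rint(x / y_scale) + y_zero_point
    return np.clip(y, -128, 127).astype(np.int8)  # int8 saturation: [-128, 127]

def dequantize_linear_int8(x, x_scale, x_zero_point=0):
    # y = (x - x_zero_point) * x_scale, computed in float32
    return (x.astype(np.float32) - x_zero_point) * x_scale

x = np.array([-1.5, 0.0, 0.26, 3.0], dtype=np.float32)
q = quantize_linear_int8(x, y_scale=0.02, y_zero_point=1)
print(q)                                   # [-74   1  14 127]; 3.0 saturates
print(dequantize_linear_int8(q, 0.02, 1))  # [-1.5   0.    0.26  2.52]
```

Swapping the clip bounds for the ranges listed in the saturation table covers the other integer targets; only the float8 and float4 paths need a different rounding step.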
diff --git a/onnx/defs/tensor/defs.cc b/onnx/defs/tensor/defs.cc
index c27620db5ee..84f72fd907e 100644
--- a/onnx/defs/tensor/defs.cc
+++ b/onnx/defs/tensor/defs.cc
@@ -83,7 +83,7 @@ The rules then become:
 
 ONNX_OPERATOR_SET_SCHEMA(
     Cast,
-    21,
+    23,
     OpSchema()
         .SetDoc(Cast_ver19_doc)
         .Attr(
@@ -112,19 +112,11 @@ ONNX_OPERATOR_SET_SCHEMA(
             OpSchema::Differentiable)
         .TypeConstraint(
             "T1",
-            {"tensor(float16)", "tensor(float)", "tensor(double)", "tensor(int8)",
-             "tensor(int16)", "tensor(int32)", "tensor(int64)", "tensor(uint8)",
-             "tensor(uint16)", "tensor(uint32)", "tensor(uint64)", "tensor(bool)",
-             "tensor(string)", "tensor(bfloat16)", "tensor(float8e4m3fn)", "tensor(float8e4m3fnuz)",
-             "tensor(float8e5m2)", "tensor(float8e5m2fnuz)", "tensor(uint4)", "tensor(int4)"},
+            OpSchema::all_non_complex_tensor_types_ir11(),
             "Constrain input types. Casting from complex is not supported.")
         .TypeConstraint(
             "T2",
-            {"tensor(float16)", "tensor(float)", "tensor(double)", "tensor(int8)",
-             "tensor(int16)", "tensor(int32)", "tensor(int64)", "tensor(uint8)",
-             "tensor(uint16)", "tensor(uint32)", "tensor(uint64)", "tensor(bool)",
-             "tensor(string)", "tensor(bfloat16)", "tensor(float8e4m3fn)", "tensor(float8e4m3fnuz)",
-             "tensor(float8e5m2)", "tensor(float8e5m2fnuz)", "tensor(uint4)", "tensor(int4)"},
+            OpSchema::all_non_complex_tensor_types_ir11(),
             "Constrain output types. Casting to complex is not supported.")
         .TypeAndShapeInferenceFunction([](InferenceContext& ctx) {
           propagateElemTypeFromAttributeToOutput(ctx, "to", 0);
@@ -144,7 +136,7 @@ See documentation of the Cast operator for further details.
 
 ONNX_OPERATOR_SET_SCHEMA(
     CastLike,
-    21,
+    23,
     OpSchema()
         .SetDoc(CastLike_ver21_doc)
         .Attr(
@@ -176,11 +168,11 @@ ONNX_OPERATOR_SET_SCHEMA(
             OpSchema::Differentiable)
         .TypeConstraint(
             "T1",
-            OpSchema::all_non_complex_tensor_types_ir10(),
+            OpSchema::all_non_complex_tensor_types_ir11(),
            "Constrain input types. Casting from complex is not supported.")
         .TypeConstraint(
             "T2",
-            OpSchema::all_non_complex_tensor_types_ir10(),
+            OpSchema::all_non_complex_tensor_types_ir11(),
             "Constrain output types. Casting to complex is not supported.")
         .TypeAndShapeInferenceFunction([](InferenceContext& ctx) {
           propagateElemTypeFromInputToOutput(ctx, 1, 0);
diff --git a/onnx/defs/tensor/old.cc b/onnx/defs/tensor/old.cc
index 78d47ab0c3a..a7daf41a145 100644
--- a/onnx/defs/tensor/old.cc
+++ b/onnx/defs/tensor/old.cc
@@ -178,6 +178,61 @@ The rules then become:
 | else | RNE | RNE | RNE | RNE |
 )DOC";
 
+ONNX_OPERATOR_SET_SCHEMA(
+    Cast,
+    21,
+    OpSchema()
+        .SetDoc(Cast_ver19_doc)
+        .Attr(
+            "to",
+            "The data type to which the elements of the input tensor are cast. "
+            "Strictly must be one of the types from DataType enum in TensorProto",
+            AttributeProto::INT)
+        .Attr(
+            "saturate",
+            "The parameter defines how the conversion behaves if an input value is out of "
+            "range of the destination type. It only applies for float 8 conversion "
+            "(float8e4m3fn, float8e4m3fnuz, float8e5m2, float8e5m2fnuz). It is true by default. "
+            "All cases are fully described in two tables inserted in the operator description.",
+            AttributeProto::INT,
+            static_cast<int64_t>(1))
+        .Input(0, "input", "Input tensor to be cast.", "T1", OpSchema::Single, true, 1, OpSchema::Differentiable)
+        .Output(
+            0,
+            "output",
+            "Output tensor with the same shape as input with type "
+            "specified by the 'to' argument",
+            "T2",
+            OpSchema::Single,
+            true,
+            1,
+            OpSchema::Differentiable)
+        .TypeConstraint(
+            "T1",
+            {"tensor(float16)", "tensor(float)", "tensor(double)", "tensor(int8)",
+             "tensor(int16)", "tensor(int32)", "tensor(int64)", "tensor(uint8)",
+             "tensor(uint16)", "tensor(uint32)", "tensor(uint64)", "tensor(bool)",
+             "tensor(string)", "tensor(bfloat16)", "tensor(float8e4m3fn)", "tensor(float8e4m3fnuz)",
+             "tensor(float8e5m2)", "tensor(float8e5m2fnuz)", "tensor(uint4)", "tensor(int4)"},
+            "Constrain input types. Casting from complex is not supported.")
+        .TypeConstraint(
+            "T2",
+            {"tensor(float16)", "tensor(float)", "tensor(double)", "tensor(int8)",
+             "tensor(int16)", "tensor(int32)", "tensor(int64)", "tensor(uint8)",
+             "tensor(uint16)", "tensor(uint32)", "tensor(uint64)", "tensor(bool)",
+             "tensor(string)", "tensor(bfloat16)", "tensor(float8e4m3fn)", "tensor(float8e4m3fnuz)",
+             "tensor(float8e5m2)", "tensor(float8e5m2fnuz)", "tensor(uint4)", "tensor(int4)"},
+            "Constrain output types. Casting to complex is not supported.")
+        .TypeAndShapeInferenceFunction([](InferenceContext& ctx) {
+          propagateElemTypeFromAttributeToOutput(ctx, "to", 0);
+          if (hasNInputShapes(ctx, 1)) {
+            propagateShapeFromInputToOutput(ctx, 0, 0);
+          }
+        })
+        .PartialDataPropagationFunction([](DataPropagationContext& ctx) {
+          PropagateShapeDataFromInputToOutput(ctx, 0);
+        }));
+
 ONNX_OPERATOR_SET_SCHEMA(
     Cast,
     19,
@@ -362,6 +417,74 @@ ONNX_OPERATOR_SET_SCHEMA(
           PropagateShapeDataFromInputToOutput(ctx, 0);
         }));
 
+static const char* CastLike_ver21_doc = R"DOC(
+The operator casts the elements of a given input tensor (the first input) to
+the same data type as the elements of the second input tensor.
+See documentation of the Cast operator for further details.
+)DOC";
+
+ONNX_OPERATOR_SET_SCHEMA(
+    CastLike,
+    21,
+    OpSchema()
+        .SetDoc(CastLike_ver21_doc)
+        .Attr(
+            "saturate",
+            "The parameter defines how the conversion behaves if an input value is out of "
+            "range of the destination type. It only applies for float 8 conversion "
+            "(float8e4m3fn, float8e4m3fnuz, float8e5m2, float8e5m2fnuz). It is true by default. "
+            "Please refer to operator Cast description for further details.",
+            AttributeProto::INT,
+            static_cast<int64_t>(1))
+        .Input(0, "input", "Input tensor to be cast.", "T1", OpSchema::Single, true, 1, OpSchema::Differentiable)
+        .Input(
+            1,
+            "target_type",
+            "The (first) input tensor will be cast to produce a tensor of the same type as this (second input) tensor.",
+            "T2",
+            OpSchema::Single,
+            true,
+            1,
+            OpSchema::NonDifferentiable)
+        .Output(
+            0,
+            "output",
+            "Output tensor produced by casting the first input tensor to have the same type as the second input tensor.",
+            "T2",
+            OpSchema::Single,
+            true,
+            1,
+            OpSchema::Differentiable)
+        .TypeConstraint(
+            "T1",
+            OpSchema::all_non_complex_tensor_types_ir10(),
+            "Constrain input types. Casting from complex is not supported.")
+        .TypeConstraint(
+            "T2",
+            OpSchema::all_non_complex_tensor_types_ir10(),
+            "Constrain output types. Casting to complex is not supported.")
+        .TypeAndShapeInferenceFunction([](InferenceContext& ctx) {
+          propagateElemTypeFromInputToOutput(ctx, 1, 0);
+          if (hasNInputShapes(ctx, 1)) {
+            propagateShapeFromInputToOutput(ctx, 0, 0);
+          }
+        })
+        .SetContextDependentFunctionBodyBuilder(
+            [](const FunctionBodyBuildContext& ctx, const OpSchema& schema, FunctionProto& functionProto) -> bool {
+              auto target_type = ctx.getInputType(1);
+              if ((target_type == nullptr) || (!target_type->has_tensor_type())) {
+                // we cannot create a correct function body without knowing the target element type
+                return false;
+              }
+              auto target_elt_type = target_type->tensor_type().elem_type();
+              FunctionBuilder builder(functionProto);
+              builder.Add(
+                  MakeString("output = Cast <to = ", (int64_t)(target_elt_type), ", saturate: int = @saturate> (input)")
+                      .c_str());
+              schema.BuildFunction(functionProto);
+              return true;
+            }));
+
 static const char* CastLike_ver19_doc = R"DOC(
 The operator casts the elements of a given input tensor (the first input) to
 the same data type as the elements of the second input tensor.
diff --git a/onnx/reference/ops/op_cast_like.py b/onnx/reference/ops/op_cast_like.py
index 7604c51c32e..e8df4f988bc 100644
--- a/onnx/reference/ops/op_cast_like.py
+++ b/onnx/reference/ops/op_cast_like.py
@@ -9,6 +9,7 @@
 from onnx.reference.ops.op_cast import (
     bfloat16,
     cast_to,
+    float4e2m1,
     float8e4m3fn,
     float8e4m3fnuz,
     float8e5m2,
@@ -34,6 +35,8 @@ def _cast_like(x, y, saturate):
         to = TensorProto.UINT4
     elif y.dtype == int4 and y.dtype.descr[0][0] == "int4":
         to = TensorProto.INT4
+    elif y.dtype == float4e2m1 and y.dtype.descr[0][0] == "float4e2m1":
+        to = TensorProto.FLOAT4E2M1
     else:
         to = np_dtype_to_tensor_dtype(y.dtype)  # type: ignore
     return (cast_to(x, to, saturate),)
diff --git a/onnx/reference/ops/op_constant.py b/onnx/reference/ops/op_constant.py
index e31713b1eed..5586c0ffc88 100644
--- a/onnx/reference/ops/op_constant.py
+++ b/onnx/reference/ops/op_constant.py
@@ -7,6 +7,7 @@
 from onnx._custom_element_types import (
     bfloat16,
+    float4e2m1,
     float8e4m3fn,
     float8e4m3fnuz,
     float8e5m2,
@@ -27,6 +28,7 @@ def _check_dtype(val):  # type: ignore
         float8e5m2fnuz,
         uint4,
         int4,
+        float4e2m1,
         np.int8,
         np.uint8,
         np.float16,
diff --git a/onnx/reference/ops/op_dequantize_linear.py b/onnx/reference/ops/op_dequantize_linear.py
index 83920208776..d9d89d1f210 100644
--- a/onnx/reference/ops/op_dequantize_linear.py
+++ b/onnx/reference/ops/op_dequantize_linear.py
@@ -7,6 +7,7 @@
 from onnx import TensorProto
 from onnx._custom_element_types import (
+    float4e2m1,
     float8e4m3fn,
     float8e4m3fnuz,
     float8e5m2,
@@ -15,7 +16,11 @@
     uint4,
 )
 from onnx.helper import np_dtype_to_tensor_dtype
-from onnx.numpy_helper import float8e4m3_to_float32, float8e5m2_to_float32
+from onnx.numpy_helper import (
+    float8e4m3_to_float32,
+    float8e5m2_to_float32,
+    unpacked_float4e2m1_to_float32,
+)
 from onnx.reference.op_run import OpRun
 from onnx.reference.ops.op_quantize_linear import reshape_input
 
@@ -35,6 +40,8 @@ def get_x_type(self, x: np.ndarray) -> int:
             tensor_dtype = TensorProto.UINT4
         elif x.dtype == int4 and x.dtype.descr[0][0] == "int4":
             tensor_dtype = TensorProto.INT4
+        elif x.dtype == float4e2m1 and x.dtype.descr[0][0] == "float4e2m1":
+            tensor_dtype = TensorProto.FLOAT4E2M1
         else:
             tensor_dtype = np_dtype_to_tensor_dtype(x.dtype)
         return tensor_dtype
@@ -54,7 +61,11 @@ def _run(
             TensorProto.FLOAT8E5M2,
             TensorProto.FLOAT8E5M2FNUZ,
         }
-        if x_zero_point is not None and not fp8_type:
+        if (
+            x_zero_point is not None
+            and not fp8_type
+            and x_type != TensorProto.FLOAT4E2M1
+        ):
             zero_type = self.get_x_type(x_zero_point)
             if x_type != zero_type:
                 raise ValueError(
@@ -81,6 +92,8 @@ def _run(
             dx = float8e5m2_to_float32(x)
         elif x_type == TensorProto.FLOAT8E5M2FNUZ:
             dx = float8e5m2_to_float32(x, fn=True, uz=True)
+        elif x_type == TensorProto.FLOAT4E2M1:
+            dx = unpacked_float4e2m1_to_float32(x)
         else:
             dx = x.astype(np.float32)
         y = dx * reshape_input(x_scale, x.shape, axis, block_size)
diff --git a/onnx/reference/ops/op_quantize_linear.py b/onnx/reference/ops/op_quantize_linear.py
index ddc04c2a13e..d5f2fe7a63b 100644
--- a/onnx/reference/ops/op_quantize_linear.py
+++ b/onnx/reference/ops/op_quantize_linear.py
@@ -11,6 +11,7 @@
 from onnx import TensorProto, subbyte
 from onnx._custom_element_types import (
+    float4e2m1,
     float8e4m3fn,
     float8e4m3fnuz,
     float8e5m2,
@@ -97,6 +98,7 @@ class _CommonQuantizeLinear(OpRun):
         TensorProto.FLOAT8E4M3FNUZ,
         TensorProto.FLOAT8E5M2,
         TensorProto.FLOAT8E5M2FNUZ,
+        TensorProto.FLOAT4E2M1,
     )
 
     def get_zero_point_type(self, zero_point: np.ndarray) -> int:
@@ -122,6 +124,11 @@ def get_zero_point_type(self, zero_point: np.ndarray) -> int:
             zero_point_type = TensorProto.UINT4
         elif zero_point.dtype == int4 and zero_point.dtype.descr[0][0] == "int4":
             zero_point_type = TensorProto.INT4
+        elif (
+            zero_point.dtype == float4e2m1
+            and zero_point.dtype.descr[0][0] == "float4e2m1"
+        ):
+            zero_point_type = TensorProto.FLOAT4E2M1
         else:
             zero_point_type = np_dtype_to_tensor_dtype(zero_point.dtype)
         return zero_point_type
@@ -199,6 +206,11 @@ def _run(
                 i4 = func(xi)
                 return (i4,)  # type: ignore[attr-defined]
 
+            if tensor_type == TensorProto.FLOAT4E2M1:
+                x += zero_point
+                f4 = subbyte.float32_to_float4e2m1_unpacked(x)
+                return (f4,)  # type: ignore[attr-defined]
+
             raise ValueError(
                 f"Unexpected type: output_dtype={tensor_type} is not a supported quantized type."
             )
diff --git a/onnx/version_converter/convert.h b/onnx/version_converter/convert.h
index c3024ab7564..84b7c809ad0 100644
--- a/onnx/version_converter/convert.h
+++ b/onnx/version_converter/convert.h
@@ -776,6 +776,21 @@ class DefaultVersionConverter : public BaseVersionConverter {
     registerAdapter(std::make_unique<TypeRestriction>("GRU", OpSetID(22), OpSetID(21), bfloat16_not_allowed));
     registerAdapter(std::make_unique<TypeRestriction>("LSTM", OpSetID(22), OpSetID(21), bfloat16_not_allowed));
     registerAdapter(std::make_unique<TypeRestriction>("GridSample", OpSetID(22), OpSetID(21), bfloat16_not_allowed));
+
+    /******** 22 -> 23 ********/
+    registerAdapter(std::make_unique<CompatibleAdapter>("Cast", OpSetID(22), OpSetID(23)));
+    registerAdapter(std::make_unique<CompatibleAdapter>("CastLike", OpSetID(22), OpSetID(23)));
+    registerAdapter(std::make_unique<CompatibleAdapter>("DequantizeLinear", OpSetID(22), OpSetID(23)));
+    registerAdapter(std::make_unique<CompatibleAdapter>("QuantizeLinear", OpSetID(22), OpSetID(23)));
+
+    /******** 23 -> 22 ********/
+    const std::vector<TensorProto_DataType> ir11_types_not_in_ir10 = {TensorProto_DataType_FLOAT4E2M1};
+    registerAdapter(std::make_unique<TypeRestriction>("Cast", OpSetID(23), OpSetID(22), ir11_types_not_in_ir10));
+    registerAdapter(std::make_unique<TypeRestriction>("CastLike", OpSetID(23), OpSetID(22), ir11_types_not_in_ir10));
+    registerAdapter(
+        std::make_unique<TypeRestriction>("DequantizeLinear", OpSetID(23), OpSetID(22), ir11_types_not_in_ir10));
+    registerAdapter(
+        std::make_unique<TypeRestriction>("QuantizeLinear", OpSetID(23), OpSetID(22), ir11_types_not_in_ir10));
   }
 
   ModelProto convert_version(const ModelProto& mp_in, const OpSetID& initial_version, const OpSetID& target_version)
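The reference-op and version-converter changes above all revolve around the new `float4e2m1` element type, produced by `subbyte.float32_to_float4e2m1_unpacked` and consumed by `unpacked_float4e2m1_to_float32`. As rough intuition for the format, here is a hedged sketch of rounding onto the E2M1 value grid (editorial, not part of this patch; the helper name is ours). The eight magnitudes follow from 1 sign, 2 exponent, and 1 mantissa bit; tie-breaking below is simplified to "smaller magnitude wins", which can differ from the reference implementation's rounding at exact midpoints:

```python
# Nearest-value rounding onto the float4e2m1 grid; illustrative only.
import numpy as np

_F4E2M1 = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0], dtype=np.float32)

def round_to_float4e2m1(x):
    sign = np.where(x < 0, np.float32(-1), np.float32(1))
    mag = np.minimum(np.abs(x), _F4E2M1[-1])   # saturate at +/-6.0, the E2M1 max
    dist = np.abs(mag[..., None] - _F4E2M1)    # distance to each representable value
    return sign * _F4E2M1[np.argmin(dist, axis=-1)]

x = np.array([-7.0, -0.6, 0.2, 1.4, 2.4, 5.3], dtype=np.float32)
print(round_to_float4e2m1(x))  # [-6.  -0.5  0.   1.5  2.   6. ]
```

With only eight magnitudes and a hard ceiling of +/-6.0, nearly all dynamic range has to come from the scale tensor, which is why per-axis and blocked scale granularities are particularly relevant for this type.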