
INT8 accuracy drops to zero with explicit quantization #126

Open
sriram487 opened this issue Feb 6, 2025 · 0 comments
I am currently using the ONNX_PTQ pipeline to generate an INT8-quantized ONNX model, and the quantized ONNX model performs well. However, when I convert it into a TensorRT .engine file, the accuracy drops to zero. I've tried both the --best and --int8 flags, but the results are the same in both cases.

Originally posted by @sriram487 in #5
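
For context, converting a Q/DQ (explicitly quantized) ONNX model into an engine is typically done either with trtexec (e.g. `trtexec --onnx=model_quantized.onnx --int8 --saveEngine=model.engine`, which matches the flags mentioned above) or through the TensorRT builder API. Below is a minimal sketch of the builder-API path; the file names are placeholders, not taken from this issue:

```python
import tensorrt as trt

# Placeholder paths; the actual model files are not included in this issue.
ONNX_PATH = "model_quantized.onnx"
ENGINE_PATH = "model_quantized.engine"

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)

flags = 0
try:
    # TensorRT < 10 requires the EXPLICIT_BATCH flag to parse ONNX models.
    flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
except AttributeError:
    pass  # TensorRT >= 10: networks are always explicit-batch.
network = builder.create_network(flags)

parser = trt.OnnxParser(network, logger)
with open(ONNX_PATH, "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("Failed to parse the quantized ONNX model")

config = builder.create_builder_config()
# With explicit quantization, the Q/DQ nodes in the ONNX graph already
# carry the scales, so no calibrator is attached; the INT8 flag only
# allows the builder to select INT8 kernels for the Q/DQ-wrapped layers.
config.set_flag(trt.BuilderFlag.INT8)

engine_bytes = builder.build_serialized_network(network, config)
if engine_bytes is None:
    raise RuntimeError("Engine build failed")
with open(ENGINE_PATH, "wb") as f:
    f.write(engine_bytes)
```

Note that in explicit-quantization mode no INT8 calibrator is set: the scales come entirely from the Q/DQ nodes embedded in the ONNX graph, which is why `--int8` and `--best` behave the same here.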
