PyTorch w8a8 SalsaNext quantsim model
quic-bharathr released this on 12 Feb, 07:50
This release contains the optimized w8a8 checkpoint, its encodings, and the FP32 checkpoint for the PyTorch SalsaNext model.
For the w8a8 optimization:
- Batch norm folding, followed by AdaRound in per-channel mode, was applied to the original FP32 model.
- The percentile quantization scheme was used in per-channel mode for quantsim.
- The activation output of one operator is enabled with 16-bit width.
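
The steps above can be illustrated with a minimal pure-Python sketch. It is not the code used to produce these checkpoints and does not call any AIMET API. The function names (`fold_batch_norm`, `percentile_range`, `quantize_dequantize`, `per_channel_w8`), the default percentile of 99.9, and the asymmetric rounding grid are all assumptions chosen for illustration. AdaRound's learned rounding is not reproduced here; plain round-to-nearest stands in for it.

```python
import math


def fold_batch_norm(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold batch norm parameters into the preceding layer, per output channel.

    weight is a list of per-output-channel weight lists; all other
    arguments are per-channel lists of floats. (Hypothetical helper.)
    """
    folded_w, folded_b = [], []
    for w_c, b_c, g, bt, m, v in zip(weight, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)
        folded_w.append([w * scale for w in w_c])
        folded_b.append((b_c - m) * scale + bt)
    return folded_w, folded_b


def percentile_range(values, pct=99.9):
    """Return a (lo, hi) clip range covering the central pct percent of values."""
    ordered = sorted(values)
    n = len(ordered)
    tail = (100.0 - pct) / 2.0 / 100.0
    lo = ordered[int(math.floor(tail * (n - 1)))]
    hi = ordered[int(math.ceil((1.0 - tail) * (n - 1)))]
    return lo, hi


def quantize_dequantize(values, lo, hi, bitwidth=8):
    """Simulate quantization: snap values to a bitwidth-bit grid over [lo, hi]."""
    levels = 2 ** bitwidth - 1
    lo, hi = min(lo, 0.0), max(hi, 0.0)  # keep zero exactly representable
    scale = (hi - lo) / levels or 1.0    # fall back to 1.0 for a degenerate range
    offset = round(lo / scale)
    out = []
    for v in values:
        q = min(max(round(v / scale) - offset, 0), levels)  # clamp to the grid
        out.append((q + offset) * scale)
    return out


def per_channel_w8(weight, pct=99.9):
    """Apply 8-bit simulated quantization with one percentile range per channel."""
    result = []
    for w_c in weight:
        lo, hi = percentile_range(w_c, pct)
        result.append(quantize_dequantize(w_c, lo, hi, bitwidth=8))
    return result
```

Calling `quantize_dequantize(..., bitwidth=16)` on a single tensor mirrors the idea of giving one activation output a 16-bit width while the rest stay at 8 bits: the grid over the same range is finer, so the rounding error is smaller.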