Issues: NVIDIA/TensorRT-Model-Optimizer
- #144: sm_100 not defined for option gpu-name when running calibration in DeepSeek (opened Mar 4, 2025 by imenselmi)
- #138: Restore functionality: lm_head option to disable quantization (opened Feb 20, 2025 by michaelfeil)
- #133: More modes for model opt quantization than halving the batch size (opened Feb 18, 2025 by michaelfeil)
- #120: Quantization Benchmark on different model architectures -- particularly MHA (opened Jan 3, 2025 by YixuanSeanZhou)
- #117: [ONNX][PTQ] Quantization failed with --dq_only flag in ConvTranspose (opened Dec 19, 2024 by ry3s)
- #114: What is difference of torch.quantization and onnx.quantization for speed and accuracy ? (opened Dec 11, 2024 by demuxin)
- #113: Is there a plan to support more recent PTQ methods for INT8 ViT? (opened Dec 10, 2024 by dedoogong)