Issues: NVIDIA/TensorRT-Model-Optimizer
- #144: sm_100 not defined for option gpu-name when running calibration in DeepSeek (opened Mar 4, 2025 by imenselmi)
- #138: Restore functionality: lm_head option to disable quantization (opened Feb 20, 2025 by michaelfeil)
- #133: More modes for model opt quantization than halving the batch size (opened Feb 18, 2025 by michaelfeil)
- #120: Quantization Benchmark on different model architectures -- particularly MHA (opened Jan 3, 2025 by YixuanSeanZhou)
- #117: [ONNX][PTQ] Quantization failed with --dq_only flag in ConvTranspose (opened Dec 19, 2024 by ry3s)
- #114: What is difference of torch.quantization and onnx.quantization for speed and accuracy ? (opened Dec 11, 2024 by demuxin)
- #113: Is there a plan to support more recent PTQ methods for INT8 ViT? (opened Dec 10, 2024 by dedoogong)