Update documentation for 0.15.0 release
1 parent 822d7c6, commit 6de9560
Showing 277 changed files with 11,577 additions and 2,729 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
# Sphinx build info version 1 | ||
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done. | ||
config: 89ada319c94fcb1610b7f80d777e8b12 | ||
config: 0ea2334c76c1e774d577e20446a79224 | ||
tags: 645f666f9bcd5a90fca523b33c5a78b7 |
Binary files changed (contents not shown):

Status | Size change | New size vs. old | File |
---|---|---|---|
modified | +12.3 KB | 140% | .doctrees/reference/generated/modelopt.deploy.llm.generate.doctree |
removed | -51.3 KB | | .doctrees/reference/generated/modelopt.deploy.llm.model_config_trt.doctree |
modified | -6.38 KB | 93% | .doctrees/reference/generated/modelopt.onnx.op_types.doctree |
modified | +6.12 KB | 120% | .doctrees/reference/generated/modelopt.onnx.quantization.calib_utils.doctree |
modified | +3.82 KB | 120% | .doctrees/reference/generated/modelopt.onnx.quantization.doctree |
added | +5.71 KB | | .doctrees/reference/generated/modelopt.onnx.quantization.extensions.doctree |
modified | +23.1 KB | 130% | .doctrees/reference/generated/modelopt.onnx.quantization.graph_utils.doctree |
modified | -354 Bytes | 100% | .doctrees/reference/generated/modelopt.onnx.quantization.int4.doctree |
added | +30.8 KB | | .doctrees/reference/generated/modelopt.onnx.quantization.int8.doctree |
modified | +355 Bytes | 100% | .doctrees/reference/generated/modelopt.onnx.quantization.ort_patching.doctree |
modified | +12.7 KB | 220% | .doctrees/reference/generated/modelopt.onnx.quantization.ort_utils.doctree |
modified | +6 KB | 110% | .doctrees/reference/generated/modelopt.onnx.quantization.qdq_utils.doctree |
modified | +7.51 KB | 150% | .doctrees/reference/generated/modelopt.onnx.quantization.quant_utils.doctree |
modified | +3.8 KB | 110% | .doctrees/reference/generated/modelopt.onnx.quantization.quantize.doctree |
added | +26.6 KB | | .doctrees/reference/generated/modelopt.torch.distill.distillation.doctree |
added | +53.5 KB | | .doctrees/reference/generated/modelopt.torch.distill.distillation_model.doctree |
added | +38.8 KB | | .doctrees/reference/generated/modelopt.torch.distill.loss_balancers.doctree |
added | +5.82 KB | | .doctrees/reference/generated/modelopt.torch.distill.registry.doctree |
modified | -13.3 KB | 78% | .doctrees/reference/generated/modelopt.torch.export.distribute.doctree |
modified | +2.54 KB | 110% | .doctrees/reference/generated/modelopt.torch.export.doctree |
added | +5.75 KB | | .doctrees/reference/generated/modelopt.torch.export.hf_config_map.doctree |
modified | +34.6 KB | 120% | .doctrees/reference/generated/modelopt.torch.export.layer_utils.doctree |
modified | +218 KB | 160% | .doctrees/reference/generated/modelopt.torch.export.model_config.doctree |
modified | +10.1 KB | 120% | .doctrees/reference/generated/modelopt.torch.export.model_config_export.doctree |
modified | +2.94 KB | 110% | .doctrees/reference/generated/modelopt.torch.export.scaling_factor_utils.doctree |
modified | +13.6 KB | 160% | .doctrees/reference/generated/modelopt.torch.export.tensorrt_llm_utils.doctree |
modified | +4.8 KB | 110% | .doctrees/reference/generated/modelopt.torch.opt.hparam.doctree |
modified | +5.83 KB | 110% | .doctrees/reference/generated/modelopt.torch.opt.searcher.doctree |
modified | +4.29 KB | 120% | .doctrees/reference/generated/modelopt.torch.opt.utils.doctree |
added | +72.5 KB | | .doctrees/reference/generated/modelopt.torch.quantization.algorithms.doctree |
modified | +456 Bytes | 100% | .doctrees/reference/generated/modelopt.torch.quantization.calib.histogram.doctree |
modified | +456 Bytes | 100% | .doctrees/reference/generated/modelopt.torch.quantization.calib.max.doctree |
modified | +148 KB | 490% | .doctrees/reference/generated/modelopt.torch.quantization.config.doctree |
modified | +15.8 KB | 140% | .doctrees/reference/generated/modelopt.torch.quantization.conversion.doctree |
modified | +2.53 KB | 110% | .doctrees/reference/generated/modelopt.torch.quantization.doctree |
modified | +6.16 KB | 210% | .doctrees/reference/generated/modelopt.torch.quantization.extensions.doctree |
modified | +3.05 KB | 110% | .doctrees/reference/generated/modelopt.torch.quantization.model_calib.doctree |
modified | +42 KB | 190% | .doctrees/reference/generated/modelopt.torch.quantization.model_quant.doctree |
modified | +1.3 KB | 110% | .doctrees/reference/generated/modelopt.torch.quantization.nn.modules.doctree |
modified | +1.66 KB | 100% | .doctrees/reference/generated/modelopt.torch.quantization.nn.modules.quant_conv.doctree |
modified | +284 Bytes | 100% | .doctrees/reference/generated/modelopt.torch.quantization.nn.modules.quant_linear.doctree |
modified | +870 Bytes | 100% | .doctrees/reference/generated/modelopt.torch.quantization.nn.modules.quant_module.doctree |
added | +86 KB | | .doctrees/reference/generated/modelopt.torch.quantization.nn.modules.quant_rnn.doctree |
modified | +5.1 KB | 100% | ...trees/reference/generated/modelopt.torch.quantization.nn.modules.tensor_quantizer.doctree |
modified | +0 Bytes | 100% | .doctrees/reference/generated/modelopt.torch.quantization.plugins.doctree |
added | +36.4 KB | | .doctrees/reference/generated/modelopt.torch.quantization.qtensor.base_qtensor.doctree |
added | +11.1 KB | | .doctrees/reference/generated/modelopt.torch.quantization.qtensor.doctree |
added | +21.4 KB | | .doctrees/reference/generated/modelopt.torch.quantization.qtensor.int4_tensor.doctree |
added | +28.1 KB | | .doctrees/reference/generated/modelopt.torch.quantization.qtensor.nf4_tensor.doctree |
modified | -48.6 KB | 65% | .doctrees/reference/generated/modelopt.torch.quantization.tensor_quant.doctree |
modified | +1.57 KB | 100% | .doctrees/reference/generated/modelopt.torch.utils.dataset_utils.doctree |
modified | +7.07 KB | 120% | .doctrees/reference/generated/modelopt.torch.utils.distributed.doctree |
modified | +5.4 KB | 100% | .doctrees/reference/generated/modelopt.torch.utils.network.doctree |
@@ -0,0 +1 @@ | ||
.doctrees/environment.pickle filter=lfs diff=lfs merge=lfs -text |
@@ -1,5 +1,5 @@ | ||
All ModelOpt Examples | ||
===================== | ||
GitHub Examples | ||
=============== | ||
|
||
Please visit the `TensorRT-Model-Optimizer GitHub repository <https://github.com/NVIDIA/TensorRT-Model-Optimizer>`_ | ||
for all ModelOpt examples. | ||
All examples can be accessed from the ModelOpt GitHub repository at | ||
`github.com/NVIDIA/TensorRT-Model-Optimizer <https://github.com/NVIDIA/TensorRT-Model-Optimizer/>`_. |
@@ -0,0 +1,115 @@ | ||
|
||
========================= | ||
Quick Start: Distillation | ||
========================= | ||
|
||
ModelOpt's :doc:`Distillation <../guides/4_distillation>` is a set of wrappers and utilities | ||
for performing Knowledge Distillation between teacher and student models. | ||
Given a pretrained teacher model, Distillation can help train a smaller student model | ||
faster and/or to a higher accuracy than the student model could achieve on its own. | ||
|
||
This quick-start guide shows the necessary steps to integrate Distillation into your | ||
training pipeline. | ||
|
||
Set up your base models | ||
----------------------- | ||
|
||
First, obtain both a pretrained model to act as the teacher and a (usually smaller) model to serve | ||
as the student. | ||
|
||
.. code-block:: python | ||
    from torchvision.models import resnet50, resnet18 | ||
    # Define student | ||
    student_model = resnet18() | ||
    # Define callable which returns teacher | ||
    def teacher_factory(): | ||
        teacher_model = resnet50() | ||
        # pretrained_weights is a placeholder for a state dict loaded elsewhere | ||
        teacher_model.load_state_dict(pretrained_weights) | ||
        return teacher_model | ||
Set up the meta model | ||
--------------------- | ||
|
||
As Knowledge Distillation involves (at least) two models, ModelOpt simplifies the integration | ||
process by wrapping both student and teacher into one meta model. | ||
|
||
Please see an example Distillation setup below. This example assumes the outputs | ||
of ``teacher_model`` and ``student_model`` are logits. | ||
|
||
.. code-block:: python | ||
    import modelopt.torch.distill as mtd | ||
    distillation_config = { | ||
        "teacher_model": teacher_factory,  # model initializer | ||
        "criterion": mtd.LogitsDistillationLoss(),  # callable receiving student and teacher outputs, in order | ||
        "loss_balancer": mtd.StaticLossBalancer(),  # combines multiple losses; omit if only one distillation loss used | ||
    } | ||
    distillation_model = mtd.convert(student_model, mode=[("kd_loss", distillation_config)]) | ||
The ``teacher_model`` can be either a callable which returns an ``nn.Module`` or a tuple of ``(model_cls, args, kwargs)``. | ||
The ``criterion`` is the distillation loss used between student and teacher tensors. | ||
The ``loss_balancer`` determines how the original and distillation losses are combined (if needed). | ||
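
To illustrate the two accepted forms of ``teacher_model``, here is a minimal sketch of how such a specification could be resolved into a model instance. The helper ``resolve_teacher`` and the ``DummyTeacher`` class are hypothetical names used only for illustration; this is not ModelOpt's internal implementation.

```python
class DummyTeacher:
    """Hypothetical stand-in for an ``nn.Module`` teacher (illustration only)."""

    def __init__(self, width=64, pretrained=False):
        self.width = width
        self.pretrained = pretrained


def resolve_teacher(spec):
    """Build a teacher from a callable or a (model_cls, args, kwargs) tuple."""
    if isinstance(spec, tuple):
        # Tuple form: unpack the class plus its positional and keyword arguments.
        model_cls, args, kwargs = spec
        return model_cls(*args, **kwargs)
    if callable(spec):
        # Callable form: a zero-argument factory that returns the model.
        return spec()
    raise TypeError("teacher spec must be a callable or (model_cls, args, kwargs)")


# Both forms describe the same teacher.
from_factory = resolve_teacher(lambda: DummyTeacher(width=128, pretrained=True))
from_tuple = resolve_teacher((DummyTeacher, (128,), {"pretrained": True}))
```

Either form defers construction of the teacher until it is actually needed, which is likely why a factory is requested rather than an already-built instance.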
|
||
See :doc:`Distillation <../guides/4_distillation>` for more info. | ||
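
As a rough picture of what a logits-based ``criterion`` computes, the sketch below implements a common form of distillation loss in plain Python: the KL divergence between temperature-softened teacher and student distributions. The function names, the ``temperature`` parameter, and the ``temperature ** 2`` scaling are assumptions based on the usual Knowledge Distillation convention; ``mtd.LogitsDistillationLoss`` may differ in its details.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_sum_exp = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_sum_exp for s in scaled]


def logits_kd_loss(student_logits, teacher_logits, temperature=1.0):
    """KL(teacher || student) on softened distributions, for one sample."""
    log_p_student = log_softmax(student_logits, temperature)
    log_p_teacher = log_softmax(teacher_logits, temperature)
    kl = sum(
        math.exp(lt) * (lt - ls)
        for lt, ls in zip(log_p_teacher, log_p_student)
    )
    # Scaling by T^2 is a common convention (an assumption here) that keeps
    # gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2


# Student first, teacher second, matching the argument order noted above.
matching = logits_kd_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mismatched = logits_kd_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0], temperature=2.0)
```

The loss is zero when the student reproduces the teacher's logits and grows as the two distributions diverge.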
|
||
|
||
Distill during training | ||
----------------------- | ||
|
||
To distill from teacher to student, use the meta model in the usual training loop and call | ||
its ``.compute_kd_loss()`` method to combine the distillation loss with | ||
the original user loss. | ||
|
||
An example of Distillation training is given below: | ||
|
||
.. code-block:: python | ||
:emphasize-lines: 14 | ||
    # Set up the data loader. As an example: | ||
    train_loader = get_train_loader() | ||
    # Define the user loss function. As an example: | ||
    loss_fn = get_user_loss_fn() | ||
    for input, labels in train_loader: | ||
        distillation_model.zero_grad() | ||
        # Forward through the wrapped models | ||
        out = distillation_model(input) | ||
        # Same loss as originally present | ||
        loss = loss_fn(out, labels) | ||
        # Combine distillation and user losses | ||
        loss_total = distillation_model.compute_kd_loss(student_loss=loss) | ||
        loss_total.backward() | ||
        # Step your optimizer here, as in any training loop | ||
.. note:: | ||
    `DataParallel <https://pytorch.org/docs/stable/generated/torch.nn.DataParallel.html>`_ may | ||
    break ModelOpt's Distillation feature. | ||
    Note that `HuggingFace Trainer <https://huggingface.co/docs/transformers/en/main_classes/trainer>`_ | ||
    uses DataParallel by default when multiple GPUs are visible and training is not launched in distributed mode. | ||
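
To make the "combine distillation and user losses" step concrete, the sketch below shows one simple way a static balancer could merge the losses: a fixed weighted sum in which whatever weight is not assigned to distillation goes to the student loss. The function ``balance_losses`` and its weighting scheme are assumptions for illustration and are not confirmed to match ``mtd.StaticLossBalancer``.

```python
def balance_losses(student_loss, kd_losses, kd_weights):
    """Combine a student loss with distillation losses using fixed weights.

    Assumed scheme: each distillation loss gets its own weight, and the
    student loss receives the remaining weight (1 - sum of kd_weights).
    """
    if len(kd_losses) != len(kd_weights):
        raise ValueError("need exactly one weight per distillation loss")
    total_kd_weight = sum(kd_weights)
    if not 0.0 <= total_kd_weight <= 1.0:
        raise ValueError("distillation weights must sum to a value in [0, 1]")
    student_weight = 1.0 - total_kd_weight
    weighted_kd = sum(w * l for w, l in zip(kd_weights, kd_losses))
    return student_weight * student_loss + weighted_kd


# Equal weighting of the user loss (2.0) and a single distillation loss (4.0).
loss_total = balance_losses(student_loss=2.0, kd_losses=[4.0], kd_weights=[0.5])
```

With real tensors the same arithmetic applies elementwise to scalar losses, so the combined value can be passed directly to ``backward()``.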
|
||
|
||
Export trained model | ||
-------------------- | ||
|
||
The model can easily be reverted to its original class for further use (e.g., deployment) | ||
without any ModelOpt modifications attached. | ||
|
||
.. code-block:: python | ||
    model = mtd.export(distillation_model) | ||
-------------------------------- | ||
|
||
**Next steps** | ||
* Learn more about :doc:`Distillation <../guides/4_distillation>`. | ||
* See ModelOpt's :doc:`API documentation <../reference/1_modelopt_api>` for detailed | ||
functionality and usage information. |