TensorRT Model Optimizer

gs_patching

Patches onnx_graphsurgeon to support explicitly setting a dtype.

Functions

patch_gs_modules

Dynamically patches the graphsurgeon modules.

patch_gs_modules(): Dynamically patches the graphsurgeon modules.

© Copyright 2023-2024, NVIDIA Corporation.
