TensorRT Model Optimizer

quant_batchnorm

Quantized batch normalization module.
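As a rough illustration of what a quantized batch normalization layer does, the sketch below fake-quantizes the input and then applies ordinary batch normalization. It is a conceptual example in plain Python and does not reproduce the modelopt implementation: the function names `fake_quantize`, `batch_norm`, and `quant_batch_norm`, the symmetric per-tensor scheme, and the `amax` calibration value are all assumptions made for illustration.

```python
from statistics import fmean


def fake_quantize(values, amax, num_bits=8):
    """Symmetric per-tensor fake quantization (hypothetical helper).

    Values are snapped to a signed integer grid and mapped straight back to
    floats, so the output carries the rounding and clipping error of
    quantization while staying in floating point.
    """
    if amax <= 0:
        raise ValueError("amax must be positive")
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    scale = amax / qmax
    quantized = []
    for v in values:
        q = round(v / scale)
        q = max(-qmax, min(qmax, q))  # clip to the representable range
        quantized.append(q * scale)   # dequantize
    return quantized


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Batch normalization over one channel, using batch statistics."""
    mean = fmean(values)
    var = fmean([(v - mean) ** 2 for v in values])  # biased variance
    inv_std = (var + eps) ** -0.5
    return [gamma * (v - mean) * inv_std + beta for v in values]


def quant_batch_norm(values, amax, num_bits=8, gamma=1.0, beta=0.0, eps=1e-5):
    """Assumed ordering: quantize the input first, then normalize it."""
    return batch_norm(fake_quantize(values, amax, num_bits), gamma, beta, eps)


if __name__ == "__main__":
    activations = [0.0, 0.5, -1.0, 2.0]
    # amax=1.0 means 2.0 is clipped to 1.0 before normalization.
    print(quant_batch_norm(activations, amax=1.0))
```

In a real workflow `amax` would come from a calibration pass over representative data rather than being chosen by hand, and the normalization would typically use running statistics at inference time.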

© Copyright 2023-2024, NVIDIA Corporation.