torch_tensorrt.fx

Functions

torch_tensorrt.fx.compile(module: Module, input, min_acc_module_size: int = 10, max_batch_size: int = 2048, max_workspace_size=33554432, explicit_batch_dimension=False, lower_precision=LowerPrecision.FP16, verbose_log=False, timing_cache_prefix='', save_timing_cache=False, cuda_graph_batch_size=-1, dynamic_batch=True, is_aten=False, use_experimental_fx_rt=False, correctness_atol=0.1, correctness_rtol=0.1) → Module

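Takes the original module, its input, and the lowering settings, then runs the lowering workflow to convert the module into a lowered module, the so-called TRTModule.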

Parameters

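module – Original module to be lowered.
input – Input for the module.
max_batch_size – Maximum batch size (must be >= 1 to be set; 0 means not set).
min_acc_module_size – Minimum number of nodes for an accelerated submodule.
max_workspace_size – Maximum size of the workspace given to TensorRT.
explicit_batch_dimension – If True, use an explicit batch dimension in TensorRT; otherwise use an implicit batch dimension.
lower_precision – The lower_precision config given to TRTModule.
verbose_log – If True, enable verbose logging for TensorRT.
timing_cache_prefix – Name of the timing cache file used by fx2trt.
save_timing_cache – If True, update the timing cache with the current timing cache data.
cuda_graph_batch_size – CUDA graph batch size; defaults to -1.
dynamic_batch – Whether the batch dimension (dim=0) is dynamic.
use_experimental_fx_rt – Use the next-generation TRTModule, which supports both Python and TorchScript based execution (including in C++).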

Returns

A torch.nn.Module lowered by TensorRT.
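As a rough illustration of the call documented above, the sketch below lowers a toy model. It makes several assumptions that are not stated in this reference: that `input` is passed as a list of CUDA tensors, that `LowerPrecision` can be imported from `torch_tensorrt.fx.utils`, and that a CUDA GPU with TensorRT is available. The model, the helper name `lower_example_model`, and the chosen argument values are hypothetical.

```python
# Keyword arguments mirror the compile() signature documented above.
# The values are illustrative choices, not recommendations.
lowering_kwargs = {
    "min_acc_module_size": 1,         # the toy model below has very few nodes
    "max_batch_size": 32,
    "max_workspace_size": 1 << 25,    # 33554432, the documented default
    "explicit_batch_dimension": True,
    "verbose_log": False,
    "dynamic_batch": True,
}


def lower_example_model():
    """Lower a toy model with torch_tensorrt.fx.compile (hypothetical helper).

    Requires torch, torch_tensorrt, TensorRT and a CUDA device.
    """
    # Imported here so that merely loading this file does not need a GPU stack.
    import torch
    import torch_tensorrt.fx
    from torch_tensorrt.fx.utils import LowerPrecision  # assumed import path

    model = torch.nn.Sequential(
        torch.nn.Linear(224, 64),
        torch.nn.ReLU(),
    ).cuda().eval()

    # Assumption: inputs are given as a list of tensors already on the GPU.
    inputs = [torch.randn(8, 224, device="cuda")]

    trt_module = torch_tensorrt.fx.compile(
        model,
        inputs,
        lower_precision=LowerPrecision.FP32,
        **lowering_kwargs,
    )
    # The returned object is a torch.nn.Module, so it is called like the original.
    return trt_module(*inputs)
```

Calling `lower_example_model()` on a suitable machine should return the lowered module's output for the sample batch. Parameters left out of `lowering_kwargs` keep the defaults shown in the signature.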

Classes

class torch_tensorrt.fx.TRTModule(engine=None, input_names=None, output_names=None, cuda_graph_batch_size=-1)

class torch_tensorrt.fx.InputTensorSpec(shape: Sequence[int], dtype: dtype, device: device = device(type='cpu'), shape_ranges: List[Tuple[Sequence[int], Sequence[int], Sequence[int]]] = [], has_batch_dim: bool = True)

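This class contains the information about an input tensor.

shape: Shape of the tensor.

dtype: Data type of the tensor.

device: Device of the tensor. This is only used to generate inputs to the given model in order to run shape propagation. For a TensorRT engine, inputs have to be on a CUDA device.
shape_ranges: If dynamic shape is needed (shape has dimensions of -1), this field has to be provided (the default is an empty list). Every shape_range is a tuple of three tuples: ((min_input_shape), (optimized_input_shape), (max_input_shape)). Each shape_range is used to populate a TensorRT optimization profile. For example, if the input shape varies from (1, 224) to (100, 224) and we want to optimize for (25, 224) because it is the most common input shape, then we set shape_ranges to [((1, 224), (25, 224), (100, 224))].
has_batch_dim: Whether the shape includes the batch dimension. The batch dimension has to be provided if the engine is to run with dynamic shape.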
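The relationship between `shape` and a `shape_range` entry can be sketched with a small consistency check. `check_shape_range` is a hypothetical helper and is not part of torch_tensorrt. It assumes that every dimension must satisfy min <= optimized <= max, and that a static dimension (anything other than -1) has the same value in all three shapes.

```python
from typing import Sequence, Tuple

ShapeRange = Tuple[Sequence[int], Sequence[int], Sequence[int]]


def check_shape_range(shape: Sequence[int], shape_range: ShapeRange) -> bool:
    """Return True if (min, opt, max) is consistent with `shape`.

    Hypothetical helper for illustration; not a torch_tensorrt API.
    """
    min_shape, opt_shape, max_shape = shape_range
    # All four shapes must have the same rank.
    if not (len(shape) == len(min_shape) == len(opt_shape) == len(max_shape)):
        return False
    for dim, lo, opt, hi in zip(shape, min_shape, opt_shape, max_shape):
        # The optimized size must lie between the minimum and the maximum.
        if not (lo <= opt <= hi):
            return False
        # Assumption: a static dimension keeps one value in all three shapes.
        if dim != -1 and not (lo == opt == hi == dim):
            return False
    return True


shape = (-1, 224)  # dynamic batch dimension, fixed feature dimension
good_range = ((1, 224), (25, 224), (100, 224))
bad_range = ((1, 224), (25, 225), (100, 224))  # 225 exceeds the maximum of 224

print(check_shape_range(shape, good_range))  # True
print(check_shape_range(shape, bad_range))   # False
```

Going by the constructor signature above, a range that passes this check would be supplied as `InputTensorSpec(shape=(-1, 224), dtype=torch.float32, shape_ranges=[good_range])`.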

class torch_tensorrt.fx.TRTInterpreter(module: GraphModule, input_specs: List[InputTensorSpec], explicit_batch_dimension: bool = False, explicit_precision: bool = False, logger_level=None)

class torch_tensorrt.fx.TRTInterpreterResult(engine, input_names, output_names, serialized_cache)