nvfp4_tensor

Implements NVFP4 quantization for efficient tensor storage and computation.

NVFP4QTensor

Not implemented.

class NVFP4QTensor

Bases: BaseQuantizedTensor

Not implemented.

dequantize(dtype=torch.float16, **kwarg)

Not implemented.

Parameters:

dtype (dtype) –
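As a rough illustration of what dequantizing a block-scaled FP4 tensor involves, the sketch below unpacks 4-bit E2M1 codes and multiplies them by a per-block scale and a per-tensor scale. It is a pure-Python sketch and not this class's implementation. The names `unpack_codes` and `dequantize_sketch` are hypothetical, and the nibble order (first element in the low nibble) is an assumption.

```python
# Magnitudes representable by the 3 low bits of an FP4 E2M1 code.
E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def unpack_codes(packed):
    """Split each byte into two 4-bit codes (assumed order: low nibble first)."""
    codes = []
    for byte in packed:
        codes.append(byte & 0x0F)
        codes.append(byte >> 4)
    return codes


def dequantize_sketch(packed, block_scales, global_scale, block_size):
    """Hypothetical helper: rebuild floats from packed codes and two scale levels."""
    values = []
    for i, code in enumerate(unpack_codes(packed)):
        magnitude = E2M1_VALUES[code & 0b0111]     # low 3 bits select the magnitude
        sign = -1.0 if code & 0b1000 else 1.0      # bit 3 is the sign bit
        scale = block_scales[i // block_size] * global_scale
        values.append(sign * magnitude * scale)
    return values
```

With two blocks of two elements each, `dequantize_sketch(bytes([210, 112]), [1.0, 0.5], 2.0, 2)` decodes the codes 2, 13, 0, 7 and rescales them per block.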

classmethod get_activation_scaling_factor(quantizer)

Not implemented.

classmethod get_weights_scaling_factor(input, block_size, weights_scaling_factor_2=None, keep_high_precision=False)

Not implemented.

Parameters:
  • input (Tensor) –

  • block_size (int) –

  • weights_scaling_factor_2 (Tensor | None) –

  • keep_high_precision (bool) –
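To make the role of `block_size` and the second-level scale more concrete, here is a minimal pure-Python sketch of a per-block scale computation. It assumes a commonly used NVFP4 convention in which each block's absolute maximum is mapped to 6.0, the largest FP4 E2M1 magnitude. The page itself does not confirm this formula, and `block_scales` and `global_scale` are hypothetical names. Rounding of the block scale to FP8 is omitted for brevity.

```python
E2M1_MAX = 6.0  # largest magnitude representable in FP4 E2M1


def block_scales(values, block_size, global_scale):
    """Hypothetical helper: one scale per contiguous block of `block_size` values.

    Assumes global_scale > 0 (it plays the role of a per-tensor scale).
    """
    if len(values) % block_size != 0:
        raise ValueError("length must be a multiple of block_size")
    scales = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        block_amax = max(abs(v) for v in block)
        # Chosen so that block_amax / (scale * global_scale) == E2M1_MAX.
        scales.append(block_amax / (E2M1_MAX * global_scale))
    return scales
```

For example, `block_scales([1.0, -3.0, 0.5, 6.0], 2, 1.0)` yields one scale for the block `[1.0, -3.0]` and one for `[0.5, 6.0]`.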

classmethod get_weights_scaling_factor_2(input)

Not implemented.

Parameters:

input (Tensor) –
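A per-tensor (second-level) scale in block-scaled FP4 schemes is often derived from the tensor's absolute maximum. The sketch below assumes the convention `amax / (6 * 448)`, where 6 is the FP4 E2M1 maximum and 448 is the FP8 E4M3 maximum, so that the resulting per-block scales fit in FP8. This is an assumption about the general technique, not a statement of what this method does, and `per_tensor_scale` is a hypothetical name.

```python
E2M1_MAX = 6.0    # largest FP4 E2M1 magnitude
E4M3_MAX = 448.0  # largest finite FP8 E4M3 magnitude


def per_tensor_scale(values):
    """Hypothetical helper: a single scale derived from the global absolute maximum."""
    amax = max(abs(v) for v in values)
    # With this scale, block_amax / (E2M1_MAX * scale) never exceeds E4M3_MAX.
    return amax / (E2M1_MAX * E4M3_MAX)
```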

classmethod quantize(input, block_size, weights_scaling_factor=None, weights_scaling_factor_2=None, keep_high_precision=False)

Not implemented.

Parameters:
  • input (Tensor) –

  • block_size (int) –

  • weights_scaling_factor (Tensor | None) –

  • weights_scaling_factor_2 (Tensor | None) –

  • keep_high_precision (bool) –
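The core step of an FP4 quantizer can be sketched in pure Python: divide each value by its effective scale, snap it to the nearest E2M1 magnitude, and pack two 4-bit codes per byte. This is an illustrative sketch, not this method's implementation. The helper names are hypothetical, `scale` stands for the product of a block scale and a per-tensor scale, ties round toward the smaller magnitude for simplicity, and the packing order (first code in the low nibble) is an assumption.

```python
# Magnitudes for E2M1 codes 0..7; bit 3 of the code carries the sign.
E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def to_e2m1_code(x):
    """Map a scaled value to a 4-bit code: sign bit plus nearest magnitude index."""
    sign = 0b1000 if x < 0 else 0
    magnitude = min(abs(x), E2M1_VALUES[-1])  # saturate at 6.0
    # min() returns the first best index, so ties go to the smaller magnitude.
    index = min(range(8), key=lambda i: abs(E2M1_VALUES[i] - magnitude))
    return sign | index


def quantize_block(block, scale):
    """Hypothetical helper: quantize one block with a single effective scale."""
    if scale == 0:
        return [0] * len(block)  # an all-zero block has nothing to encode
    return [to_e2m1_code(v / scale) for v in block]


def pack_codes(codes):
    """Pack pairs of 4-bit codes into bytes (assumed order: low nibble first)."""
    if len(codes) % 2 != 0:
        raise ValueError("need an even number of codes")
    return bytes(codes[i] | (codes[i + 1] << 4) for i in range(0, len(codes), 2))
```

For instance, `quantize_block([1.0, -3.0, 0.2, 12.0], 1.0)` snaps 0.2 down to 0.0 and saturates 12.0 at 6.0 before the codes are packed with `pack_codes`.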

classmethod resmooth_weights_and_get_scales(merged_weights, pre_quant_scales, ranks, group_size, avg_pre_quant_scale=None)

Not implemented.

Parameters:
  • merged_weights (Tensor) –

  • pre_quant_scales (List[Tensor]) –

  • ranks (int) –

  • group_size (int) –

  • avg_pre_quant_scale (Tensor) –