layer_utils

Utilities for model_config export.

Some of the logic in this file is empirical and needs to be updated whenever exceptions occur.

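Functions

`build_attention_config`	Builds the attention config from the module.
`build_conv_config`	Builds the conv config for this module.
`build_decoder_config`	Builds the full decoder config from the module.
`build_embedding_config`	Builds the embedding config from the module.
`build_layernorm_config`	Builds the layernorm config from the module.
`build_linear_config`	Builds the linear config for the module.
`build_medusa_heads_config`	Builds a list of MedusaHeadConfig if Medusa heads exist.
`build_mlp_config`	Builds the MLP config for the module.
`build_moe_config`	Builds the MOE config for the module.
`build_qkv`	Converts the qkv modules to the config.
`build_recurrent_config`	Builds the recurrent config for this module.
`build_stacked_experts`	Builds the experts_weight_1 and experts_weight_2 configs for the experts.
`check_model_compatibility`	Returns whether the list of modules is compatible with the export logic.
`dup_kv_weight`	Repeats kv heads if tp_size > num_kv_heads.
`get_dtype`	Returns the default dtype of the model.
`get_experts_linear_names`	Returns the linear layer names based on the decoder type for MoE models.
`get_transformer_layers`	Returns the root module of the transformer model.
`is_attention`	Returns whether the module is an attention layer.
`is_decoder_list`	Returns whether the module is a decoder list.
`is_embedding`	Returns whether the module is an embedding layer.
`is_layernorm`	Returns whether the module is a layernorm layer.
`is_linear`	Returns whether the module is a linear layer.
`is_mlp`	Returns whether the module is an MLP layer.
`is_moe`	Returns whether the module is an MOE layer.
`is_quantlinear`	Returns whether the module is a quantized linear layer.
`is_recurrent`	Returns whether the module is a recurrent layer.
`update_experts_avg_prequant_scale`	Registers an experts_pre_quant_scale attribute on each expert, set to the average pre_quant_scale across experts.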

build_attention_config(module, model_metadata_config, dtype, ext_config=None, tp_size=1)

Builds the attention config from the module.

Parameters:

module (Module) –
dtype (dtype) –
ext_config (DecoderLayerConfig) –
tp_size (int) –

Return type:

AttentionConfig

build_conv_config(module, dtype)

Builds the conv config for this module.

Parameters:

module (Module) –
dtype (dtype) –

Return type:

ConvConfig

build_decoder_config(module, model_metadata_config, decoder_type, dtype, tp_size=1)

Builds the full decoder config from the module.

Parameters:

module (Module) –
decoder_type (str) –
dtype (dtype) –
tp_size (int) –

Return type:

DecoderLayerConfig

build_embedding_config(module, dtype, normalization_constant=1)

Builds the embedding config from the module.

Parameters:

module (Module) –
dtype (dtype) –
normalization_constant (float) –

Return type:

EmbeddingConfig

build_layernorm_config(module, dtype)

Builds the layernorm config from the module.

Parameters:

module (Module) –
dtype (dtype) –

Return type:

LayernormConfig

build_linear_config(module, linear_type, dtype)

Builds the linear config for the module.

Parameters:

module (Module) –
linear_type (str) –
dtype (dtype) –

Return type:

LinearConfig

build_medusa_heads_config(model, dtype)

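Builds a list of MedusaHeadConfig if Medusa heads exist.

Following TensorRT-LLM's Medusa implementation, all Medusa heads (num_medusa_heads) should be placed inside a 'torch.nn.ModuleList' with the attribute name 'medsua_heads'. A Medusa head consists of an additional 'lm_head' (vocab_size, hidden_size) and a list (num_medusa_layers) of Medusa layers (LinearActConfig). The only supported hidden_act for these layers is 'silu'. All linear layers are column-parallel.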

Parameters:

model (Module | None) –
dtype (dtype) –

Return type:

List[MedusaHeadConfig] | None

build_mlp_config(module, decoder_type, dtype, hidden_act=None)

Builds the MLP config for the module.

Parameters:

module (Module) –
dtype (dtype) –
hidden_act (str | None) –

Return type:

MLPConfig

build_moe_config(module, decoder_type, dtype)

Builds the MOE config for the module.

Parameters:

module (Module) –
dtype (dtype) –

Return type:

MOEConfig

build_qkv(qkv_modules, model_metadata_config, dtype, ext_config=None, tp_size=1)

Converts the qkv modules to the config.

Parameters:

qkv_modules (List[Module]) –
dtype (dtype) –
ext_config (DecoderLayerConfig) –
tp_size (int) –

Return type:

QKVConfig

build_recurrent_config(module, dtype)

Builds the recurrent config for this module.

Parameters:

module (Module) –
dtype (dtype) –

build_stacked_experts(experts, dtype, linear_names, num_experts, expert_getter)

Builds the experts_weight_1 and experts_weight_2 configs for the experts.

Parameters:

experts (Module) –
dtype (dtype) –
linear_names (List[str]) –

check_model_compatibility(module_list)

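Returns whether the list of modules is compatible with the export logic.

Also returns whether a positional embedding and an embedding layernorm exist.

The model is assumed to be assembled from one or two embedding layers, a ModuleList of transformer decoders, and a final layernorm with an optional embedding layernorm. Any other layout is not supported.

Parameters: module_list (List[Module]) –
Return type: Tuple[bool, bool, bool]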
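
The layout assumption above can be illustrated with a small, dependency-free sketch. It is not the library implementation: the function name `check_layout`, the string tags standing in for real modules, and the exact counting rule (a second embedding is treated as the positional embedding, a second layernorm as the embedding layernorm) are assumptions made for illustration.

```python
from typing import Iterable, Tuple


def check_layout(kinds: Iterable[str]) -> Tuple[bool, bool, bool]:
    """Hypothetical stand-in: each top-level module is reduced to a string tag."""
    kinds = list(kinds)
    n_embedding = kinds.count("embedding")
    n_decoder_list = kinds.count("decoder_list")
    n_layernorm = kinds.count("layernorm")

    # Assumed rule: 1-2 embeddings, exactly one decoder list, 1-2 layernorms.
    compatible = (
        1 <= n_embedding <= 2 and n_decoder_list == 1 and 1 <= n_layernorm <= 2
    )
    has_positional_embedding = n_embedding > 1
    has_embedding_layernorm = n_layernorm > 1
    return compatible, has_positional_embedding, has_embedding_layernorm


# A plain decoder-only layout: token embedding, decoder stack, final layernorm.
print(check_layout(["embedding", "decoder_list", "layernorm"]))
# (True, False, False)
```

The three booleans mirror the documented return type: overall compatibility first, then the two optional components.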

dup_kv_weight(v, head_size, num_head, tp_size)

Repeats kv heads if tp_size > num_kv_heads.

Parameters:

v (Tensor) –
head_size (int) –
num_head (int) –
tp_size (int) –

Return type:

Tensor
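
To make the head-repetition idea concrete, here is a minimal sketch that models the weight as a plain list of rows instead of a tensor. The helper name `repeat_kv_heads`, the row-major layout (head_size consecutive rows per head) and the requirement that tp_size be a multiple of num_head are assumptions for illustration; the actual `dup_kv_weight` operates on tensors and may differ in detail.

```python
from typing import List

Row = List[float]


def repeat_kv_heads(weight: List[Row], head_size: int, num_head: int, tp_size: int) -> List[Row]:
    """Hypothetical sketch: duplicate each kv head so every TP rank gets one."""
    if len(weight) != num_head * head_size:
        raise ValueError("weight must have num_head * head_size rows")
    if tp_size <= num_head:
        # Enough heads to shard already; return an unmodified copy.
        return [row[:] for row in weight]
    if tp_size % num_head != 0:
        raise ValueError("tp_size is assumed to be a multiple of num_head")

    reps = tp_size // num_head
    out: List[Row] = []
    for h in range(num_head):
        head_rows = weight[h * head_size : (h + 1) * head_size]
        # Copies of the same head are placed next to each other.
        for _ in range(reps):
            out.extend(row[:] for row in head_rows)
    return out


# Two kv heads of size 2, sharded across 4 ranks: each head appears twice.
w = [[1.0], [2.0], [3.0], [4.0]]
print(repeat_kv_heads(w, head_size=2, num_head=2, tp_size=4))
# [[1.0], [2.0], [1.0], [2.0], [3.0], [4.0], [3.0], [4.0]]
```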

get_dtype(model): 返回模型的默认数据类型。

get_experts_linear_names(model)

Returns the linear layer names based on the decoder type for MoE models.

Parameters: model (Module) –

get_transformer_layers(model)

Returns the root module of the transformer model.

Parameters: model (Module) –
Return type: List[Module]

is_attention(module)

Returns whether the module is an attention layer.

Parameters: module (Module) –
Return type: bool
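
As a rough illustration of how an empirical layer predicate of this kind could work, the sketch below matches keywords against the class name. Everything in it is hypothetical: the helper name `looks_like_attention`, the keyword list and the toy classes are assumptions, and the rules that `is_attention` actually applies are not documented here.

```python
ATTENTION_KEYWORDS = ("attention", "attn")  # assumed keywords, not from the library


def looks_like_attention(module: object) -> bool:
    """Hypothetical predicate: classify a layer by its class name."""
    name = type(module).__name__.lower()
    return any(keyword in name for keyword in ATTENTION_KEYWORDS)


class ToySelfAttention:
    """Stand-in for an attention layer."""


class ToyMLP:
    """Stand-in for a feed-forward layer."""


print(looks_like_attention(ToySelfAttention()))  # True
print(looks_like_attention(ToyMLP()))            # False
```

Name-based checks like this are easy to extend but brittle, which is one reason such logic tends to need updating as new model architectures appear.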

is_decoder_list(module)

Returns whether the module is a decoder list.

Parameters: module (Module) –
Return type: bool

is_embedding(module)

Returns whether the module is an embedding layer.

Parameters: module (Module) –
Return type: bool

is_layernorm(module)

Returns whether the module is a layernorm layer.

Parameters: module (Module) –
Return type: bool

is_linear(module)

Returns whether the module is a linear layer.

Parameters: module (Module) –
Return type: bool

is_mlp(module)

Returns whether the module is an MLP layer.

Parameters: module (Module) –
Return type: bool

is_moe(module)

Returns whether the module is an MOE layer.

Parameters: module (Module) –
Return type: bool

is_quantlinear(module)

Returns whether the module is a quantized linear layer.

Parameters: module (Module) –
Return type: bool

is_recurrent(module)

Returns whether the module is a recurrent layer.

Parameters: module (Module) –
Return type: bool

update_experts_avg_prequant_scale(experts)

Registers an experts_pre_quant_scale attribute on each expert, set to the average pre_quant_scale across experts.

Parameters: experts (Module) –
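
The averaging step can be sketched without any tensor library. The version below is an assumption-laden illustration, not the library code: `ToyExpert`, `average_pre_quant_scale` and the use of plain float lists for the scales are hypothetical, and the real function works on modules and where each expert's pre_quant_scale is stored is not specified here.

```python
from typing import List, Optional


class ToyExpert:
    """Hypothetical stand-in for an expert holding a per-channel pre_quant_scale."""

    def __init__(self, pre_quant_scale: List[float]) -> None:
        self.pre_quant_scale = list(pre_quant_scale)
        self.experts_pre_quant_scale: Optional[List[float]] = None


def average_pre_quant_scale(experts: List[ToyExpert]) -> List[float]:
    """Attach the element-wise mean of all experts' scales to every expert."""
    if not experts:
        raise ValueError("at least one expert is required")
    width = len(experts[0].pre_quant_scale)
    if any(len(e.pre_quant_scale) != width for e in experts):
        raise ValueError("all experts must have scales of the same length")

    avg = [
        sum(e.pre_quant_scale[i] for e in experts) / len(experts)
        for i in range(width)
    ]
    for e in experts:
        # Each expert gets its own copy of the shared average.
        e.experts_pre_quant_scale = list(avg)
    return avg


experts = [ToyExpert([1.0, 2.0]), ToyExpert([3.0, 6.0])]
print(average_pre_quant_scale(experts))  # [2.0, 4.0]
```

Giving every expert the same averaged scale lets the stacked expert weights share a single pre-quantization scale.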