torch_frame.nn.models.ExcelFormer

class ExcelFormer(in_channels: int, out_channels: int, num_cols: int, num_layers: int, num_heads: int, col_stats: dict[str, dict[StatType, Any]], col_names_dict: dict[torch_frame.stype, list[str]], stype_encoder_dict: dict[torch_frame.stype, StypeEncoder] | None = None, diam_dropout: float = 0.0, aium_dropout: float = 0.0, residual_dropout: float = 0.0, mixup: str | None = None, beta: float = 0.5)

Bases: Module

The ExcelFormer model introduced in the “ExcelFormer: A Neural Network Surpassing GBDTs on Tabular Data” paper.

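ExcelFormer first converts categorical features into numerical features with a target statistics encoder (i.e., CatBoostEncoder in the paper). It then sorts the numerical features with mutual information sort. As a result, the model itself accepts only numerical features.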
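The two preprocessing steps described above can be sketched in plain Python. This is a simplified illustration, not the library's implementation: the helper names `target_encode` and `mutual_information` are hypothetical, the smoothing formula (one pseudo-observation of the global target mean) and the equal-width binning are assumptions, and sorting in descending order of mutual information is likewise assumed. Encoders such as CatBoostEncoder typically also avoid using a row's own target when encoding it, which this sketch does not do.

```python
import math
from collections import Counter, defaultdict


def target_encode(categories, targets):
    """Replace each category with a smoothed mean of its targets."""
    prior = sum(targets) / len(targets)  # global target mean
    sums, counts = defaultdict(float), defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    # One pseudo-observation of the prior keeps rare categories stable.
    return [(sums[c] + prior) / (counts[c] + 1) for c in categories]


def mutual_information(values, targets, num_bins=4):
    """Estimate MI (in nats) between a binned column and a discrete target."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins or 1.0  # avoid zero width for constant columns
    bins = [min(int((v - lo) / width), num_bins - 1) for v in values]
    n = len(values)
    joint, p_bin, p_y = Counter(zip(bins, targets)), Counter(bins), Counter(targets)
    return sum(
        (c / n) * math.log(c * n / (p_bin[b] * p_y[y]))
        for (b, y), c in joint.items()
    )


targets = [0, 0, 1, 1]
columns = {
    "noise": [0.1, 1.0, 0.2, 0.9],
    # A categorical column becomes numerical before it reaches the model.
    "color": target_encode(["red", "red", "blue", "blue"], targets),
}
scores = {name: mutual_information(col, targets) for name, col in columns.items()}
# Assumed ordering: most informative column first.
ordered = sorted(columns, key=scores.get, reverse=True)
```

In this toy data the encoded `color` column separates the two classes perfectly, so it is ranked ahead of `noise`.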

Note

For an example of using ExcelFormer, see examples/excelformer.py.

Parameters:

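in_channels (int) – Input channel dimensionality.
out_channels (int) – Output channel dimensionality.
num_cols (int) – Number of columns.
num_layers (int) – Number of torch_frame.nn.conv.ExcelFormerConv layers.
num_heads (int) – Number of attention heads used in DiaM.
col_stats (dict[str, dict[torch_frame.data.stats.StatType, Any]]) – A dictionary that maps each column name to its statistics. Available as dataset.col_stats.
col_names_dict (dict[torch_frame.stype, list[str]]) – A dictionary that maps each stype to a list of column names. The column names are sorted in the order in which they appear in tensor_frame.feat_dict. Available as tensor_frame.col_names_dict.
stype_encoder_dict (dict[torch_frame.stype, torch_frame.nn.encoder.StypeEncoder], optional) – A dictionary that maps stypes to their stype encoders. (default: None, in which case ExcelFormerEncoder() is used for numerical features)
diam_dropout (float, optional) – Dropout rate for DiaM. (default: 0.0)
aium_dropout (float, optional) – Dropout rate for AiuM. (default: 0.0)
residual_dropout (float, optional) – Residual dropout rate. (default: 0.0)
mixup (str, optional) – The mixup type: None, feature, or hidden. (default: None)
beta (float, optional) – Shape parameter of the beta distribution used to compute the shuffle rate in mixup. Only used when mixup is not None. (default: 0.5)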
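To give a feel for the beta parameter, the sketch below samples shuffle rates with the standard library only. It assumes the rate is drawn from a symmetric Beta(beta, beta) distribution, which the parameter description suggests but does not state outright. The helper names `sample_shuffle_rates` and `extreme_fraction` are hypothetical.

```python
import random


def sample_shuffle_rates(beta, num_samples, seed=0):
    """Draw shuffle rates from Beta(beta, beta) (assumed symmetric form)."""
    rng = random.Random(seed)
    return [rng.betavariate(beta, beta) for _ in range(num_samples)]


def extreme_fraction(rates):
    """Fraction of rates that fall outside [0.25, 0.75]."""
    return sum(r < 0.25 or r > 0.75 for r in rates) / len(rates)


# With the default 0.5 the density is U-shaped: most rates land near 0 or 1.
u_shaped = extreme_fraction(sample_shuffle_rates(0.5, 10_000))
# A larger shape parameter concentrates the rates around 0.5 instead.
peaked = extreme_fraction(sample_shuffle_rates(5.0, 10_000))
```

Under this assumption, the default value makes most mixed samples stay close to one of the two original samples, while larger values produce more evenly blended ones.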

forward(tf: TensorFrame, mixup_encoded: bool = False) → Tensor | tuple[Tensor, Tensor]

Transforms a TensorFrame object into output embeddings. If mixup_encoded is True, it produces the output embeddings together with the mixed-up targets, following self.mixup.

Parameters:

tf (torch_frame.TensorFrame) – Input TensorFrame object.
mixup_encoded (bool) – Whether to mix up on the encoded numerical features, i.e., FEAT-MIX and HIDDEN-MIX. (default: False)

Returns:

The output embeddings of size [batch_size, out_channels]. If mixup_encoded is True, the mixed-up targets of size [batch_size, num_classes] are returned as well.

Return type:

torch.Tensor | tuple[Tensor, Tensor]
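The idea behind the mixed-up targets can be illustrated for a single pair of rows. This is a minimal sketch in plain Python, not the library's code: the helper `mix_pair` is hypothetical, it mixes flat lists rather than tensors, and it assumes that the target weights equal the shuffle rate, which may differ from how FEAT-MIX weights targets.

```python
import random


def mix_pair(x_a, x_b, y_a, y_b, rate, rng):
    """Mix two encoded rows elementwise and blend their one-hot targets."""
    # Keep each element of x_a with probability `rate`, else take it from x_b.
    mixed_x = [a if rng.random() < rate else b for a, b in zip(x_a, x_b)]
    # The targets are blended with the same rate (assumed weighting).
    mixed_y = [rate * a + (1.0 - rate) * b for a, b in zip(y_a, y_b)]
    return mixed_x, mixed_y


rng = random.Random(0)
rate = rng.betavariate(0.5, 0.5)  # shuffle rate, assuming Beta(beta, beta)
x_a, x_b = [1.0, 2.0, 3.0, 4.0], [-1.0, -2.0, -3.0, -4.0]
y_a, y_b = [1.0, 0.0], [0.0, 1.0]  # one-hot targets for two classes
mixed_x, mixed_y = mix_pair(x_a, x_b, y_a, y_b, rate, rng)
```

Because the mixed targets are soft class distributions rather than class indices, a training loop would compare the output embeddings against them with a loss that accepts probability targets.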