Label-Based Initializer
- class LabelBasedInitializer(labels: Sequence[str], encoder: str | TextEncoder | type[TextEncoder] | None = None, encoder_kwargs: Mapping[str, Any] | None = None, batch_size: int | None = None)
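An initializer that encodes labels with a pretrained model from the transformers library.
Example usage:
Initialize the entity representations as Transformer encodings of their labels. Afterwards, the parameters are detached from the labels and trained on the KGE task, without any further connection to the Transformer model.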
from pykeen.datasets import get_dataset
from pykeen.nn.init import LabelBasedInitializer
from pykeen.models import ERMLPE

dataset = get_dataset(dataset="nations")
entity_initializer = LabelBasedInitializer.from_triples_factory(
    triples_factory=dataset.training,
    encoder="transformer",
)
model = ERMLPE(
    triples_factory=dataset.training,
    embedding_dim=entity_initializer.tensor.shape[-1],  # 768 for BERT base
    entity_initializer=entity_initializer,
    # note: we explicitly need to provide a relation initializer here,
    # since ERMLPE shares initializers between entities and relations by default
    relation_initializer="uniform",
)
Initialize the initializer.
- Parameters:
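encoder (str | TextEncoder | type[TextEncoder] | None) – the text encoder to use, cf. text_encoder_resolver
encoder_kwargs (Mapping[str, Any] | None) – additional keyword-based parameters passed to the encoder
batch_size (int | None) – the (maximum) batch size to use while encoding; must be > 0. If None, len(labels) is used, i.e., all labels are encoded in a single batch.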
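The batch_size semantics described above can be illustrated with a minimal sketch. It does not use PyKEEN: `iter_batches` is a hypothetical helper written for this illustration, and the actual encoder may split its input differently internally.

```python
from __future__ import annotations

from collections.abc import Iterator, Sequence


def iter_batches(labels: Sequence[str], batch_size: int | None = None) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of ``labels`` with at most ``batch_size`` items each.

    Hypothetical helper for illustration only; not part of PyKEEN.
    """
    # None means "encode everything at once", i.e. batch_size = len(labels).
    # max(..., 1) keeps the step positive when there are no labels at all.
    if batch_size is None:
        batch_size = max(len(labels), 1)
    if batch_size <= 0:
        raise ValueError("batch_size must be > 0")
    for start in range(0, len(labels), batch_size):
        yield labels[start : start + batch_size]


labels = ["brazil", "burma", "china", "cuba", "egypt"]
print([len(batch) for batch in iter_batches(labels, batch_size=2)])  # [2, 2, 1]
print([len(batch) for batch in iter_batches(labels)])  # [5]
```

A smaller batch_size trades encoding speed for lower peak memory, which matters when there are many labels or the encoder is large.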
Methods Summary
from_triples_factory(triples_factory[, ...]) Prepare a label-based initializer with labels from a triples factory.
Methods Documentation
- classmethod from_triples_factory(triples_factory: TriplesFactory, for_entities: bool = True, **kwargs) -> LabelBasedInitializer
Prepare a label-based initializer with labels from a triples factory.
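As a rough illustration of what the for_entities flag selects, the sketch below uses a hypothetical stand-in for a triples factory (`ToyFactory`) and a hypothetical helper (`labels_in_id_order`). It is not PyKEEN's implementation. It assumes that labels are taken in ID order, so that row i of the initial tensor belongs to the entity or relation with ID i.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ToyFactory:
    """Hypothetical stand-in for a triples factory; not a PyKEEN class."""

    entity_to_id: dict[str, int]
    relation_to_id: dict[str, int]


def labels_in_id_order(factory: ToyFactory, for_entities: bool = True) -> list[str]:
    """Pick the entity or relation labels and sort them by their integer ID."""
    mapping = factory.entity_to_id if for_entities else factory.relation_to_id
    # Sorting the keys by their mapped ID gives one label per row, in row order.
    return sorted(mapping, key=mapping.__getitem__)


factory = ToyFactory(
    entity_to_id={"uk": 2, "brazil": 0, "china": 1},
    relation_to_id={"treaties": 1, "embassy": 0},
)
print(labels_in_id_order(factory))  # ['brazil', 'china', 'uk']
print(labels_in_id_order(factory, for_entities=False))  # ['embassy', 'treaties']
```

With PyKEEN itself, the signature above suggests that relation labels are requested by passing for_entities=False to from_triples_factory.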
- Parameters:
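triples_factory (TriplesFactory) – the triples factory
for_entities (bool) – whether to create the initializer for entities (True) or for relations (False)
kwargs – additional keyword-based arguments passed to LabelBasedInitializer.__init__()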
- Returns:
a label-based initializer
- Raises:
ImportError – if the transformers library could not be imported
- Return type: