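Other

Scorer

class
Compute evaluation scores

The Scorer computes evaluation scores. It is typically created by Language.evaluate. In addition, the Scorer provides a number of evaluation methods for evaluating Token and Doc attributes.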

Scorer.__init__ method

Create a new Scorer.

| Name | Description | Type |
| --- | --- | --- |
| nlp | The pipeline to use for scoring, where each pipeline component may provide a scoring method. If none is provided, then a default pipeline is constructed using the default_lang and default_pipeline settings. | Optional[Language] |
| default_lang | The language to use for a default pipeline if nlp is not provided. Defaults to xx. | str |
| default_pipeline | The pipeline components to use for a default pipeline if nlp is not provided. Defaults to ("senter", "tagger", "morphologizer", "parser", "ner", "textcat"). | Iterable[str] |
| keyword-only | | |
| **kwargs | Any additional settings to pass on to the individual scoring methods. | Any |
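As a minimal sketch of the two construction paths described above, the snippet below builds one Scorer with the default pipeline and one around an existing pipeline. The blank English pipeline is only a stand-in for a trained one and is an assumption of this example.

```python
import spacy
from spacy.scorer import Scorer

# No pipeline given: a default pipeline is built from default_lang ("xx")
# and default_pipeline.
default_scorer = Scorer()

# Pipeline given: the Scorer uses the scoring methods of its components.
# A blank pipeline is used here only so the example needs no trained model.
nlp = spacy.blank("en")
scorer = Scorer(nlp)

print(default_scorer.nlp.pipe_names)
```

In practice the Scorer is usually created for you by Language.evaluate, so constructing it by hand is mostly useful for custom evaluation scripts.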

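Scorer.score method

Calculate the scores for a list of Example objects using the scoring methods provided by the components in the pipeline.

The returned Dict contains the scores provided by the individual pipeline components. For the scoring methods provided by the Scorer and used by the core pipeline components, the individual score names start with the Token or Doc attribute being scored: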

  • token_acc, token_p, token_r, token_f
  • sents_p, sents_r, sents_f
  • tag_acc
  • pos_acc
  • morph_acc, morph_micro_p, morph_micro_r, morph_micro_f, morph_per_feat
  • lemma_acc
  • dep_uas, dep_las, dep_las_per_type
  • ents_p, ents_r, ents_f, ents_per_type
  • spans_sc_p, spans_sc_r, spans_sc_f
  • cats_score (depends on config, description provided in cats_score_desc), cats_micro_p, cats_micro_r, cats_micro_f, cats_macro_p, cats_macro_r, cats_macro_f, cats_macro_auc, cats_f_per_type, cats_auc_per_type

| Name | Description | Type |
| --- | --- | --- |
| examples | The Example objects holding both the predictions and the correct gold-standard annotations. | Iterable[Example] |
| keyword-only | | |
| per_component (v3.6) | Whether to return the scores keyed by component name. Defaults to False. | bool |
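A short sketch of calling Scorer.score directly. It assumes a blank English pipeline, so the only scores that get filled in come from the tokenizer; the text and variable names are illustrative.

```python
import spacy
from spacy.scorer import Scorer
from spacy.training import Example

nlp = spacy.blank("en")
text = "I like cats"

# Predicted and reference docs come from the same tokenizer here, so the
# tokenization matches exactly.
example = Example(nlp.make_doc(text), nlp.make_doc(text))

scorer = Scorer(nlp)
scores = scorer.score([example])
print(scores["token_acc"], scores["token_f"])
```

With a trained pipeline, the returned dictionary would also contain the scores contributed by each component, such as tag_acc or ents_f.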

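Scorer.score_tokenization staticmethod v3.0

Scores the tokenization:

  • token_acc: number of correct tokens / number of predicted tokens
  • token_p, token_r, token_f: precision, recall and F-score for token character spans

Docs with has_unknown_spaces are skipped during scoring.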

| Name | Description | Type |
| --- | --- | --- |
| examples | The Example objects holding both the predictions and the correct gold-standard annotations. | Iterable[Example] |
| RETURNS | A dictionary containing the scores token_acc, token_p, token_r, token_f. | Dict[str, float] |
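The following sketch illustrates the character-span scoring on a deliberately mismatched tokenization. The words and spacing are made up for the example; spaces are passed explicitly because docs built from words alone have has_unknown_spaces set and would be skipped.

```python
from spacy.scorer import Scorer
from spacy.tokens import Doc
from spacy.training import Example
from spacy.vocab import Vocab

vocab = Vocab()

# Predicted tokenization keeps "can't" as one token.
pred = Doc(vocab, words=["can't", "wait"], spaces=[True, False])
# Reference tokenization splits it into "ca" + "n't".
ref = Doc(vocab, words=["ca", "n't", "wait"], spaces=[False, True, False])

scores = Scorer.score_tokenization([Example(pred, ref)])

# Only the character span of "wait" matches: 1 of 2 predicted spans,
# 1 of 3 reference spans.
print(scores["token_p"], scores["token_r"])
```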

Scorer.score_token_attr staticmethod v3.0

Scores a single token attribute. Tokens with missing values in the reference doc are skipped during scoring.

| Name | Description | Type |
| --- | --- | --- |
| examples | The Example objects holding both the predictions and the correct gold-standard annotations. | Iterable[Example] |
| attr | The attribute to score. | str |
| keyword-only | | |
| getter | Defaults to getattr. If provided, getter(token, attr) should return the value of the attribute for an individual Token. | Callable[[Token, str], Any] |
| missing_values | Attribute values to treat as missing annotation in the reference annotation. Defaults to {0, None, ""}. | Set[Any] |
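A minimal sketch of scoring one token attribute, here the fine-grained tag. The docs are built by hand with made-up tags so that exactly one of three predictions is wrong.

```python
from spacy.scorer import Scorer
from spacy.tokens import Doc
from spacy.training import Example
from spacy.vocab import Vocab

vocab = Vocab()
words = ["I", "like", "cats"]
spaces = [True, True, False]

# The predicted tag for "cats" differs from the reference tag.
pred = Doc(vocab, words=words, spaces=spaces, tags=["PRP", "VBP", "NN"])
ref = Doc(vocab, words=words, spaces=spaces, tags=["PRP", "VBP", "NNS"])

# The result is keyed as "<attr>_acc".
scores = Scorer.score_token_attr([Example(pred, ref)], "tag")
print(scores["tag_acc"])
```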

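Scorer.score_token_attr_per_feat staticmethod v3.0

Scores a single token attribute per feature, for a token attribute in the Universal Dependencies FEATS format. Tokens with missing values in the reference doc are skipped during scoring.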

| Name | Description | Type |
| --- | --- | --- |
| examples | The Example objects holding both the predictions and the correct gold-standard annotations. | Iterable[Example] |
| attr | The attribute to score. | str |
| keyword-only | | |
| getter | Defaults to getattr. If provided, getter(token, attr) should return the value of the attribute for an individual Token. | Callable[[Token, str], Any] |
| missing_values | Attribute values to treat as missing annotation in the reference annotation. Defaults to {0, None, ""}. | Set[Any] |
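As a sketch of per-feature scoring, the example below scores the morph attribute. It assumes a custom getter that returns the hash key of the morphological analysis (the helper name morph_key_getter is chosen for this example), since the scoring code looks the value up in the vocab's string store.

```python
from spacy.scorer import Scorer
from spacy.tokens import Doc
from spacy.training import Example
from spacy.vocab import Vocab


def morph_key_getter(token, attr):
    # Return the hash of the FEATS string rather than the analysis object.
    return getattr(token, attr).key


vocab = Vocab()
words = ["cat", "dogs"]
spaces = [True, False]

# The predicted Number feature for "dogs" is wrong.
pred = Doc(vocab, words=words, spaces=spaces, morphs=["Number=Sing", "Number=Sing"])
ref = Doc(vocab, words=words, spaces=spaces, morphs=["Number=Sing", "Number=Plur"])

scores = Scorer.score_token_attr_per_feat(
    [Example(pred, ref)], "morph", getter=morph_key_getter
)
print(scores["morph_per_feat"])
```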

Scorer.score_spans staticmethod v3.0

Returns PRF scores for labeled or unlabeled spans.

| Name | Description | Type |
| --- | --- | --- |
| examples | The Example objects holding both the predictions and the correct gold-standard annotations. | Iterable[Example] |
| attr | The attribute to score. | str |
| keyword-only | | |
| getter | Defaults to getattr. If provided, getter(doc, attr) should return the Span objects for an individual Doc. | Callable[[Doc, str], Iterable[Span]] |
| has_annotation | Defaults to None. If provided, has_annotation(doc) should return whether a Doc has annotation for this attr. Docs without annotation are skipped for scoring purposes. | Optional[Callable[[Doc], bool]] |
| labeled | Defaults to True. If set to False, two spans will be considered equal if their start and end match, irrespective of their label. | bool |
| allow_overlap | Defaults to False. Whether or not to allow overlapping spans. If set to False, the alignment will automatically resolve conflicts. | bool |
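A minimal sketch of span scoring on Doc.ents. The sentence and entity labels are illustrative; the prediction finds one of the two reference entities.

```python
from spacy.scorer import Scorer
from spacy.tokens import Doc
from spacy.training import Example
from spacy.vocab import Vocab

vocab = Vocab()
words = ["Apple", "is", "in", "Cupertino"]
spaces = [True, True, True, False]

# Entities are given as token-level IOB tags.
pred = Doc(vocab, words=words, spaces=spaces, ents=["B-ORG", "O", "O", "O"])
ref = Doc(vocab, words=words, spaces=spaces, ents=["B-ORG", "O", "O", "B-GPE"])

# The result keys are prefixed with the attribute name: ents_p, ents_r, ...
scores = Scorer.score_spans([Example(pred, ref)], "ents")
print(scores["ents_p"], scores["ents_r"])
```

For spans stored in Doc.spans, a custom getter that reads the relevant span group would be needed instead of the default getattr.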

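Scorer.score_deps staticmethod v3.0

Calculates the UAS, LAS, and LAS-per-type scores for dependency parses. Tokens with a missing value for attr (typically dep) are skipped during scoring.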

| Name | Description | Type |
| --- | --- | --- |
| examples | The Example objects holding both the predictions and the correct gold-standard annotations. | Iterable[Example] |
| attr | The attribute to score. | str |
| keyword-only | | |
| getter | Defaults to getattr. If provided, getter(token, attr) should return the value of the attribute for an individual Token. | Callable[[Token, str], Any] |
| head_attr | The attribute containing the head token. | str |
| head_getter | Defaults to getattr. If provided, head_getter(token, attr) should return the head for an individual Token. | Callable[[Token, str], Token] |
| ignore_labels | Labels to ignore while scoring (e.g. "punct"). | Iterable[str] |
| missing_values | Attribute values to treat as missing annotation in the reference annotation. Defaults to {0, None, ""}. | Set[Any] |
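The sketch below shows how getter and ignore_labels can work together. It assumes a hypothetical helper, dep_getter, that returns the dependency label as a lowercase string so that the labels to ignore can be given as plain text; the parse itself is made up.

```python
from spacy.scorer import Scorer
from spacy.tokens import Doc
from spacy.training import Example
from spacy.vocab import Vocab


def dep_getter(token, attr):
    # Read the string form of the attribute ("dep" -> token.dep_).
    return getattr(token, f"{attr}_").lower()


vocab = Vocab()
words = ["She", "eats", "fish", "."]
spaces = [True, True, False, False]

ref = Doc(vocab, words=words, spaces=spaces,
          heads=[1, 1, 1, 1], deps=["nsubj", "ROOT", "dobj", "punct"])
# Prediction: wrong label on "fish" and a wrong head on the punctuation.
pred = Doc(vocab, words=words, spaces=spaces,
           heads=[1, 1, 1, 2], deps=["nsubj", "ROOT", "nsubj", "punct"])

scores = Scorer.score_deps(
    [Example(pred, ref)], "dep", getter=dep_getter, ignore_labels=["punct"]
)
# The punctuation error is ignored, so all scored heads are correct,
# while the label error on "fish" lowers the labeled score.
print(scores["dep_uas"], scores["dep_las"])
```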

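Scorer.score_cats staticmethod v3.0

Calculate PRF and ROC AUC scores for a doc-level attribute that is a dict containing scores for each label, such as Doc.cats. The returned dictionary contains the following scores:

  • {attr}_micro_p, {attr}_micro_r and {attr}_micro_f: each instance across each label is weighted equally
  • {attr}_macro_p, {attr}_macro_r and {attr}_macro_f: the average values across evaluations per label
  • {attr}_f_per_type and {attr}_auc_per_type: each contains a dictionary of scores, keyed by label
  • A final {attr}_score and corresponding {attr}_score_desc (text description)

The reported {attr}_score depends on the classification properties:

  • binary exclusive with positive label: {attr}_score is set to the F-score of the positive label
  • 3+ exclusive classes, macro-averaged F-score: {attr}_score = {attr}_macro_f
  • multilabel, macro-averaged AUC: {attr}_score = {attr}_macro_auc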

| Name | Description | Type |
| --- | --- | --- |
| examples | The Example objects holding both the predictions and the correct gold-standard annotations. | Iterable[Example] |
| attr | The attribute to score. | str |
| keyword-only | | |
| getter | Defaults to getattr. If provided, getter(doc, attr) should return the cats for an individual Doc. | Callable[[Doc, str], Dict[str, float]] |
| labels | The set of possible labels. Defaults to []. | Iterable[str] |
| multi_label | Whether the attribute allows multiple labels. Defaults to True. When set to False (exclusive labels), missing gold labels are interpreted as 0.0 and the threshold is set to 0.0. | bool |
| positive_label | The positive label for a binary task with exclusive classes. Defaults to None. | Optional[str] |
| threshold | Cutoff to consider a prediction "positive". Defaults to 0.5 for multi-label, and 0.0 (i.e. whatever's highest scoring) otherwise. | float |
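A sketch of the binary exclusive case with a positive label. The label names, category scores and the make_doc helper are assumptions made for this example; one of the two predictions picks the wrong label.

```python
from spacy.scorer import Scorer
from spacy.tokens import Doc
from spacy.training import Example
from spacy.vocab import Vocab

vocab = Vocab()


def make_doc(cats):
    # Hypothetical helper: a two-token doc carrying the given category scores.
    doc = Doc(vocab, words=["some", "text"], spaces=[True, False])
    doc.cats = cats
    return doc


examples = [
    # Correct: highest-scoring predicted label matches the gold label.
    Example(make_doc({"POSITIVE": 0.9, "NEGATIVE": 0.1}),
            make_doc({"POSITIVE": 1.0, "NEGATIVE": 0.0})),
    # Wrong: POSITIVE is predicted but the gold label is NEGATIVE.
    Example(make_doc({"POSITIVE": 0.6, "NEGATIVE": 0.4}),
            make_doc({"POSITIVE": 0.0, "NEGATIVE": 1.0})),
]

scores = Scorer.score_cats(
    examples,
    "cats",
    labels=["POSITIVE", "NEGATIVE"],
    multi_label=False,
    positive_label="POSITIVE",
)
# With two exclusive labels and a positive label, cats_score is the
# F-score of the positive label.
print(scores["cats_score"], scores["cats_score_desc"])
```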

get_ner_prf v3.0

Compute micro-PRF and per-entity PRF scores.

| Name | Description | Type |
| --- | --- | --- |
| examples | The Example objects holding both the predictions and the correct gold-standard annotations. | Iterable[Example] |
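A minimal sketch of calling get_ner_prf on hand-built docs. The sentence and labels are illustrative; the prediction recovers one of two reference entities.

```python
from spacy.scorer import get_ner_prf
from spacy.tokens import Doc
from spacy.training import Example
from spacy.vocab import Vocab

vocab = Vocab()
words = ["Apple", "is", "in", "Cupertino"]
spaces = [True, True, True, False]

pred = Doc(vocab, words=words, spaces=spaces, ents=["B-ORG", "O", "O", "O"])
ref = Doc(vocab, words=words, spaces=spaces, ents=["B-ORG", "O", "O", "B-GPE"])

scores = get_ner_prf([Example(pred, ref)])
# ents_per_type holds a separate PRF entry for each entity label.
print(scores["ents_p"], scores["ents_r"], scores["ents_per_type"])
```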

score_coref_clusters experimental

Returns LEA (Moosavi and Strube, 2016) PRF scores for coreference clusters.

| Name | Description | Type |
| --- | --- | --- |
| examples | The Example objects holding both the predictions and the correct gold-standard annotations. | Iterable[Example] |
| keyword-only | | |
| span_cluster_prefix | The prefix used for spans representing coreference clusters. | str |

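score_span_predictions experimental

Returns the accuracy for reconstructions of spans from single tokens. Only exactly correct predictions are counted as correct; there is no partial credit for near answers. Used by the SpanResolver.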

| Name | Description | Type |
| --- | --- | --- |
| examples | The Example objects holding both the predictions and the correct gold-standard annotations. | Iterable[Example] |
| keyword-only | | |
| output_prefix | The prefix used for spans representing the final predicted spans. | str |