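Scorer

class

Compute evaluation scores

The Scorer computes evaluation scores. It is typically created by Language.evaluate. In addition, the Scorer provides a number of evaluation methods for scoring Token and Doc attributes.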

Scorer.__init__ method

Create a new Scorer.

Name	Description
`nlp`	The pipeline to use for scoring, where each pipeline component may provide a scoring method. If none is provided, then a default pipeline is constructed using the `default_lang` and `default_pipeline` settings. Optional[Language]
`default_lang`	The language to use for a default pipeline if `nlp` is not provided. Defaults to `xx`. str
`default_pipeline`	The pipeline components to use for a default pipeline if `nlp` is not provided. Defaults to `("senter", "tagger", "morphologizer", "parser", "ner", "textcat")`. Iterable[str]
keyword-only
`**kwargs`	Any additional settings to pass on to the individual scoring methods. Any

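Scorer.score method

Calculate the scores for a list of Example objects using the scoring methods provided by the components in the pipeline.

The returned Dict contains the scores provided by the individual pipeline components. For the scoring methods provided by the Scorer and used by the core pipeline components, the individual score names start with the Token or Doc attribute being scored: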

token_acc, token_p, token_r, token_f
sents_p, sents_r, sents_f
tag_acc
pos_acc
morph_acc, morph_micro_p, morph_micro_r, morph_micro_f, morph_per_feat
lemma_acc
dep_uas, dep_las, dep_las_per_type
ents_p, ents_r, ents_f, ents_per_type
spans_sc_p, spans_sc_r, spans_sc_f
cats_score (depends on config, description provided in cats_score_desc), cats_micro_p, cats_micro_r, cats_micro_f, cats_macro_p, cats_macro_r, cats_macro_f, cats_macro_auc, cats_f_per_type, cats_auc_per_type

Name	Description
`examples`	The `Example` objects holding both the predictions and the correct gold-standard annotations. Iterable[Example]
keyword-only
`per_component` v3.6	Whether to return the scores keyed by component name. Defaults to `False`. bool
RETURNS	A dictionary containing the scores. Dict[str, Union[float, Dict[str, float]]]

Scorer.score_tokenization staticmethod v3.0

Scores the tokenization:

token_acc: number of correct tokens / number of predicted tokens
token_p, token_r, token_f: precision, recall and F-score for token character spans

Docs with has_unknown_spaces are skipped during scoring.

Name	Description
`examples`	The `Example` objects holding both the predictions and the correct gold-standard annotations. Iterable[Example]
RETURNS	A dictionary containing the scores `token_acc`, `token_p`, `token_r`, `token_f`. Dict[str, float]
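
To make precision, recall and F-score over token character spans concrete, here is a minimal plain-Python sketch. It does not call spaCy and is not the library's implementation; the helper names `prf` and `tokenization_prf` and the sample offsets are hypothetical.

```python
from typing import List, Tuple

CharSpan = Tuple[int, int]  # (start_char, end_char) of one token


def prf(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall and F-score from raw counts (0.0 when undefined)."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


def tokenization_prf(pred: List[CharSpan], gold: List[CharSpan]) -> Tuple[float, float, float]:
    """A predicted token counts as correct if a gold token has the same offsets."""
    pred_set, gold_set = set(pred), set(gold)
    tp = len(pred_set & gold_set)
    return prf(tp, len(pred_set) - tp, len(gold_set) - tp)


# Hypothetical text "New York is big": the gold tokenization has 4 tokens,
# while the prediction merges "New York" into a single token.
gold_tokens = [(0, 3), (4, 8), (9, 11), (12, 15)]
pred_tokens = [(0, 8), (9, 11), (12, 15)]
token_p, token_r, token_f = tokenization_prf(pred_tokens, gold_tokens)
print(token_p, token_r, token_f)
```

In this sketch two of the three predicted tokens match a gold token, so precision is 2/3, while only two of the four gold tokens are recovered, so recall is 1/2.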

Scorer.score_token_attr staticmethod v3.0

Scores a single token attribute. Tokens with missing values in the reference doc are skipped during scoring.

Name	Description
`examples`	The `Example` objects holding both the predictions and the correct gold-standard annotations. Iterable[Example]
`attr`	The attribute to score. str
keyword-only
`getter`	Defaults to `getattr`. If provided, `getter(token, attr)` should return the value of the attribute for an individual `Token`. Callable[[Token, str], Any]
`missing_values`	Attribute values to treat as missing annotation in the reference annotation. Defaults to `{0, None, ""}`. Set[Any]
RETURNS	A dictionary containing the score `{attr}_acc`. Dict[str, float]
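
The effect of `missing_values` can be illustrated with a small plain-Python sketch. It is an illustration of the skipping rule only, not spaCy's code; `attr_accuracy` and the sample tags are hypothetical, and returning `None` when nothing can be scored is a choice made here.

```python
from typing import Any, Optional, Sequence, Set

# Mirrors the documented default for `missing_values`.
DEFAULT_MISSING: Set[Any] = {0, None, ""}


def attr_accuracy(
    pred: Sequence[Any],
    gold: Sequence[Any],
    missing_values: Set[Any] = DEFAULT_MISSING,
) -> Optional[float]:
    """Accuracy over tokens whose gold value is not missing."""
    correct = total = 0
    for pred_value, gold_value in zip(pred, gold):
        if gold_value in missing_values:
            continue  # no reference annotation: skip this token
        total += 1
        correct += int(pred_value == gold_value)
    # Returning None for "nothing to score" is a choice made in this sketch.
    return correct / total if total else None


# Hypothetical fine-grained tags; the third token has no gold annotation.
gold_tags = ["DT", "NN", "", "VBZ"]
pred_tags = ["DT", "NN", "JJ", "VBD"]
tag_acc = attr_accuracy(pred_tags, gold_tags)
print({"tag_acc": tag_acc})
```

The third token is ignored entirely, so the prediction `"JJ"` there neither helps nor hurts the score.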

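Scorer.score_token_attr_per_feat staticmethod v3.0

Scores a single token attribute per feature, for a token attribute in the Universal Dependencies FEATS format. Tokens with missing values in the reference doc are skipped during scoring.

Name	Description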
`examples`	The `Example` objects holding both the predictions and the correct gold-standard annotations. Iterable[Example]
`attr`	The attribute to score. str
keyword-only
`getter`	Defaults to `getattr`. If provided, `getter(token, attr)` should return the value of the attribute for an individual `Token`. Callable[[Token, str], Any]
`missing_values`	Attribute values to treat as missing annotation in the reference annotation. Defaults to `{0, None, ""}`. Set[Any]
RETURNS	A dictionary containing the micro PRF scores under the key `{attr}_micro_p/r/f` and the per-feature PRF scores under `{attr}_per_feat`. Dict[str, Dict[str, float]]
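
As a rough illustration of per-feature scoring, the sketch below splits FEATS strings such as `Case=Nom|Number=Sing` and counts matches per feature name. It is plain Python under simplifying assumptions (identical tokenization, an empty string marks missing gold annotation); `parse_feats` and `per_feat_counts` are hypothetical helpers, not spaCy functions.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def parse_feats(feats: str) -> Dict[str, str]:
    """Split a FEATS string such as 'Case=Nom|Number=Sing' into a dict."""
    if feats in ("", "_"):
        return {}
    return dict(item.split("=", 1) for item in feats.split("|"))


def per_feat_counts(pred: List[str], gold: List[str]) -> Dict[str, Tuple[int, int, int]]:
    """Return (tp, fp, fn) per feature name over (token index, value) pairs."""
    pred_sets: Dict[str, Set[Tuple[int, str]]] = defaultdict(set)
    gold_sets: Dict[str, Set[Tuple[int, str]]] = defaultdict(set)
    for i, (pred_feats, gold_feats) in enumerate(zip(pred, gold)):
        if gold_feats == "":
            continue  # missing reference annotation: skip this token
        for name, value in parse_feats(pred_feats).items():
            pred_sets[name].add((i, value))
        for name, value in parse_feats(gold_feats).items():
            gold_sets[name].add((i, value))
    counts = {}
    for name in sorted(set(pred_sets) | set(gold_sets)):
        tp = len(pred_sets[name] & gold_sets[name])
        counts[name] = (tp, len(pred_sets[name]) - tp, len(gold_sets[name]) - tp)
    return counts


gold_morph = ["Case=Nom|Number=Sing", "Number=Plur", ""]
pred_morph = ["Case=Nom|Number=Plur", "Number=Plur", "Case=Acc"]
counts = per_feat_counts(pred_morph, gold_morph)

# Micro-averaging pools the counts of all features before computing P and R.
tp, fp, fn = (sum(c[i] for c in counts.values()) for i in range(3))
micro_p, micro_r = tp / (tp + fp), tp / (tp + fn)
print(counts, micro_p, micro_r)
```

Here `Case` is fully correct on the one scored token that carries it, while `Number` has one hit, one spurious value and one missed value.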

Scorer.score_spans staticmethod v3.0

Returns PRF scores for labeled or unlabeled spans.

Name	Description
`examples`	The `Example` objects holding both the predictions and the correct gold-standard annotations. Iterable[Example]
`attr`	The attribute to score. str
keyword-only
`getter`	Defaults to `getattr`. If provided, `getter(doc, attr)` should return the `Span` objects for an individual `Doc`. Callable[[Doc, str], Iterable[Span]]
`has_annotation`	Defaults to `None`. If provided, `has_annotation(doc)` should return whether a `Doc` has annotation for this `attr`. Docs without annotation are skipped for scoring purposes. Optional[Callable[[Doc], bool]]
`labeled`	Defaults to `True`. If set to `False`, two spans will be considered equal if their start and end match, irrespective of their label. bool
`allow_overlap`	Defaults to `False`. Whether or not to allow overlapping spans. If set to `False`, the alignment will automatically resolve conflicts. bool
RETURNS	A dictionary containing the PRF scores under the keys `{attr}_p`, `{attr}_r`, `{attr}_f` and the per-type PRF scores under `{attr}_per_type`. Dict[str, Union[float, Dict[str, float]]]
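
The difference that `labeled` makes can be sketched in plain Python. This is only an illustration of the matching rule described above, not spaCy's implementation; `span_prf`, the tuple layout and the sample spans are hypothetical.

```python
from typing import Dict, List, Tuple

LabeledSpan = Tuple[str, int, int]  # (label, start, end)


def span_prf(pred: List[LabeledSpan], gold: List[LabeledSpan], labeled: bool = True) -> Dict[str, float]:
    """PRF over spans; with labeled=False only start and end must match."""

    def key(span: LabeledSpan) -> tuple:
        label, start, end = span
        return (label, start, end) if labeled else (start, end)

    pred_keys = {key(s) for s in pred}
    gold_keys = {key(s) for s in gold}
    tp = len(pred_keys & gold_keys)
    p = tp / len(pred_keys) if pred_keys else 0.0
    r = tp / len(gold_keys) if gold_keys else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return {"p": p, "r": r, "f": f}


gold_ents = [("PERSON", 0, 2), ("ORG", 5, 7)]
pred_ents = [("PERSON", 0, 2), ("GPE", 5, 7), ("DATE", 9, 10)]
labeled_scores = span_prf(pred_ents, gold_ents, labeled=True)
unlabeled_scores = span_prf(pred_ents, gold_ents, labeled=False)
print(labeled_scores, unlabeled_scores)
```

The span at positions 5 to 7 has the right boundaries but the wrong label, so it only counts as a match in the unlabeled setting.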

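Scorer.score_deps staticmethod v3.0

Calculate the UAS, LAS and LAS-per-type scores for dependency parses. Tokens with missing values for the attr (typically dep) are skipped during scoring.

Name	Description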
`examples`	The `Example` objects holding both the predictions and the correct gold-standard annotations. Iterable[Example]
`attr`	The attribute to score. str
keyword-only
`getter`	Defaults to `getattr`. If provided, `getter(token, attr)` should return the value of the attribute for an individual `Token`. Callable[[Token, str], Any]
`head_attr`	The attribute containing the head token. str
`head_getter`	Defaults to `getattr`. If provided, `head_getter(token, attr)` should return the head for an individual `Token`. Callable[[Token, str], Token]
`ignore_labels`	Labels to ignore while scoring (e.g. `"punct"`). Iterable[str]
`missing_values`	Attribute values to treat as missing annotation in the reference annotation. Defaults to `{0, None, ""}`. Set[Any]
RETURNS	A dictionary containing the scores: `{attr}_uas`, `{attr}_las`, and `{attr}_las_per_type`. Dict[str, Union[float, Dict[str, float]]]
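
For intuition, unlabeled and labeled attachment scores can be sketched in plain Python as follows. The sketch assumes that predicted and gold tokenization are identical, so both scores reduce to simple ratios over the scored tokens; it is not spaCy's implementation, and `attachment_scores` and the sample parse are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Arc = Tuple[int, str]  # (index of the head token, dependency label)


def attachment_scores(
    pred: List[Arc],
    gold: List[Arc],
    ignore_labels: Iterable[str] = ("punct",),
    missing_values: Iterable[Optional[str]] = ("", None),
) -> Dict[str, float]:
    """UAS: correct head. LAS: correct head and correct label."""
    ignore, missing = set(ignore_labels), set(missing_values)
    total = unlabeled = labeled = 0
    for (pred_head, pred_dep), (gold_head, gold_dep) in zip(pred, gold):
        if gold_dep in missing or gold_dep in ignore:
            continue  # unannotated or ignored tokens are not scored
        total += 1
        if pred_head == gold_head:
            unlabeled += 1
            labeled += int(pred_dep == gold_dep)
    if total == 0:
        return {"uas": 0.0, "las": 0.0}
    return {"uas": unlabeled / total, "las": labeled / total}


# Hypothetical parse of "She eats fish ." with token 1 ("eats") as root.
gold_arcs = [(1, "nsubj"), (1, "ROOT"), (1, "obj"), (1, "punct")]
pred_arcs = [(1, "nsubj"), (1, "ROOT"), (1, "nmod"), (2, "punct")]
dep_scores = attachment_scores(pred_arcs, gold_arcs)
print(dep_scores)
```

The wrongly attached punctuation token is ignored, and the mislabeled `obj` arc lowers LAS but not UAS.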

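Scorer.score_cats staticmethod v3.0

Calculate PRF and ROC AUC scores for a doc-level attribute that is a dict containing scores for each label, like Doc.cats. The returned dictionary contains the following scores:

{attr}_micro_p, {attr}_micro_r and {attr}_micro_f: each instance across each label is weighted equally
{attr}_macro_p, {attr}_macro_r and {attr}_macro_f: the average values across evaluations per label
{attr}_f_per_type and {attr}_auc_per_type: each contains a dictionary of scores, keyed by label
A final {attr}_score and corresponding {attr}_score_desc (text description)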
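
The distinction between micro and macro averaging is easiest to see with numbers. The following plain-Python sketch uses hypothetical per-label counts and a hypothetical helper `micro_macro_f`; it illustrates the two averaging schemes and is not taken from spaCy.

```python
from typing import Dict, Tuple

Counts = Tuple[int, int, int]  # (tp, fp, fn) for one label


def f_score(tp: int, fp: int, fn: int) -> float:
    """F1 written directly in terms of counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f(per_label: Dict[str, Counts]) -> Tuple[float, float]:
    if not per_label:
        return 0.0, 0.0
    # Micro: pool the counts, so every instance weighs the same.
    tp, fp, fn = (sum(c[i] for c in per_label.values()) for i in range(3))
    micro = f_score(tp, fp, fn)
    # Macro: score each label separately, then average, so every label weighs the same.
    macro = sum(f_score(*c) for c in per_label.values()) / len(per_label)
    return micro, macro


# A frequent label that is predicted well and a rare label that is not.
label_counts = {"POSITIVE": (90, 10, 10), "NEGATIVE": (1, 4, 4)}
micro_f, macro_f = micro_macro_f(label_counts)
print(micro_f, macro_f)
```

With these counts the micro F-score stays high because the frequent label dominates, while the macro F-score drops because the rare label counts just as much as the frequent one.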

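The reported {attr}_score depends on the classification properties:

binary exclusive with positive label: {attr}_score is set to the F-score of the positive label
3+ exclusive classes, macro-averaged F-score: {attr}_score = {attr}_macro_f
multi-label, macro-averaged AUC: {attr}_score = {attr}_macro_auc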
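
The three rules above can be written as a small decision function. This is a hedged sketch: `select_score` is a hypothetical helper, the description strings and the nested `"f"` key inside `{attr}_f_per_type` are assumptions, and the fallback to macro F for a binary exclusive task without a positive label is also an assumption rather than something stated above.

```python
from typing import Any, Dict, Optional, Tuple


def select_score(
    results: Dict[str, Any],
    n_labels: int,
    multi_label: bool,
    positive_label: Optional[str] = None,
    attr: str = "cats",
) -> Tuple[float, str]:
    """Pick the headline score and a short description for it."""
    if not multi_label and n_labels == 2 and positive_label is not None:
        # Binary exclusive task with a positive label: F-score of that label.
        score = results[f"{attr}_f_per_type"][positive_label]["f"]
        return score, f"F ({positive_label})"
    if not multi_label:
        # Exclusive classes otherwise: macro-averaged F-score.
        return results[f"{attr}_macro_f"], "macro F"
    # Multi-label: macro-averaged AUC.
    return results[f"{attr}_macro_auc"], "macro AUC"


# Hypothetical, already computed scores.
results = {
    "cats_f_per_type": {"POS": {"f": 0.68}, "NEG": {"f": 0.72}},
    "cats_macro_f": 0.70,
    "cats_macro_auc": 0.90,
}
binary = select_score(results, n_labels=2, multi_label=False, positive_label="POS")
exclusive = select_score(results, n_labels=3, multi_label=False)
multi = select_score(results, n_labels=3, multi_label=True)
print(binary, exclusive, multi)
```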

Name	Description
`examples`	The `Example` objects holding both the predictions and the correct gold-standard annotations. Iterable[Example]
`attr`	The attribute to score. str
keyword-only
`getter`	Defaults to `getattr`. If provided, `getter(doc, attr)` should return the cats for an individual `Doc`. Callable[[Doc, str], Dict[str, float]]
`labels`	The set of possible labels. Defaults to `[]`. Iterable[str]
`multi_label`	Whether the attribute allows multiple labels. Defaults to `True`. When set to `False` (exclusive labels), missing gold labels are interpreted as `0.0` and the threshold is set to `0.0`. bool
`positive_label`	The positive label for a binary task with exclusive classes. Defaults to `None`. Optional[str]
`threshold`	Cutoff to consider a prediction “positive”. Defaults to `0.5` for multi-label, and `0.0` (i.e. whatever’s highest scoring) otherwise. float
RETURNS	A dictionary containing the scores, with inapplicable scores as `None`. Dict[str, Optional[float]]

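Scorer.score_links staticmethod v3.0

Returns PRF for predicted links on the entity level. To disentangle the performance of the NEL (entity linking) from the NER (entity recognition), this method only evaluates NEL links for entities that overlap between the gold reference and the predictions.

Name	Description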
`examples`	The `Example` objects holding both the predictions and the correct gold-standard annotations. Iterable[Example]
keyword-only
`negative_labels`	The string values that refer to no annotation (e.g. “NIL”). Iterable[str]
RETURNS	A dictionary containing the scores. Dict[str, Optional[float]]
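
One plausible way to count link errors only on entities shared by gold and prediction is sketched below in plain Python. The counting rules (true negatives ignored, a wrong ID counted as both a false positive and a false negative) and the use of identical character offsets as the overlap test are assumptions of this sketch, and `link_prf` is a hypothetical helper.

```python
from typing import Dict, Iterable, Tuple

Offsets = Tuple[int, int]  # (start_char, end_char) of an entity


def link_prf(
    pred: Dict[Offsets, str],
    gold: Dict[Offsets, str],
    negative_labels: Iterable[str] = ("NIL",),
) -> Dict[str, float]:
    negatives = set(negative_labels)
    tp = fp = fn = 0
    for offsets, pred_id in pred.items():
        if offsets not in gold:
            continue  # entity boundary mismatch is an NER issue: not scored
        gold_id = gold[offsets]
        if gold_id in negatives and pred_id in negatives:
            continue  # true negative: not counted
        if gold_id == pred_id:
            tp += 1
        elif gold_id in negatives:
            fp += 1  # predicted a link where gold has none
        elif pred_id in negatives:
            fn += 1  # missed a link that gold has
        else:
            fp += 1  # a wrong ID is a spurious link...
            fn += 1  # ...and also a missed one
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return {"p": p, "r": r, "f": f}


gold_links = {(0, 5): "Q42", (10, 15): "NIL", (20, 25): "Q5", (30, 35): "Q7"}
pred_links = {(0, 5): "Q42", (10, 15): "Q1", (20, 25): "Q3", (40, 45): "Q9"}
nel_scores = link_prf(pred_links, gold_links)
print(nel_scores)
```

The entities at (30, 35) and (40, 45) appear on only one side, so they do not affect the linking scores in this sketch.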

get_ner_prf v3.0

Compute micro-PRF and per-entity PRF scores.

Name	Description
`examples`	The `Example` objects holding both the predictions and the correct gold-standard annotations. Iterable[Example]
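
As an illustration of how micro and per-entity-type scores relate, the following plain-Python sketch computes both from sets of labeled spans. It is not the library's code; `ner_prf`, the tuple layout and the result keys `"micro"` and `"per_type"` are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Ent = Tuple[str, int, int]  # (label, start, end)


def prf(tp: int, fp: int, fn: int) -> Dict[str, float]:
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return {"p": p, "r": r, "f": f}


def ner_prf(pred: List[Ent], gold: List[Ent]) -> Dict[str, Any]:
    pred_set, gold_set = set(pred), set(gold)
    tp = len(pred_set & gold_set)
    # Micro scores pool all entities regardless of label.
    micro = prf(tp, len(pred_set) - tp, len(gold_set) - tp)
    per_type = {}
    for label in sorted({e[0] for e in pred_set | gold_set}):
        pred_l = {e for e in pred_set if e[0] == label}
        gold_l = {e for e in gold_set if e[0] == label}
        tp_l = len(pred_l & gold_l)
        per_type[label] = prf(tp_l, len(pred_l) - tp_l, len(gold_l) - tp_l)
    return {"micro": micro, "per_type": per_type}


gold_ents = [("PERSON", 0, 2), ("ORG", 5, 7), ("ORG", 9, 10)]
pred_ents = [("PERSON", 0, 2), ("ORG", 5, 7), ("GPE", 9, 10)]
ner_scores = ner_prf(pred_ents, gold_ents)
print(ner_scores)
```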

score_coref_clusters experimental

Returns LEA (Moosavi and Strube, 2016) PRF scores for coreference clusters.

Name	Description
`examples`	The `Example` objects holding both the predictions and the correct gold-standard annotations. Iterable[Example]
keyword-only
`span_cluster_prefix`	The prefix used for spans representing coreference clusters. str
RETURNS	A dictionary containing the scores. Dict[str, Optional[float]]
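
For readers unfamiliar with LEA, the metric weighs each cluster by its size and scores it by the fraction of its coreference links that are recovered. The plain-Python sketch below follows that idea under simplifying assumptions: mentions are plain strings, singleton clusters are left out, and `n_links` and `lea_one_side` are hypothetical helpers rather than spaCy functions.

```python
from typing import List, Set


def n_links(n: int) -> int:
    """Number of coreference links among n mentions."""
    return n * (n - 1) // 2


def lea_one_side(keys: List[Set[str]], responses: List[Set[str]]) -> float:
    """Size-weighted share of the links in `keys` that also occur in `responses`."""
    weighted = importance = 0.0
    for key in keys:
        if len(key) < 2:
            continue  # singletons are left out of this sketch
        resolved = sum(n_links(len(key & resp)) for resp in responses)
        weighted += len(key) * resolved / n_links(len(key))
        importance += len(key)
    return weighted / importance if importance else 0.0


gold_clusters = [{"a", "b", "c"}, {"d", "e"}]
pred_clusters = [{"a", "b"}, {"c", "d", "e"}]
lea_r = lea_one_side(gold_clusters, pred_clusters)  # recall: gold as keys
lea_p = lea_one_side(pred_clusters, gold_clusters)  # precision: predictions as keys
lea_f = 2 * lea_p * lea_r / (lea_p + lea_r) if lea_p + lea_r else 0.0
print(lea_p, lea_r, lea_f)
```

In this example only one of the three links in the gold cluster {a, b, c} is recovered, while the single link in {d, e} is recovered in full, which gives a recall of 3/5.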

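score_span_predictions experimental

Returns the accuracy of reconstructing spans from single tokens. Only exactly correct predictions are counted as correct; there is no partial credit for near answers. Used by the SpanResolver.

Name	Description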
`examples`	The `Example` objects holding both the predictions and the correct gold-standard annotations. Iterable[Example]
keyword-only
`output_prefix`	The prefix used for spans representing the final predicted spans. str
RETURNS	A dictionary containing the scores. Dict[str, Optional[float]]
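
The "no partial credit" rule amounts to exact-match accuracy. A minimal plain-Python sketch, assuming that predicted and gold spans are already paired one to one (an assumption of this example), with the hypothetical helper `exact_span_accuracy`:

```python
from typing import List, Tuple

TokenSpan = Tuple[int, int]  # (start_token, end_token)


def exact_span_accuracy(pred: List[TokenSpan], gold: List[TokenSpan]) -> float:
    """Share of gold spans whose paired prediction matches both boundaries."""
    if not gold:
        return 0.0
    correct = sum(int(p == g) for p, g in zip(pred, gold))
    return correct / len(gold)


gold_spans = [(0, 2), (5, 8), (10, 11)]
pred_spans = [(0, 2), (5, 7), (10, 11)]  # the second span is off by one token
span_accuracy = exact_span_accuracy(pred_spans, gold_spans)
print(span_accuracy)
```

The near miss `(5, 7)` counts as fully wrong, so the accuracy is 2/3.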
