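Scorer
The `Scorer` computes evaluation scores. It is typically created by `Language.evaluate`. In addition, the `Scorer` provides a number of evaluation methods for evaluating `Token` and `Doc` attributes.
Scorer.__init__ method
Create a new `Scorer`.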
| Name | Description |
|---|---|
| `nlp` | The pipeline to use for scoring, where each pipeline component may provide a scoring method. If none is provided, a default pipeline is constructed using the `default_lang` and `default_pipeline` settings. `Optional[Language]` |
| `default_lang` | The language to use for a default pipeline if `nlp` is not provided. Defaults to `"xx"`. `str` |
| `default_pipeline` | The pipeline components to use for a default pipeline if `nlp` is not provided. Defaults to `("senter", "tagger", "morphologizer", "parser", "ner", "textcat")`. `Iterable[str]` |
| *keyword-only* | |
| `**kwargs` | Any additional settings to pass on to the individual scoring methods. `Any` |
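Scorer.score method
Calculate the scores for a list of `Example` objects using the scoring methods provided by the components in the pipeline.
The returned `Dict` contains the scores provided by the individual pipeline components. For the scoring methods provided by the `Scorer` and used by the core pipeline components, the individual score names start with the `Token` or `Doc` attribute being scored:
- `token_acc`, `token_p`, `token_r`, `token_f`
- `sents_p`, `sents_r`, `sents_f`
- `tag_acc`
- `pos_acc`
- `morph_acc`, `morph_micro_p`, `morph_micro_r`, `morph_micro_f`, `morph_per_feat`
- `lemma_acc`
- `dep_uas`, `dep_las`, `dep_las_per_type`
- `ents_p`, `ents_r`, `ents_f`, `ents_per_type`
- `spans_sc_p`, `spans_sc_r`, `spans_sc_f`
- `cats_score` (depends on config, description provided in `cats_score_desc`), `cats_micro_p`, `cats_micro_r`, `cats_micro_f`, `cats_macro_p`, `cats_macro_r`, `cats_macro_f`, `cats_macro_auc`, `cats_f_per_type`, `cats_auc_per_type`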
| Name | Description |
|---|---|
| `examples` | The `Example` objects holding both the predictions and the correct gold-standard annotations. `Iterable[Example]` |
| *keyword-only* | |
| `per_component` v3.6 | Whether to return the scores keyed by component name. Defaults to `False`. `bool` |
| RETURNS | A dictionary of scores. `Dict[str, Union[float, Dict[str, float]]]` |
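Because the returned dictionary mixes plain floats with nested per-type dictionaries, it can be convenient to flatten it before logging. The following is a minimal sketch in plain Python; the helper `flatten_scores` and the sample values are hypothetical and not part of the `Scorer` API.

```python
from typing import Any, Dict


def flatten_scores(scores: Dict[str, Any], prefix: str = "") -> Dict[str, float]:
    """Flatten nested score dictionaries into 'outer/inner' keys."""
    flat: Dict[str, float] = {}
    for key, value in scores.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into per-type or per-feature dictionaries.
            flat.update(flatten_scores(value, prefix=f"{name}/"))
        elif isinstance(value, (int, float)):
            flat[name] = float(value)
        # None (inapplicable score) and text values are skipped.
    return flat


# Hypothetical values, shaped like the return type described above.
example_scores = {
    "token_acc": 1.0,
    "tag_acc": 0.5,
    "ents_per_type": {"PERSON": {"p": 1.0, "r": 0.5, "f": 0.6666666666666666}},
    "cats_score_desc": "macro F",
    "cats_macro_auc": None,
}
flat = flatten_scores(example_scores)
print(sorted(flat))
```

Skipping `None` and text entries is a design choice of this sketch; a logger that needs to record inapplicable scores would keep them instead.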
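Scorer.score_tokenization staticmethod v3.0
Scores the tokenization:
- `token_acc`: number of correct tokens / number of predicted tokens
- `token_p`, `token_r`, `token_f`: precision, recall and F-score for token character spans
Docs with `has_unknown_spaces` are skipped during scoring.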
| Name | Description |
|---|---|
| `examples` | The `Example` objects holding both the predictions and the correct gold-standard annotations. `Iterable[Example]` |
| RETURNS | A dictionary containing the scores `token_acc`, `token_p`, `token_r`, `token_f`. `Dict[str, float]` |
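To illustrate what precision, recall and F-score over token character spans mean, here is a simplified sketch in plain Python. It is not spaCy's implementation: the helpers `char_spans` and `token_prf` are hypothetical, tokens are plain strings, and offsets are recovered by scanning the text.

```python
from typing import Dict, List, Set, Tuple


def char_spans(tokens: List[str], text: str) -> Set[Tuple[int, int]]:
    """Map tokens to (start, end) character offsets by scanning the text."""
    spans = set()
    pos = 0
    for tok in tokens:
        start = text.index(tok, pos)
        end = start + len(tok)
        spans.add((start, end))
        pos = end
    return spans


def token_prf(pred: List[str], gold: List[str], text: str) -> Dict[str, float]:
    """PRF over character spans: a token is correct only if both offsets match."""
    pred_spans = char_spans(pred, text)
    gold_spans = char_spans(gold, text)
    tp = len(pred_spans & gold_spans)
    p = tp / len(pred_spans) if pred_spans else 0.0
    r = tp / len(gold_spans) if gold_spans else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return {"token_p": p, "token_r": r, "token_f": f}


text = "I can't go"
gold_tokens = ["I", "ca", "n't", "go"]
pred_tokens = ["I", "can't", "go"]
tok_scores = token_prf(pred_tokens, gold_tokens, text)
print(tok_scores)
```

In this example the predicted token `can't` matches neither gold span, so two of three predicted tokens and two of four gold tokens are correct.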
Scorer.score_token_attr staticmethod v3.0
Scores a single token attribute. Tokens with missing values in the reference doc are skipped during scoring.
| Name | Description |
|---|---|
| `examples` | The `Example` objects holding both the predictions and the correct gold-standard annotations. `Iterable[Example]` |
| `attr` | The attribute to score. `str` |
| *keyword-only* | |
| `getter` | Defaults to `getattr`. If provided, `getter(token, attr)` should return the value of the attribute for an individual `Token`. `Callable[[Token, str], Any]` |
| `missing_values` | Attribute values to treat as missing annotation in the reference annotation. Defaults to `{0, None, ""}`. `Set[Any]` |
| RETURNS | A dictionary containing the score `{attr}_acc`. `Dict[str, float]` |
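The effect of `missing_values` can be sketched in plain Python. This is a simplified illustration, not spaCy's code: the helper `attr_accuracy` is hypothetical, tokens are assumed to be already aligned, and each pair holds a predicted and a reference value.

```python
from typing import Any, Dict, FrozenSet, Iterable, Optional, Tuple


def attr_accuracy(
    pairs: Iterable[Tuple[Any, Any]],
    attr: str,
    missing_values: FrozenSet[Any] = frozenset({0, None, ""}),
) -> Dict[str, Optional[float]]:
    """Accuracy over (predicted, reference) values, skipping missing references."""
    correct = 0
    total = 0
    for pred_value, gold_value in pairs:
        if gold_value in missing_values:
            continue  # no reference annotation: the token is not scored
        total += 1
        if pred_value == gold_value:
            correct += 1
    # Assumption of this sketch: report None when nothing could be scored.
    return {f"{attr}_acc": correct / total if total else None}


pos_pairs = [("NOUN", "NOUN"), ("VERB", "NOUN"), ("ADJ", ""), ("DET", "DET")]
pos_scores = attr_accuracy(pos_pairs, "pos")
print(pos_scores)
```

The third token has an empty reference value, so only three tokens are scored and two of them are correct.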
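Scorer.score_token_attr_per_feat staticmethod v3.0
Scores a single token attribute per feature, for a token attribute in the Universal Dependencies FEATS format. Tokens with missing values in the reference doc are skipped during scoring.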
| Name | Description |
|---|---|
| `examples` | The `Example` objects holding both the predictions and the correct gold-standard annotations. `Iterable[Example]` |
| `attr` | The attribute to score. `str` |
| *keyword-only* | |
| `getter` | Defaults to `getattr`. If provided, `getter(token, attr)` should return the value of the attribute for an individual `Token`. `Callable[[Token, str], Any]` |
| `missing_values` | Attribute values to treat as missing annotation in the reference annotation. Defaults to `{0, None, ""}`. `Set[Any]` |
| RETURNS | A dictionary containing the micro PRF scores under the keys `{attr}_micro_p`, `{attr}_micro_r`, `{attr}_micro_f` and the per-feature PRF scores under `{attr}_per_feat`. `Dict[str, Dict[str, float]]` |
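As a rough illustration of per-feature scoring, the sketch below splits FEATS strings such as `Case=Nom|Number=Sing` into individual features and counts matches per feature name. It is a simplified stand-in, not spaCy's implementation; `score_feats` and the sample data are hypothetical, and tokens are assumed to be aligned.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def prf(tp: int, fp: int, fn: int) -> Dict[str, float]:
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return {"p": p, "r": r, "f": f}


def score_feats(pairs: Iterable[Tuple[str, str]], attr: str = "morph") -> Dict[str, object]:
    """Micro and per-feature PRF over (predicted, reference) FEATS strings."""
    counts = defaultdict(lambda: [0, 0, 0])  # feature name -> [tp, fp, fn]
    for pred, gold in pairs:
        if gold == "":
            continue  # missing reference annotation: skip the token
        pred_feats = set(pred.split("|")) if pred else set()
        gold_feats = set(gold.split("|"))
        for feat in pred_feats & gold_feats:
            counts[feat.split("=")[0]][0] += 1
        for feat in pred_feats - gold_feats:
            counts[feat.split("=")[0]][1] += 1
        for feat in gold_feats - pred_feats:
            counts[feat.split("=")[0]][2] += 1
    # Micro scores pool the counts of all features.
    micro = prf(*(sum(c[i] for c in counts.values()) for i in range(3)))
    return {
        f"{attr}_micro_p": micro["p"],
        f"{attr}_micro_r": micro["r"],
        f"{attr}_micro_f": micro["f"],
        f"{attr}_per_feat": {name: prf(*c) for name, c in counts.items()},
    }


feat_pairs = [
    ("Case=Nom|Number=Sing", "Case=Nom|Number=Plur"),
    ("Tense=Past", "Tense=Past"),
    ("Case=Acc", ""),
]
feat_scores = score_feats(feat_pairs)
print(feat_scores)
```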
Scorer.score_spans staticmethod v3.0
Returns PRF scores for labeled or unlabeled spans.
| Name | Description |
|---|---|
| `examples` | The `Example` objects holding both the predictions and the correct gold-standard annotations. `Iterable[Example]` |
| `attr` | The attribute to score. `str` |
| *keyword-only* | |
| `getter` | Defaults to `getattr`. If provided, `getter(doc, attr)` should return the `Span` objects for an individual `Doc`. `Callable[[Doc, str], Iterable[Span]]` |
| `has_annotation` | Defaults to `None`. If provided, `has_annotation(doc)` should return whether a `Doc` has annotation for this `attr`. Docs without annotation are skipped for scoring purposes. `Optional[Callable[[Doc], bool]]` |
| `labeled` | Defaults to `True`. If set to `False`, two spans are considered equal if their start and end match, irrespective of their label. `bool` |
| `allow_overlap` | Defaults to `False`. Whether or not to allow overlapping spans. If set to `False`, the alignment will automatically resolve conflicts. `bool` |
| RETURNS | A dictionary containing the PRF scores under the keys `{attr}_p`, `{attr}_r`, `{attr}_f` and the per-type PRF scores under `{attr}_per_type`. `Dict[str, Union[float, Dict[str, float]]]` |
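The difference between labeled and unlabeled span matching can be sketched in plain Python. This is an illustration under simplifying assumptions, not spaCy's code: spans are `(start, end, label)` tuples, and `span_prf` is a hypothetical helper.

```python
from typing import Dict, Set, Tuple

SpanTuple = Tuple[int, int, str]  # (start, end, label)


def span_prf(
    pred: Set[SpanTuple], gold: Set[SpanTuple], attr: str = "ents", labeled: bool = True
) -> Dict[str, float]:
    """PRF over spans; with labeled=False only the boundaries must match."""
    if labeled:
        pred_keys, gold_keys = set(pred), set(gold)
    else:
        pred_keys = {(start, end) for start, end, _ in pred}
        gold_keys = {(start, end) for start, end, _ in gold}
    tp = len(pred_keys & gold_keys)
    p = tp / len(pred_keys) if pred_keys else 0.0
    r = tp / len(gold_keys) if gold_keys else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return {f"{attr}_p": p, f"{attr}_r": r, f"{attr}_f": f}


gold_spans = {(0, 2, "PERSON"), (5, 6, "ORG")}
pred_spans = {(0, 2, "PERSON"), (5, 6, "GPE"), (8, 9, "ORG")}
labeled_scores = span_prf(pred_spans, gold_spans, labeled=True)
unlabeled_scores = span_prf(pred_spans, gold_spans, labeled=False)
print(labeled_scores, unlabeled_scores)
```

The span at `(5, 6)` has the right boundaries but the wrong label, so it counts as a match only in the unlabeled setting.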
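Scorer.score_deps staticmethod v3.0
Calculate the UAS, LAS, and LAS-per-type scores for dependency parses. Tokens with a missing value for `attr` (typically `dep`) are skipped during scoring.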
| Name | Description |
|---|---|
| `examples` | The `Example` objects holding both the predictions and the correct gold-standard annotations. `Iterable[Example]` |
| `attr` | The attribute to score. `str` |
| *keyword-only* | |
| `getter` | Defaults to `getattr`. If provided, `getter(token, attr)` should return the value of the attribute for an individual `Token`. `Callable[[Token, str], Any]` |
| `head_attr` | The attribute containing the head token. `str` |
| `head_getter` | Defaults to `getattr`. If provided, `head_getter(token, attr)` should return the head for an individual `Token`. `Callable[[Token, str], Token]` |
| `ignore_labels` | Labels to ignore while scoring (e.g. `"punct"`). `Iterable[str]` |
| `missing_values` | Attribute values to treat as missing annotation in the reference annotation. Defaults to `{0, None, ""}`. `Set[Any]` |
| RETURNS | A dictionary containing the scores `{attr}_uas`, `{attr}_las`, and `{attr}_las_per_type`. `Dict[str, Union[float, Dict[str, float]]]` |
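The sketch below illustrates UAS and LAS in plain Python. It assumes identical tokenization in prediction and reference, so both scores reduce to simple fractions, and it drops tokens whose reference label is missing or ignored. The helper `attachment_scores` is hypothetical and is not how spaCy computes these values internally.

```python
from typing import Dict, FrozenSet, Iterable, List, Optional, Tuple

Arc = Tuple[int, str]  # (index of the head token, dependency label)


def attachment_scores(
    pred: List[Arc],
    gold: List[Arc],
    attr: str = "dep",
    ignore_labels: Iterable[str] = (),
    missing_values: FrozenSet[object] = frozenset({0, None, ""}),
) -> Dict[str, Optional[float]]:
    """UAS: correct head. LAS: correct head and correct label."""
    ignored = set(ignore_labels)
    scored = uas_hits = las_hits = 0
    for (pred_head, pred_label), (gold_head, gold_label) in zip(pred, gold):
        if gold_label in missing_values or gold_label in ignored:
            continue  # token is not scored
        scored += 1
        if pred_head == gold_head:
            uas_hits += 1
            if pred_label == gold_label:
                las_hits += 1
    if scored == 0:
        return {f"{attr}_uas": None, f"{attr}_las": None}
    return {f"{attr}_uas": uas_hits / scored, f"{attr}_las": las_hits / scored}


gold_arcs = [(1, "nsubj"), (1, "ROOT"), (1, "dobj"), (1, "punct")]
pred_arcs = [(1, "nsubj"), (1, "ROOT"), (1, "iobj"), (2, "punct")]
dep_scores = attachment_scores(pred_arcs, gold_arcs, ignore_labels=["punct"])
print(dep_scores)
```

With `punct` ignored, three tokens are scored: all three heads are correct, but one label is wrong.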
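Scorer.score_cats staticmethod v3.0
Calculate PRF and ROC AUC scores for a doc-level attribute that is a dictionary containing scores for each label, like `Doc.cats`. The returned dictionary contains the following scores:
- `{attr}_micro_p`, `{attr}_micro_r` and `{attr}_micro_f`: each instance across each label is weighted equally
- `{attr}_macro_p`, `{attr}_macro_r` and `{attr}_macro_f`: the average values across evaluations per label
- `{attr}_f_per_type` and `{attr}_auc_per_type`: each contains a dictionary of scores, keyed by label
- A final `{attr}_score` and corresponding `{attr}_score_desc` (text description)
The reported `{attr}_score` depends on the classification properties:
- binary exclusive with positive label: `{attr}_score` is set to the F-score of the positive label
- 3+ exclusive classes, macro-averaged F-score: `{attr}_score = {attr}_macro_f`
- multilabel, macro-averaged AUC: `{attr}_score = {attr}_macro_auc`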
| Name | Description |
|---|---|
| `examples` | The `Example` objects holding both the predictions and the correct gold-standard annotations. `Iterable[Example]` |
| `attr` | The attribute to score. `str` |
| *keyword-only* | |
| `getter` | Defaults to `getattr`. If provided, `getter(doc, attr)` should return the cats for an individual `Doc`. `Callable[[Doc, str], Dict[str, float]]` |
| `labels` | The set of possible labels. Defaults to `[]`. `Iterable[str]` |
| `multi_label` | Whether the attribute allows multiple labels. Defaults to `True`. When set to `False` (exclusive labels), missing gold labels are interpreted as `0.0` and the threshold is set to `0.0`. `bool` |
| `positive_label` | The positive label for a binary task with exclusive classes. Defaults to `None`. `Optional[str]` |
| `threshold` | Cutoff to consider a prediction "positive". Defaults to `0.5` for multi-label, and `0.0` (i.e. whatever's highest scoring) otherwise. `float` |
| RETURNS | A dictionary containing the scores, with inapplicable scores as `None`. `Dict[str, Optional[float]]` |
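The distinction between micro and macro averaging can be illustrated with a plain-Python sketch for the multi-label case. It is a simplification, not spaCy's implementation: `micro_macro_f` is a hypothetical helper, a prediction counts as positive at or above the threshold, and a gold label counts as positive when its value is greater than zero. AUC is left out.

```python
from typing import Dict, List


def f_score(tp: int, fp: int, fn: int) -> float:
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0


def micro_macro_f(
    gold_docs: List[Dict[str, float]],
    pred_docs: List[Dict[str, float]],
    labels: List[str],
    threshold: float = 0.5,
) -> Dict[str, float]:
    counts = {label: [0, 0, 0] for label in labels}  # label -> [tp, fp, fn]
    for gold, pred in zip(gold_docs, pred_docs):
        for label in labels:
            is_gold = gold.get(label, 0.0) > 0
            is_pred = pred.get(label, 0.0) >= threshold
            if is_pred and is_gold:
                counts[label][0] += 1
            elif is_pred:
                counts[label][1] += 1
            elif is_gold:
                counts[label][2] += 1
    # Micro: pool the counts of all labels, then compute one F-score.
    micro = f_score(*(sum(c[i] for c in counts.values()) for i in range(3)))
    # Macro: compute one F-score per label, then average them.
    macro = sum(f_score(*c) for c in counts.values()) / len(labels)
    return {"cats_micro_f": micro, "cats_macro_f": macro}


gold_cats = [{"A": 1.0, "B": 0.0}, {"A": 1.0, "B": 1.0}, {"A": 0.0, "B": 1.0}, {"A": 1.0, "B": 0.0}]
pred_cats = [{"A": 0.9, "B": 0.2}, {"A": 0.8, "B": 0.1}, {"A": 0.7, "B": 0.6}, {"A": 0.6, "B": 0.3}]
cat_scores = micro_macro_f(gold_cats, pred_cats, labels=["A", "B"])
print(cat_scores)
```

Label `B` is rarer and scored worse than `A`, so the macro average, which weights both labels equally, comes out lower than the micro average.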
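Scorer.score_links staticmethod v3.0
Returns PRF for predicted links on the entity level. To disentangle the performance of the NEL (named entity linking) from the NER (named entity recognition), this method only evaluates NEL links for entities that overlap between the gold reference and the predictions.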
| Name | Description |
|---|---|
| `examples` | The `Example` objects holding both the predictions and the correct gold-standard annotations. `Iterable[Example]` |
| *keyword-only* | |
| `negative_labels` | The string values that refer to no annotation (e.g. `"NIL"`). `Iterable[str]` |
| RETURNS | A dictionary containing the scores. `Dict[str, Optional[float]]` |
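The idea of scoring links only where gold and predicted entities coincide can be sketched as follows. This is a simplified illustration in plain Python, not spaCy's implementation: `link_prf` is a hypothetical helper, entities are keyed by `(start, end)` offsets, and the handling of negative labels is an assumption of this sketch.

```python
from typing import Dict, Iterable, Tuple

Offsets = Tuple[int, int]


def link_prf(
    pred: Dict[Offsets, str], gold: Dict[Offsets, str], negative_labels: Iterable[str] = ()
) -> Dict[str, float]:
    """PRF over entity links, restricted to spans present in both gold and pred."""
    negative = set(negative_labels)
    tp = fp = fn = 0
    for offsets, pred_id in pred.items():
        if offsets not in gold:
            continue  # an NER error, not a linking error: not evaluated here
        gold_id = gold[offsets]
        gold_nil, pred_nil = gold_id in negative, pred_id in negative
        if gold_nil and pred_nil:
            continue  # both agree there is no link
        if gold_nil:
            fp += 1
        elif pred_nil:
            fn += 1
        elif pred_id == gold_id:
            tp += 1
        else:
            fp += 1  # a wrong link was predicted ...
            fn += 1  # ... and the correct one was missed
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return {"p": p, "r": r, "f": f}


gold_links = {(0, 2): "Q1", (5, 6): "Q2", (8, 9): "NIL", (11, 12): "Q5"}
pred_links = {(0, 2): "Q1", (5, 6): "Q3", (8, 9): "Q4", (20, 21): "Q9"}
nel_scores = link_prf(pred_links, gold_links, negative_labels=["NIL"])
print(nel_scores)
```

The entities at `(11, 12)` and `(20, 21)` appear on only one side, so they do not affect the linking scores.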
get_ner_prf v3.0
Compute micro-PRF and per-entity PRF scores.
| Name | Description |
|---|---|
| `examples` | The `Example` objects holding both the predictions and the correct gold-standard annotations. `Iterable[Example]` |
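A per-entity-type breakdown alongside the micro scores can be sketched in plain Python as below. This is an illustration only: `ner_prf` is a hypothetical helper operating on `(start, end, label)` tuples, and the result layout is an assumption of this sketch rather than the documented return value.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

Entity = Tuple[int, int, str]  # (start, end, label)


def prf(tp: int, fp: int, fn: int) -> Dict[str, float]:
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return {"p": p, "r": r, "f": f}


def ner_prf(pred: Set[Entity], gold: Set[Entity]) -> Dict[str, object]:
    """Micro PRF over all entities plus PRF per entity label."""
    counts = defaultdict(lambda: [0, 0, 0])  # label -> [tp, fp, fn]
    for ent in pred & gold:
        counts[ent[2]][0] += 1
    for ent in pred - gold:
        counts[ent[2]][1] += 1
    for ent in gold - pred:
        counts[ent[2]][2] += 1
    micro = prf(*(sum(c[i] for c in counts.values()) for i in range(3)))
    return {"micro": micro, "per_type": {label: prf(*c) for label, c in counts.items()}}


gold_ents = {(0, 2, "PERSON"), (5, 6, "ORG")}
pred_ents = {(0, 2, "PERSON"), (5, 6, "GPE")}
ner_scores = ner_prf(pred_ents, gold_ents)
print(ner_scores)
```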
score_coref_clusters experimental
Returns LEA (Moosavi and Strube, 2016) PRF scores for coreference clusters.
| Name | Description |
|---|---|
| `examples` | The `Example` objects holding both the predictions and the correct gold-standard annotations. `Iterable[Example]` |
| *keyword-only* | |
| `span_cluster_prefix` | The prefix used for spans representing coreference clusters. `str` |
| RETURNS | A dictionary containing the scores. `Dict[str, Optional[float]]` |
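For readers unfamiliar with LEA, the following plain-Python sketch shows the core of the metric: each cluster is weighted by its size, and its resolution score is the fraction of its coreference links that are recovered by the other side. The helpers are hypothetical, mentions are represented as integers, and singleton clusters are simply skipped here, which is a simplification of this sketch rather than part of the original metric.

```python
from typing import Dict, List, Set


def n_links(n: int) -> int:
    """Number of coreference links among n mentions."""
    return n * (n - 1) // 2


def lea_one_side(keys: List[Set[int]], responses: List[Set[int]]) -> float:
    """Size-weighted share of links in `keys` that are found in `responses`."""
    weighted = 0.0
    total_size = 0
    for key in keys:
        if len(key) < 2:
            continue  # simplification: singletons are skipped in this sketch
        resolved = sum(n_links(len(key & resp)) for resp in responses)
        weighted += len(key) * resolved / n_links(len(key))
        total_size += len(key)
    return weighted / total_size if total_size else 0.0


def lea_prf(pred: List[Set[int]], gold: List[Set[int]]) -> Dict[str, float]:
    r = lea_one_side(gold, pred)  # recall: gold links found in predictions
    p = lea_one_side(pred, gold)  # precision: predicted links found in gold
    f = 2 * p * r / (p + r) if p + r else 0.0
    return {"p": p, "r": r, "f": f}


gold_clusters = [{1, 2, 3}, {4, 5}]
pred_clusters = [{1, 2}, {4, 5}]
lea_scores = lea_prf(pred_clusters, gold_clusters)
print(lea_scores)
```

Every predicted link is correct, so precision is 1.0, while the gold cluster `{1, 2, 3}` has only one of its three links recovered, which lowers recall.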
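score_span_predictions experimental
Returns the accuracy for reconstructing spans from single tokens. Only exactly correct predictions count as correct; there is no partial credit for near answers. Used by the `SpanResolver`.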
| Name | Description |
|---|---|
| `examples` | The `Example` objects holding both the predictions and the correct gold-standard annotations. `Iterable[Example]` |
| *keyword-only* | |
| `output_prefix` | The prefix used for spans representing the final predicted spans. `str` |
| RETURNS | A dictionary containing the scores. `Dict[str, Optional[float]]` |
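Exact-match accuracy without partial credit can be sketched in a few lines of plain Python. The helper `exact_span_accuracy` is hypothetical, and the sketch assumes that predicted and reference spans are already paired one to one as `(start, end)` offsets.

```python
from typing import List, Optional, Tuple

Offsets = Tuple[int, int]


def exact_span_accuracy(pred: List[Offsets], gold: List[Offsets]) -> Optional[float]:
    """Share of predicted spans whose start and end both match the reference."""
    if len(pred) != len(gold):
        raise ValueError("expected one predicted span per reference span")
    if not gold:
        return None
    correct = sum(1 for p, g in zip(pred, gold) if p == g)
    return correct / len(gold)


gold_offsets = [(0, 3), (5, 6), (8, 12)]
pred_offsets = [(0, 3), (5, 7), (8, 12)]
span_accuracy = exact_span_accuracy(pred_offsets, gold_offsets)
print(span_accuracy)
```

The second prediction is off by one token at its end and therefore counts as fully wrong.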