.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/model_selection/plot_roc.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_model_selection_plot_roc.py>`
        to download the full example code, or to run this example in your browser via Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_model_selection_plot_roc.py:

====================================================
Multiclass Receiver Operating Characteristic (ROC)
====================================================

This example describes the use of the Receiver Operating Characteristic (ROC)
metric to evaluate the quality of multiclass classifiers.

ROC curves typically feature the true positive rate (TPR) on the Y axis and the
false positive rate (FPR) on the X axis. This means that the top-left corner of
the plot is the "ideal" point: an FPR of zero and a TPR of one. This is not very
realistic, but it does mean that a larger area under the curve (AUC) is usually
better. The "steepness" of ROC curves is also important, since it is ideal to
maximize the TPR while minimizing the FPR.

ROC curves are typically used in binary classification, where the TPR and FPR
can be defined unambiguously. In the multiclass case, a notion of TPR or FPR is
obtained only after binarizing the output. This can be done in two different
ways:

- the One-vs-Rest scheme compares each class against all the others (assumed as
  one);
- the One-vs-One scheme compares every unique pairwise combination of classes.

In this example we explore both schemes, and demonstrate the concepts of micro-
and macro-averaging as different ways of summarizing the information of the
multiclass ROC curves.

.. note::

    See :ref:`sphx_glr_auto_examples_model_selection_plot_roc_crossval.py` for
    an extension of the present example that estimates the variance of the ROC
    curves and of their respective AUC.

.. GENERATED FROM PYTHON SOURCE LINES 23-29

Load and prepare data
=====================

We import the :ref:`iris_dataset`, which contains 3 classes, each one
corresponding to a type of iris plant. One class is linearly separable from the
other two; the latter are **not** linearly separable from each other.

Here we binarize the output and add noisy features to make the problem harder.

.. GENERATED FROM PYTHON SOURCE LINES 29-51

.. code-block:: Python

    import numpy as np

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split

    iris = load_iris()
    target_names = iris.target_names
    X, y = iris.data, iris.target
    y = iris.target_names[y]

    random_state = np.random.RandomState(0)
    n_samples, n_features = X.shape
    n_classes = len(np.unique(y))
    X = np.concatenate([X, random_state.randn(n_samples, 200 * n_features)], axis=1)
    (
        X_train,
        X_test,
        y_train,
        y_test,
    ) = train_test_split(X, y, test_size=0.5, stratify=y, random_state=0)

.. GENERATED FROM PYTHON SOURCE LINES 52-53

We train a :class:`~sklearn.linear_model.LogisticRegression` model, which can
naturally handle multiclass problems thanks to the use of the multinomial
formulation.
.. GENERATED FROM PYTHON SOURCE LINES 53-60

.. code-block:: Python

    from sklearn.linear_model import LogisticRegression

    classifier = LogisticRegression()
    y_score = classifier.fit(X_train, y_train).predict_proba(X_test)

.. GENERATED FROM PYTHON SOURCE LINES 61-69

One-vs-Rest multiclass ROC
==========================

The One-vs-Rest (OvR) multiclass strategy, also known as one-vs-all, consists
in computing a ROC curve per each of the `n_classes`. In each step, a given
class is regarded as the positive class and the remaining classes are regarded
as the negative class as a bulk.

.. note:: One should not confuse the OvR strategy used for the **evaluation**
    of multiclass classifiers with the OvR strategy used to **train** a
    multiclass classifier by fitting a set of binary classifiers (for instance
    via the :class:`~sklearn.multiclass.OneVsRestClassifier` meta-estimator).
    The OvR ROC evaluation can be used to scrutinize any kind of classification
    model irrespective of how it was trained (see :ref:`multiclass`).

In this section we use a :class:`~sklearn.preprocessing.LabelBinarizer` to
binarize the target by one-hot-encoding in an OvR fashion. This means that the
target of shape (`n_samples`,) is mapped to a target of shape (`n_samples`,
`n_classes`).

.. GENERATED FROM PYTHON SOURCE LINES 69-76

.. code-block:: Python

    from sklearn.preprocessing import LabelBinarizer

    label_binarizer = LabelBinarizer().fit(y_train)
    y_onehot_test = label_binarizer.transform(y_test)
    y_onehot_test.shape  # (n_samples, n_classes)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    (75, 3)

.. GENERATED FROM PYTHON SOURCE LINES 77-78

We can as well easily check the encoding of a specific class:

.. GENERATED FROM PYTHON SOURCE LINES 78-82

.. code-block:: Python

    label_binarizer.transform(["virginica"])

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    array([[0, 0, 1]])

.. GENERATED FROM PYTHON SOURCE LINES 83-87

ROC curve showing a specific class
----------------------------------

In the following plot we show the resulting ROC curve when regarding the iris
flowers as either "virginica" (`class_id=2`) or "non-virginica" (the rest).

.. GENERATED FROM PYTHON SOURCE LINES 87-92

.. code-block:: Python

    class_of_interest = "virginica"
    class_id = np.flatnonzero(label_binarizer.classes_ == class_of_interest)[0]
    class_id

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    np.int64(2)

.. GENERATED FROM PYTHON SOURCE LINES 93-110
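Before plotting the curve for this class, it may help to recall how a single
point of an OvR ROC curve arises. The following is a minimal, self-contained
sketch with made-up labels and scores (not the values from this example): at a
fixed decision threshold, the TPR and FPR are simple ratios of the confusion
matrix counts.

```python
# Hypothetical one-vs-rest ground truth (1 = positive class) and scores.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
threshold = 0.5

y_pred = [int(s >= threshold) for s in scores]
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

tpr = tp / (tp + fn)  # 2 of 3 positives recovered -> 0.67
fpr = fp / (fp + tn)  # 1 of 3 negatives misfired -> 0.33
```

Sweeping the threshold from the highest score to the lowest traces out the full
ROC curve, which is what :func:`~sklearn.metrics.roc_curve` does internally.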
.. code-block:: Python

    import matplotlib.pyplot as plt

    from sklearn.metrics import RocCurveDisplay

    display = RocCurveDisplay.from_predictions(
        y_onehot_test[:, class_id],
        y_score[:, class_id],
        name=f"{class_of_interest} vs the rest",
        color="darkorange",
        plot_chance_level=True,
    )
    _ = display.ax_.set(
        xlabel="False Positive Rate",
        ylabel="True Positive Rate",
        title="One-vs-Rest ROC curves:\nVirginica vs (Setosa & Versicolor)",
    )

.. image-sg:: /auto_examples/model_selection/images/sphx_glr_plot_roc_001.png
   :alt: One-vs-Rest ROC curves: Virginica vs (Setosa & Versicolor)
   :srcset: /auto_examples/model_selection/images/sphx_glr_plot_roc_001.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 111-121

ROC curve using micro-averaged OvR
----------------------------------

Micro-averaging aggregates the contributions from all the classes (using
:func:`numpy.ravel`) to compute the average metrics as follows:

:math:`TPR=\frac{\sum_{c}TP_c}{\sum_{c}(TP_c + FN_c)}` ;

:math:`FPR=\frac{\sum_{c}FP_c}{\sum_{c}(FP_c + TN_c)}` .

We can briefly demo the effect of :func:`numpy.ravel`:

.. GENERATED FROM PYTHON SOURCE LINES 121-126

.. code-block:: Python

    print(f"y_score:\n{y_score[0:2,:]}")
    print()
    print(f"y_score.ravel():\n{y_score[0:2,:].ravel()}")

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    y_score:
    [[0.38095776 0.05072909 0.56831315]
     [0.07031555 0.27915668 0.65052777]]

    y_score.ravel():
    [0.38095776 0.05072909 0.56831315 0.07031555 0.27915668 0.65052777]

.. GENERATED FROM PYTHON SOURCE LINES 127-128

In a multiclass setting with highly imbalanced classes, micro-averaging is
preferable over macro-averaging. In such cases, one can alternatively use a
weighted macro-averaging, not demoed here.

.. GENERATED FROM PYTHON SOURCE LINES 128-143

.. code-block:: Python

    display = RocCurveDisplay.from_predictions(
        y_onehot_test.ravel(),
        y_score.ravel(),
        name="micro-average OvR",
        color="darkorange",
        plot_chance_level=True,
    )
    _ = display.ax_.set(
        xlabel="False Positive Rate",
        ylabel="True Positive Rate",
        title="Micro-averaged One-vs-Rest\nReceiver Operating Characteristic",
    )
.. image-sg:: /auto_examples/model_selection/images/sphx_glr_plot_roc_002.png
   :alt: Micro-averaged One-vs-Rest Receiver Operating Characteristic
   :srcset: /auto_examples/model_selection/images/sphx_glr_plot_roc_002.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 144-145

In the case where the main interest is not the plot but the ROC-AUC score
itself, we can reproduce the value shown in the plot using
:func:`~sklearn.metrics.roc_auc_score`.

.. GENERATED FROM PYTHON SOURCE LINES 145-158

.. code-block:: Python

    from sklearn.metrics import roc_auc_score

    micro_roc_auc_ovr = roc_auc_score(
        y_test,
        y_score,
        multi_class="ovr",
        average="micro",
    )

    print(f"Micro-averaged One-vs-Rest ROC AUC score:\n{micro_roc_auc_ovr:.2f}")

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Micro-averaged One-vs-Rest ROC AUC score:
    0.77

.. GENERATED FROM PYTHON SOURCE LINES 159-160

This is equivalent to computing the ROC curve with
:func:`~sklearn.metrics.roc_curve` and then the area under the curve with
:func:`~sklearn.metrics.auc` for the raveled true and predicted classes.

.. GENERATED FROM PYTHON SOURCE LINES 160-172

.. code-block:: Python

    from sklearn.metrics import auc, roc_curve

    # store the fpr, tpr, and roc_auc of all averaging strategies
    fpr, tpr, roc_auc = dict(), dict(), dict()
    # Compute micro-average ROC curve and ROC area
    fpr["micro"], tpr["micro"], _ = roc_curve(y_onehot_test.ravel(), y_score.ravel())
    roc_auc["micro"] = auc(fpr["micro"], tpr["micro"])

    print(f"Micro-averaged One-vs-Rest ROC AUC score:\n{roc_auc['micro']:.2f}")

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Micro-averaged One-vs-Rest ROC AUC score:
    0.77

.. GENERATED FROM PYTHON SOURCE LINES 173-178

.. note:: By default, the computation of the ROC curve adds a single point at
    the maximal false positive rate by using linear interpolation and the
    McClish correction [:doi:`Analyzing a portion of the ROC curve Med Decis
    Making. 1989 Jul-Sep; 9(3):190-5.<10.1177/0272989x8900900307>`].

ROC curve using the OvR macro-average
-------------------------------------

Obtaining the macro-average requires computing the metric independently for
each class and then taking the average over them, hence treating all classes
equally a priori. We first aggregate the true/false positive rates per class:

.. GENERATED FROM PYTHON SOURCE LINES 178-200
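For comparison with the micro-average formulas above, the macro-average can be
written as an unweighted mean of the per-class rates:

:math:`TPR_{macro}=\frac{1}{C}\sum_{c}TPR_c` , with
:math:`TPR_c=\frac{TP_c}{TP_c + FN_c}` ,

where :math:`C` is the number of classes. Note that rather than averaging at a
shared threshold, the code below averages each class's TPR as a function of the
FPR, by interpolating all per-class curves on a common FPR grid.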
.. code-block:: Python

    for i in range(n_classes):
        fpr[i], tpr[i], _ = roc_curve(y_onehot_test[:, i], y_score[:, i])
        roc_auc[i] = auc(fpr[i], tpr[i])

    fpr_grid = np.linspace(0.0, 1.0, 1000)

    # Interpolate all ROC curves at these points
    mean_tpr = np.zeros_like(fpr_grid)

    for i in range(n_classes):
        mean_tpr += np.interp(fpr_grid, fpr[i], tpr[i])  # linear interpolation

    # Average it and compute AUC
    mean_tpr /= n_classes

    fpr["macro"] = fpr_grid
    tpr["macro"] = mean_tpr
    roc_auc["macro"] = auc(fpr["macro"], tpr["macro"])

    print(f"Macro-averaged One-vs-Rest ROC AUC score:\n{roc_auc['macro']:.2f}")

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Macro-averaged One-vs-Rest ROC AUC score:
    0.78

.. GENERATED FROM PYTHON SOURCE LINES 201-202

This computation is equivalent to simply calling

.. GENERATED FROM PYTHON SOURCE LINES 202-213

.. code-block:: Python

    macro_roc_auc_ovr = roc_auc_score(
        y_test,
        y_score,
        multi_class="ovr",
        average="macro",
    )

    print(f"Macro-averaged One-vs-Rest ROC AUC score:\n{macro_roc_auc_ovr:.2f}")

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Macro-averaged One-vs-Rest ROC AUC score:
    0.78

.. GENERATED FROM PYTHON SOURCE LINES 214-216

Plot all OvR ROC curves together
--------------------------------

.. GENERATED FROM PYTHON SOURCE LINES 216-256
.. code-block:: Python

    from itertools import cycle

    fig, ax = plt.subplots(figsize=(6, 6))

    plt.plot(
        fpr["micro"],
        tpr["micro"],
        label=f"micro-average ROC curve (AUC = {roc_auc['micro']:.2f})",
        color="deeppink",
        linestyle=":",
        linewidth=4,
    )

    plt.plot(
        fpr["macro"],
        tpr["macro"],
        label=f"macro-average ROC curve (AUC = {roc_auc['macro']:.2f})",
        color="navy",
        linestyle=":",
        linewidth=4,
    )

    colors = cycle(["aqua", "darkorange", "cornflowerblue"])
    for class_id, color in zip(range(n_classes), colors):
        RocCurveDisplay.from_predictions(
            y_onehot_test[:, class_id],
            y_score[:, class_id],
            name=f"ROC curve for {target_names[class_id]}",
            color=color,
            ax=ax,
            plot_chance_level=(class_id == 2),
        )

    _ = ax.set(
        xlabel="False Positive Rate",
        ylabel="True Positive Rate",
        title="Extension of Receiver Operating Characteristic\nto One-vs-Rest multiclass",
    )

.. image-sg:: /auto_examples/model_selection/images/sphx_glr_plot_roc_003.png
   :alt: Extension of Receiver Operating Characteristic to One-vs-Rest multiclass
   :srcset: /auto_examples/model_selection/images/sphx_glr_plot_roc_003.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 257-267

One-vs-One multiclass ROC
=========================

The One-vs-One (OvO) multiclass strategy consists in fitting one classifier per
class pair. Since it requires training `n_classes` * (`n_classes` - 1) / 2
classifiers, this method is usually slower than One-vs-Rest due to its
O(`n_classes` ^2) complexity.

In this section, we demonstrate the macro-averaged AUC using the OvO scheme for
the 3 possible combinations in the :ref:`iris_dataset`: "setosa" vs
"versicolor", "versicolor" vs "virginica" and "virginica" vs "setosa". Notice
that micro-averaging is not defined for the OvO scheme.

ROC curve using the OvO macro-average
-------------------------------------

In the OvO scheme, the first step is to identify all possible unique
combinations of pairs. The computation of scores is done by treating one of the
elements in a given pair as the positive class and the other element as the
negative class, then re-computing the score by inverting the roles and taking
the mean of both scores.

.. GENERATED FROM PYTHON SOURCE LINES 267-273

.. code-block:: Python

    from itertools import combinations

    pair_list = list(combinations(np.unique(y), 2))
    print(pair_list)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [(np.str_('setosa'), np.str_('versicolor')), (np.str_('setosa'), np.str_('virginica')), (np.str_('versicolor'), np.str_('virginica'))]

.. GENERATED FROM PYTHON SOURCE LINES 274-327
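As a quick, self-contained check of the quadratic growth mentioned above
(independent of the iris data), the number of unique pairs returned by
:func:`itertools.combinations` indeed matches `n_classes` * (`n_classes` - 1) / 2
for a few hypothetical class counts:

```python
from itertools import combinations

# Number of OvO pairs for a few hypothetical class counts.
for n_classes_demo in (3, 5, 10):
    n_pairs = len(list(combinations(range(n_classes_demo), 2)))
    assert n_pairs == n_classes_demo * (n_classes_demo - 1) // 2
    print(n_classes_demo, "classes ->", n_pairs, "pairs")  # 3, 10, 45 pairs
```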
.. code-block:: Python

    pair_scores = []
    mean_tpr = dict()

    for ix, (label_a, label_b) in enumerate(pair_list):
        a_mask = y_test == label_a
        b_mask = y_test == label_b
        ab_mask = np.logical_or(a_mask, b_mask)

        a_true = a_mask[ab_mask]
        b_true = b_mask[ab_mask]

        idx_a = np.flatnonzero(label_binarizer.classes_ == label_a)[0]
        idx_b = np.flatnonzero(label_binarizer.classes_ == label_b)[0]

        fpr_a, tpr_a, _ = roc_curve(a_true, y_score[ab_mask, idx_a])
        fpr_b, tpr_b, _ = roc_curve(b_true, y_score[ab_mask, idx_b])

        mean_tpr[ix] = np.zeros_like(fpr_grid)
        mean_tpr[ix] += np.interp(fpr_grid, fpr_a, tpr_a)
        mean_tpr[ix] += np.interp(fpr_grid, fpr_b, tpr_b)
        mean_tpr[ix] /= 2
        mean_score = auc(fpr_grid, mean_tpr[ix])
        pair_scores.append(mean_score)

        fig, ax = plt.subplots(figsize=(6, 6))
        plt.plot(
            fpr_grid,
            mean_tpr[ix],
            label=f"Mean {label_a} vs {label_b} (AUC = {mean_score:.2f})",
            linestyle=":",
            linewidth=4,
        )
        RocCurveDisplay.from_predictions(
            a_true,
            y_score[ab_mask, idx_a],
            ax=ax,
            name=f"{label_a} as positive class",
        )
        RocCurveDisplay.from_predictions(
            b_true,
            y_score[ab_mask, idx_b],
            ax=ax,
            name=f"{label_b} as positive class",
            plot_chance_level=True,
        )
        ax.set(
            xlabel="False Positive Rate",
            ylabel="True Positive Rate",
            title=f"{target_names[idx_a]} vs {label_b} ROC curves",
        )

    print(f"Macro-averaged One-vs-One ROC AUC score:\n{np.average(pair_scores):.2f}")

.. rst-class:: sphx-glr-horizontal

    * .. image-sg:: /auto_examples/model_selection/images/sphx_glr_plot_roc_004.png
          :alt: setosa vs versicolor ROC curves
          :srcset: /auto_examples/model_selection/images/sphx_glr_plot_roc_004.png
          :class: sphx-glr-multi-img

    * .. image-sg:: /auto_examples/model_selection/images/sphx_glr_plot_roc_005.png
          :alt: setosa vs virginica ROC curves
          :srcset: /auto_examples/model_selection/images/sphx_glr_plot_roc_005.png
          :class: sphx-glr-multi-img
    * .. image-sg:: /auto_examples/model_selection/images/sphx_glr_plot_roc_006.png
          :alt: versicolor vs virginica ROC curves
          :srcset: /auto_examples/model_selection/images/sphx_glr_plot_roc_006.png
          :class: sphx-glr-multi-img

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Macro-averaged One-vs-One ROC AUC score:
    0.78

.. GENERATED FROM PYTHON SOURCE LINES 328-329

We can also assert that the macro-average we computed "by hand" is equivalent
to the implemented `average="macro"` option of the
:func:`~sklearn.metrics.roc_auc_score` function.

.. GENERATED FROM PYTHON SOURCE LINES 329-340

.. code-block:: Python

    macro_roc_auc_ovo = roc_auc_score(
        y_test,
        y_score,
        multi_class="ovo",
        average="macro",
    )

    print(f"Macro-averaged One-vs-One ROC AUC score:\n{macro_roc_auc_ovo:.2f}")

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Macro-averaged One-vs-One ROC AUC score:
    0.78

.. GENERATED FROM PYTHON SOURCE LINES 341-343

Plot all OvO ROC curves together
--------------------------------

.. GENERATED FROM PYTHON SOURCE LINES 343-374

.. code-block:: Python

    ovo_tpr = np.zeros_like(fpr_grid)

    fig, ax = plt.subplots(figsize=(6, 6))
    for ix, (label_a, label_b) in enumerate(pair_list):
        ovo_tpr += mean_tpr[ix]
        ax.plot(
            fpr_grid,
            mean_tpr[ix],
            label=f"Mean {label_a} vs {label_b} (AUC = {pair_scores[ix]:.2f})",
        )

    ovo_tpr /= len(pair_list)

    ax.plot(
        fpr_grid,
        ovo_tpr,
        label=f"One-vs-One macro-average (AUC = {macro_roc_auc_ovo:.2f})",
        linestyle=":",
        linewidth=4,
    )
    ax.plot([0, 1], [0, 1], "k--", label="Chance level (AUC = 0.5)")
    _ = ax.set(
        xlabel="False Positive Rate",
        ylabel="True Positive Rate",
        title="Extension of Receiver Operating Characteristic\nto One-vs-One multiclass",
        aspect="equal",
        xlim=(-0.01, 1.01),
        ylim=(-0.01, 1.01),
    )

.. image-sg:: /auto_examples/model_selection/images/sphx_glr_plot_roc_007.png
   :alt: Extension of Receiver Operating Characteristic to One-vs-One multiclass
   :srcset: /auto_examples/model_selection/images/sphx_glr_plot_roc_007.png
   :class: sphx-glr-single-img
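Micro- and macro-averaging behave differently under class imbalance. A
self-contained sketch with hypothetical per-class counts (not the iris data,
which is balanced) makes the difference concrete: pooling the counts lets the
frequent class dominate, while equal class weights expose the rare class.

```python
# Hypothetical per-class counts: (TP_c, TP_c + FN_c).
# Class 0 is frequent and easy; class 1 is rare and hard.
tp = [90, 1]
pos = [100, 10]

micro_tpr = sum(tp) / sum(pos)  # pooled counts
macro_tpr = sum(t / p for t, p in zip(tp, pos)) / 2  # equal class weights

print(f"{micro_tpr:.2f}")  # 0.83 -- dominated by the frequent class
print(f"{macro_tpr:.2f}")  # 0.50 -- weighs the rare class equally
```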
.. GENERATED FROM PYTHON SOURCE LINES 375-380

We confirm that the "versicolor" and "virginica" classes are not well
identified by a linear classifier. Notice that the "virginica"-vs-the-rest
ROC-AUC score (0.77) is between the OvO ROC-AUC scores of "versicolor" vs
"virginica" (0.64) and of "setosa" vs "virginica" (0.90). Indeed, the OvO
strategy gives additional information on the confusion between a pair of
classes, at the expense of computational cost when the number of classes is
large.

The OvO strategy is recommended if the user is mainly interested in correctly
identifying a particular class or subset of classes, whereas evaluating the
global performance of a classifier can still be summarized via a given
averaging strategy.

Micro-averaged OvR ROC is dominated by the more frequent classes, since the
counts are pooled. The macro-averaged alternative better reflects the
statistics of the less frequent classes, and is therefore more appropriate when
performance on all the classes is deemed equally important.

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.298 seconds)

.. _sphx_glr_download_auto_examples_model_selection_plot_roc.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/main?urlpath=lab/tree/notebooks/auto_examples/model_selection/plot_roc.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_roc.ipynb <plot_roc.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_roc.py <plot_roc.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_roc.zip <plot_roc.zip>`

.. include:: plot_roc.recommendations

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_