.. _data-transforms:

Dataset transformations
-----------------------

scikit-learn provides a library of transformers, which may clean (see :ref:`preprocessing`), reduce (see :ref:`data_reduction`), expand (see :ref:`kernel_approximation`) or generate (see :ref:`feature_extraction`) feature representations.

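Like other estimators, these transformers are represented by classes with a ``fit`` method, which learns model parameters (e.g. the mean and standard deviation used for standardization) from a training set, and a ``transform`` method, which applies the learned transformation to unseen data. ``fit_transform`` may be more convenient and efficient when the training data is to be modelled and transformed in one step.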
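
As a minimal sketch of this ``fit`` / ``transform`` / ``fit_transform`` pattern, the snippet below standardizes a small array with ``StandardScaler``. The arrays ``X_train`` and ``X_new`` are made-up illustration data, not part of any scikit-learn dataset.

.. code-block:: python

    import numpy as np
    from sklearn.preprocessing import StandardScaler

    # Hypothetical toy data: three training samples, one unseen sample.
    X_train = np.array([[0.0, 10.0], [2.0, 20.0], [4.0, 30.0]])
    X_new = np.array([[2.0, 40.0]])

    # fit learns the per-feature mean and standard deviation from the training set.
    scaler = StandardScaler()
    scaler.fit(X_train)

    # transform applies the learned statistics to data not seen during fit.
    X_new_scaled = scaler.transform(X_new)

    # fit_transform learns the statistics and transforms the training data in one call.
    X_train_scaled = StandardScaler().fit_transform(X_train)

Note that ``X_new`` is scaled with the statistics of ``X_train``; the scaler is not refitted on the new data.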

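Combining such transformers, either in parallel or in series, is covered in :ref:`combining_estimators`. :ref:`metrics` covers transforming feature spaces into affinity matrices, while :ref:`preprocessing_targets` considers transformations of the target space (e.g. categorical labels) for use in scikit-learn.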
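
The following sketch illustrates, under simple assumptions, what series and parallel composition can look like, together with a target-space transformation. The choice of ``StandardScaler``, ``PCA`` and ``LabelEncoder`` and the toy arrays ``X`` and ``y`` are illustrative only; the referenced sections describe the available tools in full.

.. code-block:: python

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.pipeline import make_pipeline, make_union
    from sklearn.preprocessing import LabelEncoder, StandardScaler

    # Hypothetical toy data: four samples, two features, string class labels.
    X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]])
    y = ["cat", "dog", "cat", "dog"]

    # Series: the output of the scaler is fed into PCA.
    in_series = make_pipeline(StandardScaler(), PCA(n_components=1))
    X_series = in_series.fit_transform(X)

    # Parallel: each transformer sees X, and the outputs are stacked column-wise.
    in_parallel = make_union(StandardScaler(), PCA(n_components=1))
    X_parallel = in_parallel.fit_transform(X)

    # Target space: string labels are mapped to integer codes.
    encoder = LabelEncoder()
    y_encoded = encoder.fit_transform(y)

Here ``X_series`` has one column (the single principal component), while ``X_parallel`` has three (two scaled features plus one component).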

.. toctree::
    :maxdepth: 2

    modules/compose
    modules/feature_extraction
    modules/preprocessing
    modules/impute
    modules/unsupervised_reduction
    modules/random_projection
    modules/kernel_approximation
    modules/metrics
    modules/preprocessing_targets