
Get Configspace

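This file is part of the TPOT library.

The current version of TPOT was developed at Cedars-Sinai by:
- Pedro Henrique Ribeiro (https://github.com/perib, https://www.linkedin.com/in/pedro-ribeiro/)
- Anil Saini (anil.saini@cshs.org)
- Jose Hernandez (jgh9094@gmail.com)
- Jay Moran (jay.moran@cshs.org)
- Nicholas Matsumoto (nicholas.matsumoto@cshs.org)
- Hyunjun Choi (hyunjun.choi@cshs.org)
- Miguel E. Hernandez (miguel.e.hernandez@cshs.org)
- Jason Moore (moorejh28@gmail.com)

The original version of TPOT was primarily developed at the University of Pennsylvania by:
- Randal S. Olson (rso@randalolson.com)
- Weixuan Fu (weixuanf@upenn.edu)
- Daniel Angell (dpa34@drexel.edu)
- Jason Moore (moorejh28@gmail.com)
- and many more generous open-source contributors

TPOT is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

TPOT is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with TPOT. If not, see http://www.gnu.org/licenses/.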

get_configspace(name, n_classes=3, n_samples=1000, n_features=100, random_state=None, n_jobs=1)

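This function returns the ConfigSpace.ConfigurationSpace containing the hyperparameter ranges for the given scikit-learn method. It also uses n_classes, n_samples, n_features, and random_state to set the hyperparameters that depend on those values. In the source below, methods with no tunable hyperparameters return an empty dict, and an unrecognized name raises ValueError.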

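Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the scikit-learn method for which to create the ConfigurationSpace (e.g. 'RandomForestClassifier' for sklearn.ensemble.RandomForestClassifier). | required |
| n_classes | int | Number of classes in the target variable. | 3 |
| n_samples | int | Number of samples in the dataset. | 1000 |
| n_features | int | Number of features in the dataset. | 100 |
| random_state | int | The random_state to use in the ConfigurationSpace. If None, the random_state hyperparameter is not included in the ConfigurationSpace. Set it to fix the random state of the individual methods when you need reproducibility. | None |
| n_jobs | int | Sets the n_jobs parameter for estimators that have it. | 1 |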
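
As a rough illustration of how the dataset-dependent arguments and random_state can shape a space, the sketch below uses plain dictionaries in place of ConfigSpace objects. The function name `sketch_space`, the hyperparameter names, and the bounds are assumptions made for this example; they are not TPOT's actual ranges.

```python
def sketch_space(n_samples=1000, n_features=100, random_state=None, n_jobs=1):
    """Hypothetical stand-in for a per-estimator config-space factory."""
    space = {
        # Upper bounds follow the dataset: a neighbor count cannot exceed
        # n_samples, and a component count cannot exceed n_features.
        "n_neighbors": ("int", 1, min(100, n_samples)),
        "n_components": ("int", 1, n_features),
        # n_jobs is passed through as a fixed value, not searched over.
        "n_jobs": ("constant", n_jobs),
    }
    # random_state is only part of the space when the caller supplies one.
    if random_state is not None:
        space["random_state"] = ("constant", random_state)
    return space


small = sketch_space(n_samples=20, n_features=8)
seeded = sketch_space(random_state=42)
```

With `random_state=None` the key is simply absent, which mirrors the documented behavior that the hyperparameter is left out of the space rather than set to None.
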
Source code in tpot2/config/get_configspace.py
def get_configspace(name, n_classes=3, n_samples=1000, n_features=100, random_state=None, n_jobs=1):
    """
    This function returns the ConfigSpace.ConfigurationSpace with the hyperparameter ranges for the given
    scikit-learn method. It also uses the n_classes, n_samples, n_features, and random_state to set the
    hyperparameters that depend on these values.

    Parameters
    ----------
    name : str
        The str name of the scikit-learn method for which to create the ConfigurationSpace. (e.g. 'RandomForestClassifier' for sklearn.ensemble.RandomForestClassifier)
    n_classes : int
        The number of classes in the target variable. Default is 3.
    n_samples : int
        The number of samples in the dataset. Default is 1000.
    n_features : int
        The number of features in the dataset. Default is 100.
    random_state : int
        The random_state to use in the ConfigurationSpace. Default is None.
        If None, the random_state hyperparameter is not included in the ConfigurationSpace.
        Use this to set the random state for the individual methods if you want to ensure reproducibility.
    n_jobs : int (default=1)
        Sets the n_jobs parameter for estimators that have it. Default is 1.

    """
    match name:

        #autoqtl_builtins.py
        case "FeatureEncodingFrequencySelector":
            return autoqtl_builtins.FeatureEncodingFrequencySelector_ConfigurationSpace
        case "DominantEncoder":
            return {}
        case "RecessiveEncoder":
            return {}
        case "HeterosisEncoder":
            return {}
        case "UnderDominanceEncoder":
            return {}
        case "OverDominanceEncoder":
            return {}

        case "Passthrough":
            return {}
        case "SkipTransformer":
            return {}

        #classifiers.py
        case "LinearDiscriminantAnalysis":
            return classifiers.get_LinearDiscriminantAnalysis_ConfigurationSpace()
        case "AdaBoostClassifier":
            return classifiers.get_AdaBoostClassifier_ConfigurationSpace(random_state=random_state)
        case "LogisticRegression":
            return classifiers.get_LogisticRegression_ConfigurationSpace(random_state=random_state, n_jobs=n_jobs)
        case "KNeighborsClassifier":
            return classifiers.get_KNeighborsClassifier_ConfigurationSpace(n_samples=n_samples, n_jobs=n_jobs)
        case "DecisionTreeClassifier":
            return classifiers.get_DecisionTreeClassifier_ConfigurationSpace(n_featues=n_features, random_state=random_state)
        case "SVC":
            return classifiers.get_SVC_ConfigurationSpace(random_state=random_state)
        case "LinearSVC":
            return classifiers.get_LinearSVC_ConfigurationSpace(random_state=random_state)
        case "RandomForestClassifier":
            return classifiers.get_RandomForestClassifier_ConfigurationSpace(random_state=random_state, n_jobs=n_jobs)
        case "GradientBoostingClassifier":
            return classifiers.get_GradientBoostingClassifier_ConfigurationSpace(n_classes=n_classes, random_state=random_state)
        case "HistGradientBoostingClassifier":
            return classifiers.get_HistGradientBoostingClassifier_ConfigurationSpace(random_state=random_state)
        case "XGBClassifier":
            return classifiers.get_XGBClassifier_ConfigurationSpace(random_state=random_state, n_jobs=n_jobs)
        case "LGBMClassifier":
            return classifiers.get_LGBMClassifier_ConfigurationSpace(random_state=random_state, n_jobs=n_jobs)
        case "ExtraTreesClassifier":
            return classifiers.get_ExtraTreesClassifier_ConfigurationSpace(random_state=random_state, n_jobs=n_jobs)
        case "SGDClassifier":
            return classifiers.get_SGDClassifier_ConfigurationSpace(random_state=random_state, n_jobs=n_jobs)
        case "MLPClassifier":
            return classifiers.get_MLPClassifier_ConfigurationSpace(random_state=random_state)
        case "BernoulliNB":
            return classifiers.get_BernoulliNB_ConfigurationSpace()
        case "MultinomialNB":
            return classifiers.get_MultinomialNB_ConfigurationSpace()
        case "GaussianNB":
            return {}
        case "LassoLarsCV":
            return {}
        case "ElasticNetCV":
            return regressors.ElasticNetCV_configspace
        case "RidgeCV":
            return {}
        case "PassiveAggressiveClassifier":
            return classifiers.get_PassiveAggressiveClassifier_ConfigurationSpace(random_state=random_state)
        case "QuadraticDiscriminantAnalysis":
            return classifiers.get_QuadraticDiscriminantAnalysis_ConfigurationSpace()
        case "GaussianProcessClassifier":
            return classifiers.get_GaussianProcessClassifier_ConfigurationSpace(n_features=n_features, random_state=random_state)
        case "BaggingClassifier":
            return classifiers.get_BaggingClassifier_ConfigurationSpace(random_state=random_state, n_jobs=n_jobs)

        #regressors.py
        case "RandomForestRegressor":
            return regressors.get_RandomForestRegressor_ConfigurationSpace(random_state=random_state, n_jobs=n_jobs)
        case "SGDRegressor":
            return regressors.get_SGDRegressor_ConfigurationSpace(random_state=random_state)
        case "Ridge":
            return regressors.get_Ridge_ConfigurationSpace(random_state=random_state)
        case "Lasso":
            return regressors.get_Lasso_ConfigurationSpace(random_state=random_state)
        case "ElasticNet":
            return regressors.get_ElasticNet_ConfigurationSpace(random_state=random_state)
        case "Lars":
            return regressors.get_Lars_ConfigurationSpace(random_state=random_state)
        case "OthogonalMatchingPursuit":
            return regressors.get_OthogonalMatchingPursuit_ConfigurationSpace()
        case "BayesianRidge":
            return regressors.get_BayesianRidge_ConfigurationSpace()
        case "LassoLars":
            return regressors.get_LassoLars_ConfigurationSpace(random_state=random_state)
        case "BaggingRegressor":
            return regressors.get_BaggingRegressor_ConfigurationSpace(random_state=random_state)
        case "ARDRegression":
            return regressors.get_ARDRegression_ConfigurationSpace()
        case "TheilSenRegressor":
            return regressors.get_TheilSenRegressor_ConfigurationSpace(random_state=random_state)
        case "Perceptron":
            return regressors.get_Perceptron_ConfigurationSpace(random_state=random_state)
        case "DecisionTreeRegressor":
            return regressors.get_DecisionTreeRegressor_ConfigurationSpace(random_state=random_state)
        case "LinearSVR":
            return regressors.get_LinearSVR_ConfigurationSpace(random_state=random_state)
        case "SVR":
            return regressors.get_SVR_ConfigurationSpace()
        case "XGBRegressor":
            return regressors.get_XGBRegressor_ConfigurationSpace(random_state=random_state, n_jobs=n_jobs)
        case "AdaBoostRegressor":
            return regressors.get_AdaBoostRegressor_ConfigurationSpace(random_state=random_state)
        case "ExtraTreesRegressor":
            return regressors.get_ExtraTreesRegressor_ConfigurationSpace(random_state=random_state, n_jobs=n_jobs)
        case "GradientBoostingRegressor":
            return regressors.get_GradientBoostingRegressor_ConfigurationSpace(random_state=random_state)
        case "HistGradientBoostingRegressor":
            return regressors.get_HistGradientBoostingRegressor_ConfigurationSpace(random_state=random_state)
        case "MLPRegressor":
            return regressors.get_MLPRegressor_ConfigurationSpace(random_state=random_state)
        case "KNeighborsRegressor":
            return regressors.get_KNeighborsRegressor_ConfigurationSpace(n_samples=n_samples, n_jobs=n_jobs)
        case "GaussianProcessRegressor":
            return regressors.get_GaussianProcessRegressor_ConfigurationSpace(n_features=n_features, random_state=random_state)
        case "LGBMRegressor":
            return regressors.get_LGBMRegressor_ConfigurationSpace(random_state=random_state, n_jobs=n_jobs)
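        # NOTE: unreachable. "BaggingRegressor" is already matched earlier in this
        # block, and match takes the first matching case, so this n_jobs-aware
        # variant never runs.
        case "BaggingRegressor":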
            return regressors.get_BaggingRegressor_ConfigurationSpace(random_state=random_state, n_jobs=n_jobs)

        #transformers.py
        case "Binarizer":
            return transformers.Binarizer_configspace
        case "Normalizer":
            return transformers.Normalizer_configspace
        case "PCA":
            return transformers.PCA_configspace
        case "ZeroCount":
            return transformers.ZeroCount_configspace
        case "FastICA":
            return transformers.get_FastICA_configspace(n_features=n_features, random_state=random_state)
        case "FeatureAgglomeration":
            return transformers.get_FeatureAgglomeration_configspace(n_features=n_features)
        case "Nystroem":
            return transformers.get_Nystroem_configspace(n_features=n_features, random_state=random_state)
        case "RBFSampler":
            return transformers.get_RBFSampler_configspace(n_features=n_features, random_state=random_state)
        case "MinMaxScaler":
            return {}
        case "PowerTransformer":
            return {}
        case "QuantileTransformer":
            return transformers.get_QuantileTransformer_configspace(n_samples=n_samples, random_state=random_state)
        case "RobustScaler":
            return transformers.RobustScaler_configspace
        case "ColumnOneHotEncoder":
            return {}
        case "MaxAbsScaler":
            return {}
        case "PolynomialFeatures":
            return transformers.PolynomialFeatures_configspace
        case "StandardScaler":
            return {}
        case "PassKBinsDiscretizer":
            return transformers.get_passkbinsdiscretizer_configspace(random_state=random_state)
        case "KBinsDiscretizer":
            return transformers.get_passkbinsdiscretizer_configspace(random_state=random_state)

        #selectors.py
        case "SelectFwe":
            return selectors.SelectFwe_configspace 
        case "SelectPercentile":
            return selectors.SelectPercentile_configspace
        case "VarianceThreshold":
            return selectors.VarianceThreshold_configspace
        case "RFE":
            return selectors.RFE_configspace_part
        case "SelectFromModel":
            return selectors.SelectFromModel_configspace_part


        #special_configs.py
        case "AddTransformer":
            return {}
        case "mul_neg_1_Transformer":
            return {}
        case "MulTransformer":
            return {}
        case "SafeReciprocalTransformer":
            return {}
        case "EQTransformer":
            return {}
        case "NETransformer":
            return {}
        case "GETransformer":
            return {}
        case "GTTransformer":
            return {}
        case "LETransformer":
            return {}
        case "LTTransformer":
            return {}        
        case "MinTransformer":
            return {}
        case "MaxTransformer":
            return {}
        case "ZeroTransformer":
            return {}
        case "OneTransformer":
            return {}
        case "NTransformer":
            return ConfigurationSpace(
                space={
                    'n': Float("n", bounds=(-1e2, 1e2)),
                }
            )

        #imputers.py
        case "SimpleImputer":
            return imputers.simple_imputer_cs
        case "IterativeImputer":
            return imputers.get_IterativeImputer_config_space(n_features=n_features, random_state=random_state)
        case "IterativeImputer_no_estimator":
            return imputers.get_IterativeImputer_config_space_no_estimator(n_features=n_features, random_state=random_state)

        case "KNNImputer":
            return imputers.get_KNNImputer_config_space(n_samples=n_samples)

        #mdr_configs.py
        case "MDR":
            return mdr_configs.MDR_configspace
        case "ContinuousMDR":
            return mdr_configs.MDR_configspace
        case "ReliefF":
            return mdr_configs.get_skrebate_ReliefF_config_space(n_features=n_features)
        case "SURF":
            return mdr_configs.get_skrebate_SURF_config_space(n_features=n_features)
        case "SURFstar":
            return mdr_configs.get_skrebate_SURFstar_config_space(n_features=n_features)
        case "MultiSURF":
            return mdr_configs.get_skrebate_MultiSURF_config_space(n_features=n_features)

        #classifiers_sklearnex.py
        case "RandomForestClassifier_sklearnex":
            return classifiers_sklearnex.get_RandomForestClassifier_ConfigurationSpace(random_state=random_state, n_jobs=n_jobs)
        case "LogisticRegression_sklearnex":
            return classifiers_sklearnex.get_LogisticRegression_ConfigurationSpace(random_state=random_state)
        case "KNeighborsClassifier_sklearnex":
            return classifiers_sklearnex.get_KNeighborsClassifier_ConfigurationSpace(n_samples=n_samples)
        case "SVC_sklearnex":
            return classifiers_sklearnex.get_SVC_ConfigurationSpace(random_state=random_state)
        case "NuSVC_sklearnex":
            return classifiers_sklearnex.get_NuSVC_ConfigurationSpace(random_state=random_state)

        #regressors_sklearnex.py
        case "LinearRegression_sklearnex":
            return {}
        case "Ridge_sklearnex":
            return regressors_sklearnex.get_Ridge_ConfigurationSpace(random_state=random_state)
        case "Lasso_sklearnex":
            return regressors_sklearnex.get_Lasso_ConfigurationSpace(random_state=random_state)
        case "ElasticNet_sklearnex":
            return regressors_sklearnex.get_ElasticNet_ConfigurationSpace(random_state=random_state)
        case "SVR_sklearnex":
            return regressors_sklearnex.get_SVR_ConfigurationSpace(random_state=random_state)
        case "NuSVR_sklearnex":
            return regressors_sklearnex.get_NuSVR_ConfigurationSpace(random_state=random_state)
        case "RandomForestRegressor_sklearnex":
            return regressors_sklearnex.get_RandomForestRegressor_ConfigurationSpace(random_state=random_state)
        case "KNeighborsRegressor_sklearnex":
            return regressors_sklearnex.get_KNeighborsRegressor_ConfigurationSpace(n_samples=n_samples)

    #raise error
    raise ValueError(f"Could not find configspace for {name}")
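
The match statement above returns one space per name and raises ValueError for anything it does not recognize. The sketch below expresses the same kind of dispatch as a dictionary lookup, mainly to show how a caller might separate known names from unknown ones. `SKETCH_FACTORIES`, `sketch_get_configspace`, and `spaces_for` are hypothetical names, and the dictionaries stand in for real ConfigurationSpace objects.

```python
def _forest_space(random_state=None, n_jobs=1):
    """Hypothetical factory; the values are placeholders, not TPOT's ranges."""
    space = {"n_estimators": ("constant", 128), "n_jobs": ("constant", n_jobs)}
    if random_state is not None:
        space["random_state"] = ("constant", random_state)
    return space


# Hypothetical registry mapping a method name to its space factory.
SKETCH_FACTORIES = {
    "RandomForestClassifier": _forest_space,
    "GaussianNB": lambda **_: {},  # nothing to tune, so an empty space
}


def sketch_get_configspace(name, **kwargs):
    try:
        factory = SKETCH_FACTORIES[name]
    except KeyError:
        # Same error type and message as the function documented above.
        raise ValueError(f"Could not find configspace for {name}") from None
    return factory(**kwargs)


def spaces_for(names, **kwargs):
    """Collect spaces for the names that resolve and list the ones that do not."""
    found, missing = {}, []
    for n in names:
        try:
            found[n] = sketch_get_configspace(n, **kwargs)
        except ValueError:
            missing.append(n)
    return found, missing


found, missing = spaces_for(
    ["RandomForestClassifier", "GaussianNB", "NotAnEstimator"], random_state=0
)
```
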

get_node(name, n_classes=3, n_samples=100, n_features=100, random_state=None, base_node=EstimatorNode, n_jobs=1)

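Helper function for get_search_space. Returns a single EstimatorNode for the given scikit-learn method. It also handles special cases: nodes that require custom parsing of their hyperparameters, and methods that wrap other methods.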

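Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| name | str or list | The name of the scikit-learn method or group of methods for which to create the search space. str: the name of a scikit-learn method (e.g. 'RandomForestClassifier' for sklearn.ensemble.RandomForestClassifier) or the name of a group of methods (e.g. 'classifiers' for all classifiers). list: a list of scikit-learn method names (e.g. ['RandomForestClassifier', 'ExtraTreesClassifier']). | required |
| n_classes | int | Number of classes in the target variable. | 3 |
| n_samples | int | Number of samples in the dataset. | 100 |
| n_features | int | Number of features in the dataset. | 100 |
| random_state | int | A fixed random_state to pass through to all methods that have a random_state hyperparameter. | None |
| return_choice_pipeline | bool | Documented in the docstring but absent from the get_node signature. If False, returns a list of TPOT2.search_spaces.nodes.EstimatorNode objects. If True, returns a single TPOT2.search_spaces.pipelines.ChoicePipeline that includes and samples from all EstimatorNodes. | not in signature |
| base_node | TPOT2.search_spaces.base.SearchSpace | The SearchSpace to pass the configuration space to. To experiment with custom mutation/crossover operators, pass a custom SearchSpace node here. | EstimatorNode |
| n_jobs | int | Sets the n_jobs parameter for estimators that have it. | 1 |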

Returns:

A SearchSpace object that can be optimized by TPOT:
  • TPOT2.search_spaces.nodes.EstimatorNode (or base_node).
  • A TPOT2.search_spaces.pipelines.WrapperPipeline object if the method requires a wrapped estimator.
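
The wrapped-estimator case can be pictured as a search space that first samples an inner estimator and then hands it to the wrapping method as one of its arguments. The sketch below is a minimal stand-in for that idea under stated assumptions: `SketchNode` and `SketchWrapper` are hypothetical classes, the candidate values are invented, and the real WrapperPipeline interface may differ.

```python
import random


class SketchNode:
    """Hypothetical leaf space: a method name plus candidate values per parameter."""

    def __init__(self, method, choices):
        self.method = method
        self.choices = choices

    def sample(self, rng):
        params = {k: rng.choice(v) for k, v in self.choices.items()}
        return (self.method, params)


class SketchWrapper:
    """Hypothetical wrapper space: samples an inner estimator, then wraps it."""

    def __init__(self, method, choices, estimator_search_space):
        self.method = method
        self.choices = choices
        self.inner = estimator_search_space

    def sample(self, rng):
        params = {k: rng.choice(v) for k, v in self.choices.items()}
        # The sampled inner estimator becomes an argument of the wrapper.
        params["estimator"] = self.inner.sample(rng)
        return (self.method, params)


inner = SketchNode("ExtraTreesClassifier", {"n_estimators": [50, 100]})
wrapped = SketchWrapper("RFE", {"step": [0.1, 0.5]}, inner)
method, params = wrapped.sample(random.Random(0))
```

Keeping the inner space as its own object means the wrapper's hyperparameters and the wrapped estimator's hyperparameters are sampled independently.
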
Source code in tpot2/config/get_configspace.py
def get_node(name, n_classes=3, n_samples=100, n_features=100, random_state=None, base_node=EstimatorNode, n_jobs=1):
    """
    Helper function for get_search_space. Returns a single EstimatorNode for the given scikit-learn method. Also includes special cases for nodes that require custom parsing of the hyperparameters or methods that wrap other methods.

    Parameters
    ----------

    name : str or list
        The name of the scikit-learn method or group of methods for which to create the search space.
        - str: The name of the scikit-learn method. (e.g. 'RandomForestClassifier' for sklearn.ensemble.RandomForestClassifier)
        Alternatively, the name of a group of methods. (e.g. 'classifiers' for all classifiers).
        - list: A list of scikit-learn method names. (e.g. ['RandomForestClassifier', 'ExtraTreesClassifier'])
    n_classes : int (default=3)
        The number of classes in the target variable.
    n_samples : int (default=100)
        The number of samples in the dataset.
    n_features : int (default=100)
        The number of features in the dataset.
    random_state : int (default=None)
        A fixed random_state to pass through to all methods that have a random_state hyperparameter. 
    return_choice_pipeline : bool (default=True)
        If False, returns a list of TPOT2.search_spaces.nodes.EstimatorNode objects.
        If True, returns a single TPOT2.search_spaces.pipelines.ChoicePipeline that includes and samples from all EstimatorNodes.
    base_node: TPOT2.search_spaces.base.SearchSpace (default=TPOT2.search_spaces.nodes.EstimatorNode)
        The SearchSpace to pass the configuration space to. If you want to experiment with custom mutation/crossover operators, you can pass a custom SearchSpace node here.
    n_jobs : int (default=1)
        Sets the n_jobs parameter for estimators that have it. Default is 1.

    Returns
    -------
        Returns a SearchSpace object that can be optimized by TPOT.
        - TPOT2.search_spaces.nodes.EstimatorNode (or base_node).
        - TPOT2.search_spaces.pipelines.WrapperPipeline object if the method requires a wrapped estimator.


    """

    if name == "LinearSVC_wrapped":
        ext = get_node("LinearSVC", n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        return WrapperPipeline(estimator_search_space=ext, method=sklearn.calibration.CalibratedClassifierCV, space={})
    if name == "RFE_classification":
        rfe_sp = get_configspace(name="RFE", n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        ext = get_node("ExtraTreesClassifier", n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        return WrapperPipeline(estimator_search_space=ext, method=RFE, space=rfe_sp)
    if name == "RFE_regression":
        rfe_sp = get_configspace(name="RFE", n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        ext = get_node("ExtraTreesRegressor", n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        return WrapperPipeline(estimator_search_space=ext, method=RFE, space=rfe_sp)
    if name == "SelectFromModel_classification":
        sfm_sp = get_configspace(name="SelectFromModel", n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        ext = get_node("ExtraTreesClassifier", n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        return WrapperPipeline(estimator_search_space=ext, method=SelectFromModel, space=sfm_sp)
    if name == "SelectFromModel_regression":
        sfm_sp = get_configspace(name="SelectFromModel", n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        ext = get_node("ExtraTreesRegressor", n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        return WrapperPipeline(estimator_search_space=ext, method=SelectFromModel, space=sfm_sp)
    # TODO Add IterativeImputer with more estimator methods
    if name == "IterativeImputer_learned_estimators":
        iterative_sp = get_configspace(name="IterativeImputer_no_estimator", n_features=n_features, random_state=random_state, n_jobs=n_jobs)
        regressor_searchspace = get_node("ExtraTreesRegressor", n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        return WrapperPipeline(estimator_search_space=regressor_searchspace, method=IterativeImputer, space=iterative_sp)

    #these are nodes that have special search spaces which require custom parsing of the hyperparameters
    if name == "IterativeImputer":
        configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, n_features=n_features, random_state=random_state, n_jobs=n_jobs)
        return EstimatorNode(STRING_TO_CLASS[name], configspace, hyperparameter_parser=imputers.IterativeImputer_hyperparameter_parser)
    if name == "RobustScaler":
        configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        return base_node(STRING_TO_CLASS[name], configspace, hyperparameter_parser=transformers.robust_scaler_hyperparameter_parser)
    if name == "GradientBoostingClassifier":
        configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        return base_node(STRING_TO_CLASS[name], configspace, hyperparameter_parser=classifiers.GradientBoostingClassifier_hyperparameter_parser)
    if name == "HistGradientBoostingClassifier":
        configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        return base_node(STRING_TO_CLASS[name], configspace, hyperparameter_parser=classifiers.HistGradientBoostingClassifier_hyperparameter_parser)
    if name == "GradientBoostingRegressor":
        configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        return base_node(STRING_TO_CLASS[name], configspace, hyperparameter_parser=regressors.GradientBoostingRegressor_hyperparameter_parser)
    if name == "HistGradientBoostingRegressor":
        configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        return base_node(STRING_TO_CLASS[name], configspace, hyperparameter_parser=regressors.HistGradientBoostingRegressor_hyperparameter_parser)
    if name == "MLPClassifier":
        configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        return base_node(STRING_TO_CLASS[name], configspace, hyperparameter_parser=classifiers.MLPClassifier_hyperparameter_parser)
    if name == "MLPRegressor":
        configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        return base_node(STRING_TO_CLASS[name], configspace, hyperparameter_parser=regressors.MLPRegressor_hyperparameter_parser)
    if name == "GaussianProcessRegressor":
        configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, n_features=n_features, random_state=random_state, n_jobs=n_jobs)
        return base_node(STRING_TO_CLASS[name], configspace, hyperparameter_parser=regressors.GaussianProcessRegressor_hyperparameter_parser)
    if name == "GaussianProcessClassifier":
        configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, n_features=n_features, random_state=random_state, n_jobs=n_jobs)
        return base_node(STRING_TO_CLASS[name], configspace, hyperparameter_parser=classifiers.GaussianProcessClassifier_hyperparameter_parser)
    if name == "FeatureAgglomeration":
        configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, n_features=n_features, random_state=random_state, n_jobs=n_jobs)
        return base_node(STRING_TO_CLASS[name], configspace, hyperparameter_parser=transformers.FeatureAgglomeration_hyperparameter_parser)

    configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, n_features=n_features, random_state=random_state, n_jobs=n_jobs)
    if configspace is None:
        #raise warning
        warnings.warn(f"Could not find configspace for {name}")
        return None

    return base_node(STRING_TO_CLASS[name], configspace)
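
Several branches above pass a hyperparameter_parser alongside the configuration space. The parsers themselves are not shown in this section, so the following is only a guess at the general shape: a function that turns flat sampled values into the keyword arguments a constructor expects. The name `sketch_mlp_parser` and the keys `n_hidden_layers` and `n_nodes_per_layer` are hypothetical; `hidden_layer_sizes` is the scikit-learn MLP argument they are combined into.

```python
def sketch_mlp_parser(params):
    """Hypothetical parser: flat sampled values -> constructor keyword arguments."""
    kwargs = dict(params)  # copy so the sampled configuration is not mutated
    n_layers = kwargs.pop("n_hidden_layers")
    n_nodes = kwargs.pop("n_nodes_per_layer")
    # Two scalar hyperparameters collapse into one structured argument.
    kwargs["hidden_layer_sizes"] = (n_nodes,) * n_layers
    return kwargs


sampled = {"n_hidden_layers": 2, "n_nodes_per_layer": 64, "alpha": 0.001}
parsed = sketch_mlp_parser(sampled)
```

A parser like this lets the search operate on simple scalar hyperparameters while the estimator still receives the structured argument it needs.
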

get_search_space(name, n_classes=3, n_samples=1000, n_features=100, random_state=None, return_choice_pipeline=True, base_node=EstimatorNode, n_jobs=1)

Returns a TPOT search space for a given scikit-learn method or group of methods.

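Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| name | str or list | The name of the scikit-learn method or group of methods for which to create the search space. str: the name of a scikit-learn method (e.g. 'RandomForestClassifier' for sklearn.ensemble.RandomForestClassifier) or the name of a group of methods (e.g. 'classifiers' for all classifiers). list: a list of scikit-learn method names (e.g. ['RandomForestClassifier', 'ExtraTreesClassifier']). | required |
| n_classes | int | Number of classes in the target variable. | 3 |
| n_samples | int | Number of samples in the dataset. | 1000 |
| n_features | int | Number of features in the dataset. | 100 |
| random_state | int | A fixed random_state to pass through to all methods that have a random_state hyperparameter. | None |
| return_choice_pipeline | bool | If False, returns a list of TPOT2.search_spaces.nodes.EstimatorNode objects. If True, returns a single TPOT2.search_spaces.pipelines.ChoicePipeline that includes and samples from all EstimatorNodes. | True |
| base_node | TPOT2.search_spaces.base.SearchSpace | The SearchSpace to pass the configuration space to. To experiment with custom mutation/crossover operators, pass a custom SearchSpace node here. | EstimatorNode |
| n_jobs | int | Sets the n_jobs parameter for estimators that have it. | 1 |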

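Returns:

A SearchSpace object that can be optimized by TPOT:
  • TPOT2.search_spaces.nodes.EstimatorNode (or base_node) if there is only one search space.
  • A list of TPOT2.search_spaces.nodes.EstimatorNode (or base_node) objects if there are multiple search spaces.
  • A TPOT2.search_spaces.pipelines.ChoicePipeline object if return_choice_pipeline is True.

Note: for some special cases with methods that use wrapped estimators, the returned search space is a TPOT2.search_spaces.pipelines.WrapperPipeline object.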
Source code in tpot2/config/get_configspace.py
def get_search_space(name, n_classes=3, n_samples=1000, n_features=100, random_state=None, return_choice_pipeline=True, base_node=EstimatorNode, n_jobs=1):
    """
    Returns a TPOT search space for a given scikit-learn method or group of methods.

    Parameters
    ----------
    name : str or list
        The name of the scikit-learn method or group of methods for which to create the search space.
        - str: The name of the scikit-learn method. (e.g. 'RandomForestClassifier' for sklearn.ensemble.RandomForestClassifier)
        Alternatively, the name of a group of methods. (e.g. 'classifiers' for all classifiers).
        - list: A list of scikit-learn method names. (e.g. ['RandomForestClassifier', 'ExtraTreesClassifier'])
    n_classes : int (default=3)
        The number of classes in the target variable.
    n_samples : int (default=1000)
        The number of samples in the dataset.
    n_features : int (default=100)
        The number of features in the dataset.
    random_state : int (default=None)
        A fixed random_state to pass through to all methods that have a random_state hyperparameter. 
    return_choice_pipeline : bool (default=True)
        If False, returns a list of TPOT2.search_spaces.nodes.EstimatorNode objects.
        If True, returns a single TPOT2.search_spaces.pipelines.ChoicePipeline that includes and samples from all EstimatorNodes.
    base_node: TPOT2.search_spaces.base.SearchSpace (default=TPOT2.search_spaces.nodes.EstimatorNode)
        The SearchSpace to pass the configuration space to. If you want to experiment with custom mutation/crossover operators, you can pass a custom SearchSpace node here.
    n_jobs : int (default=1)
        Sets the n_jobs parameter for estimators that have it. Default is 1.

    Returns
    -------
        Returns a SearchSpace object that can be optimized by TPOT.
        - TPOT2.search_spaces.nodes.EstimatorNode (or base_node) if there is only one search space.
        - List of TPOT2.search_spaces.nodes.EstimatorNode (or base_node) objects if there are multiple search spaces.
        - TPOT2.search_spaces.pipelines.ChoicePipeline object if return_choice_pipeline is True.
        Note: for some special cases with methods using wrapped estimators, the returned search space is a TPOT2.search_spaces.pipelines.WrapperPipeline object.

    """
    name = flatten_group_names(name)

    #if list of names, return a list of EstimatorNodes
    if isinstance(name, (list, np.ndarray)):
        search_spaces = [get_search_space(n, n_classes=n_classes, n_samples=n_samples, n_features=n_features, random_state=random_state, return_choice_pipeline=False, base_node=base_node, n_jobs=n_jobs) for n in name]
        #remove Nones
        search_spaces = [s for s in search_spaces if s is not None]

        if return_choice_pipeline:
            return ChoicePipeline(search_spaces=np.hstack(search_spaces))
        else:
            return np.hstack(search_spaces)

    # if name in GROUPNAMES:
    #     name_list = GROUPNAMES[name]
    #     return get_search_space(name_list, n_classes=n_classes, n_samples=n_samples, n_features=n_features, random_state=random_state, return_choice_pipeline=return_choice_pipeline, base_node=base_node)

    return get_node(name, n_classes=n_classes, n_samples=n_samples, n_features=n_features, random_state=random_state, base_node=base_node, n_jobs=n_jobs)
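
The control flow of get_search_space (expand group names, recurse over lists, drop missing entries, then either wrap the result in a choice object or return the list) can be traced with a small stdlib-only sketch. Everything here is a stand-in: `SKETCH_GROUPS`, `sketch_flatten`, and `SketchChoice` are hypothetical, the real flatten_group_names is not shown in this section, and plain lists replace the np.hstack calls used above.

```python
import random

# Hypothetical group table for illustration only.
SKETCH_GROUPS = {"classifiers": ["RandomForestClassifier", "ExtraTreesClassifier"]}


def sketch_flatten(name):
    """Expand group names; leave ordinary method names untouched."""
    if isinstance(name, str):
        return SKETCH_GROUPS.get(name, name)
    flat = []
    for n in name:
        expanded = sketch_flatten(n)
        flat.extend(expanded if isinstance(expanded, list) else [expanded])
    return flat


class SketchChoice:
    """Hypothetical stand-in for a pipeline that picks one of several spaces."""

    def __init__(self, search_spaces):
        self.search_spaces = list(search_spaces)

    def sample(self, rng):
        return rng.choice(self.search_spaces)


def sketch_get_search_space(name, return_choice=True):
    name = sketch_flatten(name)
    if isinstance(name, list):
        nodes = [sketch_get_search_space(n, return_choice=False) for n in name]
        nodes = [n for n in nodes if n is not None]  # drop missing entries
        return SketchChoice(nodes) if return_choice else nodes
    return ("node", name)  # stand-in for get_node(name, ...)


single = sketch_get_search_space("SVC")
group = sketch_get_search_space(["classifiers", "SVC"], return_choice=False)
choice = sketch_get_search_space("classifiers")
```

Note that a single method name bypasses the choice wrapper entirely, which matches the first bullet of the documented return values.
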