AutoGluon Tabular - Essential Functionality¶
Via a simple fit() call, AutoGluon can produce highly accurate models to predict the values in one column of a data table based on the values in the other columns. Use AutoGluon with tabular data for both classification and regression problems. This tutorial demonstrates how to use AutoGluon to produce a classification model that predicts whether or not a person's income exceeds $50,000.
TabularPredictor¶
To start, import AutoGluon's TabularPredictor and TabularDataset classes:
from autogluon.tabular import TabularDataset, TabularPredictor
Load training data from a CSV file into an AutoGluon Dataset object. This object is essentially equivalent to a Pandas DataFrame, and the same methods can be applied to both.
train_data = TabularDataset('https://autogluon.s3.amazonaws.com/datasets/Inc/train.csv')
subsample_size = 500 # subsample subset of data for faster demo, try setting this to much larger values
train_data = train_data.sample(n=subsample_size, random_state=0)
train_data.head()
| | age | workclass | fnlwgt | education | education-num | marital-status | occupation | relationship | race | sex | capital-gain | capital-loss | hours-per-week | native-country | class |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 6118 | 51 | Private | 39264 | Some-college | 10 | Married-civ-spouse | Exec-managerial | Wife | White | Female | 0 | 0 | 40 | United-States | >50K |
| 23204 | 58 | Private | 51662 | 10th | 6 | Married-civ-spouse | Other-service | Wife | White | Female | 0 | 0 | 8 | United-States | <=50K |
| 29590 | 40 | Private | 326310 | Some-college | 10 | Married-civ-spouse | Craft-repair | Husband | White | Male | 0 | 0 | 44 | United-States | <=50K |
| 18116 | 37 | Private | 222450 | HS-grad | 9 | Never-married | Sales | Not-in-family | White | Male | 0 | 2339 | 40 | El-Salvador | <=50K |
| 33964 | 62 | Private | 109190 | Bachelors | 13 | Married-civ-spouse | Exec-managerial | Husband | White | Male | 15024 | 0 | 40 | United-States | >50K |
Note that we loaded data from a CSV file stored in the cloud. You can also specify a local file path instead if you have already downloaded the CSV file to your own machine (e.g., using wget).
Each row in the table train_data corresponds to a single training example. In this particular dataset, each row corresponds to an individual person, and the columns contain various characteristics reported during a census.
Let's first use these features to predict whether the person's income exceeds $50,000 or not, which is recorded in the class column of this table.
label = 'class'
print(f"Unique classes: {list(train_data[label].unique())}")
Unique classes: [' >50K', ' <=50K']
AutoGluon works with raw data, meaning you don't need to perform any data preprocessing before fitting AutoGluon. We actively recommend that you avoid performing operations such as missing value imputation or one-hot encoding, as AutoGluon has dedicated logic to handle these situations automatically. You can learn more about AutoGluon preprocessing in the Feature Engineering tutorial.
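As a minimal illustration (a hypothetical toy frame, not the tutorial's dataset), columns containing missing values and raw string categories can be handed to fit() unchanged:
import numpy as np
import pandas as pd

# Hypothetical toy frame: NaNs and raw string categories are fine as-is;
# AutoGluon's feature generators handle imputation and encoding internally.
raw = pd.DataFrame({
    'age': [25, np.nan, 47, 31],
    'workclass': ['Private', None, 'Self-emp', 'Private'],
    'class': [' <=50K', ' >50K', ' <=50K', ' >50K'],
})
# TabularPredictor(label='class').fit(raw)  # no manual preprocessing required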
Training¶
Now we initialize and fit AutoGluon's TabularPredictor in one line of code:
predictor = TabularPredictor(label=label).fit(train_data)
No path specified. Models will be saved in: "AutogluonModels/ag-20241127_095251"
Verbosity: 2 (Standard Logging)
=================== System Info ===================
AutoGluon Version: 1.2b20241127
Python Version: 3.11.9
Operating System: Linux
Platform Machine: x86_64
Platform Version: #1 SMP Tue Sep 24 10:00:37 UTC 2024
CPU Count: 8
Memory Avail: 28.81 GB / 30.95 GB (93.1%)
Disk Space Avail: 213.38 GB / 255.99 GB (83.4%)
===================================================
No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets. Defaulting to `'medium'`...
Recommended Presets (For more details refer to https://auto.gluon.ai/stable/tutorials/tabular/tabular-essentials.html#presets):
presets='experimental' : New in v1.2: Pre-trained foundation model + parallel fits. The absolute best accuracy without consideration for inference speed. Does not support GPU.
presets='best' : Maximize accuracy. Recommended for most users. Use in competitions and benchmarks.
presets='high' : Strong accuracy with fast inference speed.
presets='good' : Good accuracy with very fast inference speed.
presets='medium' : Fast training time, ideal for initial prototyping.
Beginning AutoGluon training ...
AutoGluon will save models to "/home/ci/autogluon/docs/tutorials/tabular/AutogluonModels/ag-20241127_095251"
Train Data Rows: 500
Train Data Columns: 14
Label Column: class
AutoGluon infers your prediction problem is: 'binary' (because only two unique label-values observed).
2 unique label values: [' >50K', ' <=50K']
If 'binary' is not the correct problem_type, please manually specify the problem_type parameter during Predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression', 'quantile'])
Problem Type: binary
Preprocessing data ...
Selected class <--> label mapping: class 1 = >50K, class 0 = <=50K
Note: For your binary classification, AutoGluon arbitrarily selected which label-value represents positive ( >50K) vs negative ( <=50K) class.
To explicitly set the positive_class, either rename classes to 1 and 0, or specify positive_class in Predictor init.
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 29494.60 MB
Train Data (Original) Memory Usage: 0.28 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Note: Converting 1 features to boolean dtype as they only contain 2 unique values.
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Stage 5 Generators:
Fitting DropDuplicatesFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('int', []) : 6 | ['age', 'fnlwgt', 'education-num', 'capital-gain', 'capital-loss', ...]
('object', []) : 8 | ['workclass', 'education', 'marital-status', 'occupation', 'relationship', ...]
Types of features in processed data (raw dtype, special dtypes):
('category', []) : 7 | ['workclass', 'education', 'marital-status', 'occupation', 'relationship', ...]
('int', []) : 6 | ['age', 'fnlwgt', 'education-num', 'capital-gain', 'capital-loss', ...]
('int', ['bool']) : 1 | ['sex']
0.1s = Fit runtime
14 features in original data used to generate 14 features in processed data.
Train Data (Processed) Memory Usage: 0.03 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.08s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
To change this, specify the eval_metric parameter of Predictor()
Automatically generating train/validation split with holdout_frac=0.2, Train Rows: 400, Val Rows: 100
User-specified model hyperparameters to be fit:
{
'NN_TORCH': [{}],
'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, {'learning_rate': 0.03, 'num_leaves': 128, 'feature_fraction': 0.9, 'min_data_in_leaf': 3, 'ag_args': {'name_suffix': 'Large', 'priority': 0, 'hyperparameter_tune_kwargs': None}}],
'CAT': [{}],
'XGB': [{}],
'FASTAI': [{}],
'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}],
}
Fitting 13 L1 models, fit_strategy="sequential" ...
Fitting model: KNeighborsUnif ...
0.73 = Validation score (accuracy)
0.04s = Training runtime
0.01s = Validation runtime
Fitting model: KNeighborsDist ...
0.65 = Validation score (accuracy)
0.01s = Training runtime
0.01s = Validation runtime
Fitting model: LightGBMXT ...
0.83 = Validation score (accuracy)
0.24s = Training runtime
0.0s = Validation runtime
Fitting model: LightGBM ...
0.85 = Validation score (accuracy)
0.21s = Training runtime
0.0s = Validation runtime
Fitting model: RandomForestGini ...
0.84 = Validation score (accuracy)
0.63s = Training runtime
0.05s = Validation runtime
Fitting model: RandomForestEntr ...
0.83 = Validation score (accuracy)
0.53s = Training runtime
0.05s = Validation runtime
Fitting model: CatBoost ...
0.85 = Validation score (accuracy)
0.8s = Training runtime
0.0s = Validation runtime
Fitting model: ExtraTreesGini ...
0.82 = Validation score (accuracy)
0.54s = Training runtime
0.05s = Validation runtime
Fitting model: ExtraTreesEntr ...
0.81 = Validation score (accuracy)
0.51s = Training runtime
0.05s = Validation runtime
Fitting model: NeuralNetFastAI ...
0.84 = Validation score (accuracy)
2.66s = Training runtime
0.01s = Validation runtime
Fitting model: XGBoost ...
0.86 = Validation score (accuracy)
0.25s = Training runtime
0.01s = Validation runtime
Fitting model: NeuralNetTorch ...
0.83 = Validation score (accuracy)
2.16s = Training runtime
0.01s = Validation runtime
Fitting model: LightGBMLarge ...
0.83 = Validation score (accuracy)
0.5s = Training runtime
0.0s = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
Ensemble Weights: {'XGBoost': 1.0}
0.86 = Validation score (accuracy)
0.1s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 9.66s ... Best model: WeightedEnsemble_L2 | Estimated inference throughput: 14568.6 rows/s (100 batch size)
Disabling decision threshold calibration for metric `accuracy` due to having fewer than 10000 rows of validation data for calibration, to avoid overfitting (100 rows).
`accuracy` is generally not improved through threshold calibration. Force calibration via specifying `calibrate_decision_threshold=True`.
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("/home/ci/autogluon/docs/tutorials/tabular/AutogluonModels/ag-20241127_095251")
That's it! We now have a TabularPredictor that is able to make predictions on new data.
Prediction¶
Next, load separate test data to demonstrate how to make predictions on new examples at inference time:
test_data = TabularDataset('https://autogluon.s3.amazonaws.com/datasets/Inc/test.csv')
test_data.head()
Loaded data from: https://autogluon.s3.amazonaws.com/datasets/Inc/test.csv | Columns = 15 / 15 | Rows = 9769 -> 9769
| | age | workclass | fnlwgt | education | education-num | marital-status | occupation | relationship | race | sex | capital-gain | capital-loss | hours-per-week | native-country | class |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 31 | Private | 169085 | 11th | 7 | Married-civ-spouse | Sales | Wife | White | Female | 0 | 0 | 20 | United-States | <=50K |
| 1 | 17 | Self-emp-not-inc | 226203 | 12th | 8 | Never-married | Sales | Own-child | White | Male | 0 | 0 | 45 | United-States | <=50K |
| 2 | 47 | Private | 54260 | Assoc-voc | 11 | Married-civ-spouse | Exec-managerial | Husband | White | Male | 0 | 1887 | 60 | United-States | >50K |
| 3 | 21 | Private | 176262 | Some-college | 10 | Never-married | Exec-managerial | Own-child | White | Female | 0 | 0 | 30 | United-States | <=50K |
| 4 | 17 | Private | 241185 | 12th | 8 | Never-married | Prof-specialty | Own-child | White | Male | 0 | 0 | 20 | United-States | <=50K |
We can now use our trained models to make predictions on the new data:
y_pred = predictor.predict(test_data)
y_pred.head() # Predictions
0 <=50K
1 <=50K
2 >50K
3 <=50K
4 <=50K
Name: class, dtype: object
y_pred_proba = predictor.predict_proba(test_data)
y_pred_proba.head() # Prediction Probabilities
| | <=50K | >50K |
|---|---|---|
| 0 | 0.981126 | 0.018874 |
| 1 | 0.983599 | 0.016401 |
| 2 | 0.478133 | 0.521867 |
| 3 | 0.994751 | 0.005249 |
| 4 | 0.988539 | 0.011461 |
Evaluation¶
Next, we can evaluate the predictor on the (labeled) test data:
predictor.evaluate(test_data)
{'accuracy': 0.8409253761899887,
'balanced_accuracy': 0.7475663839529563,
'mcc': 0.5345297121913682,
'roc_auc': 0.884716037791454,
'f1': 0.6296472831267874,
'precision': 0.7034078807241747,
'recall': 0.5698878343399483}
We can also evaluate each model individually:
predictor.leaderboard(test_data)
| | model | score_test | score_val | eval_metric | pred_time_test | pred_time_val | fit_time | pred_time_test_marginal | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | RandomForestGini | 0.842870 | 0.84 | accuracy | 0.109537 | 0.046481 | 0.631930 | 0.109537 | 0.046481 | 0.631930 | 1 | True | 5 |
| 1 | CatBoost | 0.842461 | 0.85 | accuracy | 0.007121 | 0.003620 | 0.797609 | 0.007121 | 0.003620 | 0.797609 | 1 | True | 7 |
| 2 | RandomForestEntr | 0.841130 | 0.83 | accuracy | 0.103722 | 0.046747 | 0.530128 | 0.103722 | 0.046747 | 0.530128 | 1 | True | 6 |
| 3 | XGBoost | 0.840925 | 0.86 | accuracy | 0.064755 | 0.006139 | 0.247344 | 0.064755 | 0.006139 | 0.247344 | 1 | True | 11 |
| 4 | WeightedEnsemble_L2 | 0.840925 | 0.86 | accuracy | 0.066142 | 0.006864 | 0.348070 | 0.001387 | 0.000725 | 0.100726 | 2 | True | 14 |
| 5 | LightGBM | 0.839799 | 0.85 | accuracy | 0.020512 | 0.003161 | 0.205523 | 0.020512 | 0.003161 | 0.205523 | 1 | True | 4 |
| 6 | LightGBMXT | 0.836421 | 0.83 | accuracy | 0.011375 | 0.003438 | 0.235818 | 0.011375 | 0.003438 | 0.235818 | 1 | True | 3 |
| 7 | ExtraTreesEntr | 0.833862 | 0.81 | accuracy | 0.088769 | 0.047089 | 0.511623 | 0.088769 | 0.047089 | 0.511623 | 1 | True | 9 |
| 8 | ExtraTreesGini | 0.833862 | 0.82 | accuracy | 0.096716 | 0.046634 | 0.535834 | 0.096716 | 0.046634 | 0.535834 | 1 | True | 8 |
| 9 | NeuralNetTorch | 0.833657 | 0.83 | accuracy | 0.046826 | 0.009761 | 2.163448 | 0.046826 | 0.009761 | 2.163448 | 1 | True | 12 |
| 10 | NeuralNetFastAI | 0.828949 | 0.84 | accuracy | 0.141892 | 0.012233 | 2.663757 | 0.141892 | 0.012233 | 2.663757 | 1 | True | 10 |
| 11 | LightGBMLarge | 0.817074 | 0.83 | accuracy | 0.012514 | 0.003539 | 0.503972 | 0.012514 | 0.003539 | 0.503972 | 1 | True | 13 |
| 12 | KNeighborsUnif | 0.725970 | 0.73 | accuracy | 0.015962 | 0.014863 | 0.035771 | 0.015962 | 0.014863 | 0.035771 | 1 | True | 1 |
| 13 | KNeighborsDist | 0.695158 | 0.65 | accuracy | 0.033554 | 0.013386 | 0.010074 | 0.033554 | 0.013386 | 0.010074 | 1 | True | 2 |
Loading a Trained Predictor¶
Finally, we can load the predictor in a new session (or new machine) by calling TabularPredictor.load() and specifying the location of the predictor artifact on disk.
predictor.path # The path on disk where the predictor is saved
'/home/ci/autogluon/docs/tutorials/tabular/AutogluonModels/ag-20241127_095251'
# Load the predictor by specifying the path it is saved to on disk.
# You can control where it is saved to by setting the `path` parameter during init
predictor = TabularPredictor.load(predictor.path)
Warning
TabularPredictor.load() uses the pickle module implicitly, which is known to be insecure. It is possible to construct malicious pickle data that will execute arbitrary code during unpickling. Never load data that could have come from an untrusted source, or that could have been tampered with. Only load data you trust.
Now you're ready to try AutoGluon on your own tabular datasets! As long as they're stored in a popular format like CSV, you should be able to achieve strong predictive performance with just two lines of code:
from autogluon.tabular import TabularPredictor
predictor = TabularPredictor(label=<variable-name>).fit(train_data=<file-name>)
Note: This simple call to TabularPredictor.fit() is intended for your first prototype model. In a subsequent section, we'll demonstrate how to maximize predictive performance by additionally specifying the presets parameter to fit() and the eval_metric parameter to TabularPredictor().
Description of fit()¶
Here we discuss what happened during fit().
Since there are only two possible values of the class variable, this was a binary classification problem, for which an appropriate performance metric is accuracy. AutoGluon automatically infers this, as well as the type of each feature (i.e., which columns contain continuous numbers vs. discrete categories). AutoGluon can also automatically handle common issues like missing data and rescaling feature values.
We did not specify separate validation data, so AutoGluon automatically chose a random training/validation split of the data. The data used for validation is separated from the training data and is used to determine the models and hyperparameter values that produce the best results. Rather than just a single model, AutoGluon trains multiple models and ensembles them together to obtain superior predictive performance.
By default, AutoGluon tries to fit various types of models, including neural networks and tree ensembles. Each type of model has various hyperparameters, which traditionally the user would have to specify by hand. AutoGluon automates this process.
AutoGluon automatically and iteratively tests hyperparameter values to produce the best performance on the validation data. This involves repeatedly training models under different hyperparameter settings and evaluating their performance. This process can be computationally intensive, so fit() parallelizes it across multiple threads using Ray. To control runtimes, you can specify various arguments in fit(), such as time_limit, as demonstrated in the subsequent In-Depth tutorial.
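For instance, a quick sketch of capping the training budget (time_limit is the same fit() argument used later in this tutorial; the 300-second value is an arbitrary illustration):
# Cap the entire fit() run at 5 minutes; AutoGluon budgets the remaining
# time across the models it trains.
predictor_quick = TabularPredictor(label=label).fit(train_data, time_limit=300)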
We can view what properties AutoGluon automatically inferred about our prediction task:
print("AutoGluon infers problem type is: ", predictor.problem_type)
print("AutoGluon identified the following types of features:")
print(predictor.feature_metadata)
AutoGluon infers problem type is: binary
AutoGluon identified the following types of features:
('category', []) : 7 | ['workclass', 'education', 'marital-status', 'occupation', 'relationship', ...]
('int', []) : 6 | ['age', 'fnlwgt', 'education-num', 'capital-gain', 'capital-loss', ...]
('int', ['bool']) : 1 | ['sex']
AutoGluon correctly recognized our prediction problem to be a binary classification task and decided that variables such as age should be represented as integers, whereas variables such as workclass should be represented as categorical objects. The feature_metadata attribute allows you to see the inferred data types of each predictive variable after preprocessing (this is its raw dtype; some features may also be associated with additional special dtypes if produced via feature engineering, e.g., numerical representations of a datetime/text column).
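If the inferred problem_type were ever wrong for your data, the training log above notes that it can be set explicitly when initializing the Predictor; a minimal sketch:
# Explicitly declare the task type instead of relying on inference
# (valid options per the log: 'binary', 'multiclass', 'regression', 'quantile').
predictor_explicit = TabularPredictor(label=label, problem_type='binary').fit(train_data)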
To transform the data into AutoGluon's internal representation, we can do the following:
test_data_transform = predictor.transform_features(test_data)
test_data_transform.head()
| | age | fnlwgt | education-num | sex | capital-gain | capital-loss | hours-per-week | workclass | education | marital-status | occupation | relationship | race | native-country |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 31 | 169085 | 7 | 0 | 0 | 0 | 20 | 3 | 1 | 1 | 10 | 5 | 4 | 14 |
| 1 | 17 | 226203 | 8 | 1 | 0 | 0 | 45 | 5 | 2 | 3 | 10 | 3 | 4 | 14 |
| 2 | 47 | 54260 | 11 | 1 | 0 | 1887 | 60 | 3 | 7 | 1 | 3 | 0 | 4 | 14 |
| 3 | 21 | 176262 | 10 | 0 | 0 | 0 | 30 | 3 | 13 | 3 | 3 | 3 | 4 | 14 |
| 4 | 17 | 241185 | 8 | 1 | 0 | 0 | 20 | 3 | 2 | 3 | 8 | 3 | 4 | 14 |
Notice how the data is purely numeric after preprocessing (although categorical features will still be treated as categorical downstream).
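To verify this, we can inspect the dtypes of the transformed frame (a small sketch using the test_data_transform object created above):
# Categorical columns keep a `category` dtype, so downstream models can still
# treat them as categorical even though their values are integer codes.
print(test_data_transform.dtypes)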
To better understand our trained predictor, we can estimate the overall importance of each feature via TabularPredictor.feature_importance():
predictor.feature_importance(test_data)
Computing feature importance via permutation shuffling for 14 features using 5000 rows with 5 shuffle sets...
5.14s = Expected runtime (1.03s per shuffle set)
2.17s = Actual runtime (Completed 5 of 5 shuffle sets)
| | importance | stddev | p_value | n | p99_high | p99_low |
|---|---|---|---|---|---|---|
| marital-status | 0.05080 | 0.003792 | 3.698489e-06 | 5 | 0.058608 | 0.042992 |
| capital-gain | 0.03852 | 0.002318 | 1.565361e-06 | 5 | 0.043292 | 0.033748 |
| education-num | 0.02968 | 0.001346 | 5.063512e-07 | 5 | 0.032452 | 0.026908 |
| age | 0.01500 | 0.002850 | 1.490440e-04 | 5 | 0.020867 | 0.009133 |
| hours-per-week | 0.01172 | 0.003974 | 1.369430e-03 | 5 | 0.019902 | 0.003538 |
| occupation | 0.00528 | 0.001803 | 1.406849e-03 | 5 | 0.008993 | 0.001567 |
| relationship | 0.00472 | 0.001154 | 3.967984e-04 | 5 | 0.007096 | 0.002344 |
| native-country | 0.00144 | 0.000654 | 3.959537e-03 | 5 | 0.002787 | 0.000093 |
| capital-loss | 0.00128 | 0.000415 | 1.155921e-03 | 5 | 0.002134 | 0.000426 |
| fnlwgt | 0.00108 | 0.002361 | 1.820562e-01 | 5 | 0.005940 | -0.003780 |
| sex | 0.00096 | 0.001090 | 6.012167e-02 | 5 | 0.003204 | -0.001284 |
| workclass | 0.00092 | 0.001635 | 1.383281e-01 | 5 | 0.004286 | -0.002446 |
| education | 0.00080 | 0.001463 | 1.442554e-01 | 5 | 0.003812 | -0.002212 |
| race | 0.00048 | 0.000559 | 6.352320e-02 | 5 | 0.001630 | -0.000670 |
The importance column is an estimate of how much the evaluation metric score would drop if that feature were removed from the data.
A negative importance value means that removing the feature and refitting would likely improve the results.
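For example, a small sketch that lists the features with negative estimated importance (feature_importance() returns the DataFrame shown above, indexed by feature name):
# Candidate features whose removal (followed by a refit) might improve the score.
fi = predictor.feature_importance(test_data)
negative_importance = fi[fi['importance'] < 0].index.tolist()
print(negative_importance)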
When we call predict(), AutoGluon automatically makes predictions with the model that displayed the best performance on validation data (i.e., the weighted ensemble).
predictor.model_best
'WeightedEnsemble_L2'
We can instead specify which model to use for predictions like this:
predictor.predict(test_data, model='LightGBM')
You can get the list of trained models via .leaderboard() or .model_names():
predictor.model_names()
['KNeighborsUnif',
'KNeighborsDist',
'LightGBMXT',
'LightGBM',
'RandomForestGini',
'RandomForestEntr',
'CatBoost',
'ExtraTreesGini',
'ExtraTreesEntr',
'NeuralNetFastAI',
'XGBoost',
'NeuralNetTorch',
'LightGBMLarge',
'WeightedEnsemble_L2']
The predictive performance scores above were based on a default evaluation metric (accuracy for binary classification). Performance in certain applications may be measured by metrics other than the ones AutoGluon optimizes for by default. If you know the metric that matters in your application, you should specify it via the eval_metric argument, as demonstrated in the next section.
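For instance, a hedged sketch of optimizing F1 instead of accuracy ('f1' is among the binary-classification metrics listed later in this tutorial):
# Model selection and ensembling will now target F1 on the validation data.
predictor_f1 = TabularPredictor(label=label, eval_metric='f1').fit(train_data)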
Presets¶
AutoGluon comes with a variety of presets that can be specified via the presets argument in the .fit call. medium_quality is used by default to encourage initial prototyping, but the other presets should be used for serious usage.
| Preset | Model Quality | Use Cases | Fit Time (Ideal) | Inference Time (Relative to medium_quality) | Disk Usage |
|---|---|---|---|---|---|
| best_quality | State-of-the-art (SOTA), much better than high_quality | When accuracy is what matters | 16x+ | 32x+ | 16x+ |
| high_quality | Better than good_quality | When a very powerful, portable solution with fast inference is required: large-scale batch inference | 16x+ | 4x | 2x |
| good_quality | Stronger than any other AutoML framework | When a powerful, highly portable solution with very fast inference is required: billion-scale batch inference, sub-100ms online inference, edge devices | 16x | 2x | 0.1x |
| medium_quality | Competitive with other top AutoML frameworks | Initial prototyping, establishing a performance baseline | 1x | 1x | 1x |
We recommend users start with medium_quality to get a sense of the problem and identify any data-related issues. If medium_quality is taking too long to train, consider subsampling the training data during this prototyping phase, as sketched below.
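A minimal sketch of that subsampling advice (the 5,000-row cap is an arbitrary illustration, not a value recommended by this tutorial):
# Prototype on a random subset so medium_quality fits finish quickly.
prototype_data = train_data.sample(n=min(len(train_data), 5000), random_state=0)
predictor_proto = TabularPredictor(label=label).fit(prototype_data, presets='medium_quality')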
Once you are comfortable, next try best_quality. Make sure to specify at least 16x the time_limit value that you used for medium_quality. Once finished, you should have a very powerful solution that is often stronger than medium_quality.
Make sure to consider holding out test data that AutoGluon never sees during training, so that you can verify the models perform as expected.
Once you have evaluated both best_quality and medium_quality, check whether either satisfies your needs. If neither does, consider trying high_quality and/or good_quality.
If none of the preset options satisfy your requirements, refer to Predicting Columns in a Table - In Depth for more advanced AutoGluon options.
Maximizing predictive performance¶
Note: You should not call fit() with entirely default arguments if you are benchmarking AutoGluon-Tabular or hoping to maximize its accuracy!
To get the best predictive accuracy with AutoGluon, you should generally use it like this:
time_limit = 60 # for quick demonstration only, you should set this to longest time you are willing to wait (in seconds)
metric = 'roc_auc' # specify your evaluation metric here
predictor = TabularPredictor(label, eval_metric=metric).fit(train_data, time_limit=time_limit, presets='best_quality')
(_ray_fit pid=7759) [1000] valid_set's binary_logloss: 0.270008
(_ray_fit pid=7759) [2000] valid_set's binary_logloss: 0.252973
No path specified. Models will be saved in: "AutogluonModels/ag-20241127_095304"
Verbosity: 2 (Standard Logging)
=================== System Info ===================
AutoGluon Version: 1.2b20241127
Python Version: 3.11.9
Operating System: Linux
Platform Machine: x86_64
Platform Version: #1 SMP Tue Sep 24 10:00:37 UTC 2024
CPU Count: 8
Memory Avail: 28.34 GB / 30.95 GB (91.6%)
Disk Space Avail: 213.36 GB / 255.99 GB (83.3%)
===================================================
Presets specified: ['best_quality']
Setting dynamic_stacking from 'auto' to True. Reason: Enable dynamic_stacking when use_bag_holdout is disabled. (use_bag_holdout=False)
Stack configuration (auto_stack=True): num_stack_levels=1, num_bag_folds=8, num_bag_sets=1
DyStack is enabled (dynamic_stacking=True). AutoGluon will try to determine whether the input data is affected by stacked overfitting and enable or disable stacking as a consequence.
This is used to identify the optimal `num_stack_levels` value. Copies of AutoGluon will be fit on subsets of the data. Then holdout validation data is used to detect stacked overfitting.
Running DyStack for up to 15s of the 60s of remaining time (25%).
Running DyStack sub-fit in a ray process to avoid memory leakage. Enabling ray logging (enable_ray_logging=True). Specify `ds_args={'enable_ray_logging': False}` if you experience logging issues.
2024-11-27 09:53:06,440 INFO worker.py:1810 -- Started a local Ray instance. View the dashboard at http://127.0.0.1:8265
Context path: "/home/ci/autogluon/docs/tutorials/tabular/AutogluonModels/ag-20241127_095304/ds_sub_fit/sub_fit_ho"
(_dystack pid=7331) Running DyStack sub-fit ...
(_dystack pid=7331) Beginning AutoGluon training ... Time limit = 12s
(_dystack pid=7331) AutoGluon will save models to "/home/ci/autogluon/docs/tutorials/tabular/AutogluonModels/ag-20241127_095304/ds_sub_fit/sub_fit_ho"
(_dystack pid=7331) Train Data Rows: 444
(_dystack pid=7331) Train Data Columns: 14
(_dystack pid=7331) Label Column: class
(_dystack pid=7331) Problem Type: binary
(_dystack pid=7331) Preprocessing data ...
(_dystack pid=7331) Selected class <--> label mapping: class 1 = >50K, class 0 = <=50K
(_dystack pid=7331) Note: For your binary classification, AutoGluon arbitrarily selected which label-value represents positive ( >50K) vs negative ( <=50K) class.
(_dystack pid=7331) To explicitly set the positive_class, either rename classes to 1 and 0, or specify positive_class in Predictor init.
(_dystack pid=7331) Using Feature Generators to preprocess the data ...
(_dystack pid=7331) Fitting AutoMLPipelineFeatureGenerator...
(_dystack pid=7331) Available Memory: 28461.80 MB
(_dystack pid=7331) Train Data (Original) Memory Usage: 0.25 MB (0.0% of available memory)
(_dystack pid=7331) Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
(_dystack pid=7331) Stage 1 Generators:
(_dystack pid=7331) Fitting AsTypeFeatureGenerator...
(_dystack pid=7331) Note: Converting 1 features to boolean dtype as they only contain 2 unique values.
(_dystack pid=7331) Stage 2 Generators:
(_dystack pid=7331) Fitting FillNaFeatureGenerator...
(_dystack pid=7331) Stage 3 Generators:
(_dystack pid=7331) Fitting IdentityFeatureGenerator...
(_dystack pid=7331) Fitting CategoryFeatureGenerator...
(_dystack pid=7331) Fitting CategoryMemoryMinimizeFeatureGenerator...
(_dystack pid=7331) Stage 4 Generators:
(_dystack pid=7331) Fitting DropUniqueFeatureGenerator...
(_dystack pid=7331) Stage 5 Generators:
(_dystack pid=7331) Fitting DropDuplicatesFeatureGenerator...
(_dystack pid=7331) Types of features in original data (raw dtype, special dtypes):
(_dystack pid=7331) ('int', []) : 6 | ['age', 'fnlwgt', 'education-num', 'capital-gain', 'capital-loss', ...]
(_dystack pid=7331) ('object', []) : 8 | ['workclass', 'education', 'marital-status', 'occupation', 'relationship', ...]
(_dystack pid=7331) Types of features in processed data (raw dtype, special dtypes):
(_dystack pid=7331) ('category', []) : 7 | ['workclass', 'education', 'marital-status', 'occupation', 'relationship', ...]
(_dystack pid=7331) ('int', []) : 6 | ['age', 'fnlwgt', 'education-num', 'capital-gain', 'capital-loss', ...]
(_dystack pid=7331) ('int', ['bool']) : 1 | ['sex']
(_dystack pid=7331) 0.1s = Fit runtime
(_dystack pid=7331) 14 features in original data used to generate 14 features in processed data.
(_dystack pid=7331) Train Data (Processed) Memory Usage: 0.03 MB (0.0% of available memory)
(_dystack pid=7331) Data preprocessing and feature engineering runtime = 0.07s ...
(_dystack pid=7331) AutoGluon will gauge predictive performance using evaluation metric: 'roc_auc'
(_dystack pid=7331) This metric expects predicted probabilities rather than predicted class labels, so you'll need to use predict_proba() instead of predict()
(_dystack pid=7331) To change this, specify the eval_metric parameter of Predictor()
(_dystack pid=7331) Large model count detected (112 configs) ... Only displaying the first 3 models of each family. To see all, set `verbosity=3`.
(_dystack pid=7331) User-specified model hyperparameters to be fit:
(_dystack pid=7331) {
(_dystack pid=7331) 'NN_TORCH': [{}, {'activation': 'elu', 'dropout_prob': 0.10077639529843717, 'hidden_size': 108, 'learning_rate': 0.002735937344002146, 'num_layers': 4, 'use_batchnorm': True, 'weight_decay': 1.356433327634438e-12, 'ag_args': {'name_suffix': '_r79', 'priority': -2}}, {'activation': 'elu', 'dropout_prob': 0.11897478034205347, 'hidden_size': 213, 'learning_rate': 0.0010474382260641949, 'num_layers': 4, 'use_batchnorm': False, 'weight_decay': 5.594471067786272e-10, 'ag_args': {'name_suffix': '_r22', 'priority': -7}}],
(_dystack pid=7331) 'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, {'learning_rate': 0.03, 'num_leaves': 128, 'feature_fraction': 0.9, 'min_data_in_leaf': 3, 'ag_args': {'name_suffix': 'Large', 'priority': 0, 'hyperparameter_tune_kwargs': None}}],
(_dystack pid=7331) 'CAT': [{}, {'depth': 6, 'grow_policy': 'SymmetricTree', 'l2_leaf_reg': 2.1542798306067823, 'learning_rate': 0.06864209415792857, 'max_ctr_complexity': 4, 'one_hot_max_size': 10, 'ag_args': {'name_suffix': '_r177', 'priority': -1}}, {'depth': 8, 'grow_policy': 'Depthwise', 'l2_leaf_reg': 2.7997999596449104, 'learning_rate': 0.031375015734637225, 'max_ctr_complexity': 2, 'one_hot_max_size': 3, 'ag_args': {'name_suffix': '_r9', 'priority': -5}}],
(_dystack pid=7331) 'XGB': [{}, {'colsample_bytree': 0.6917311125174739, 'enable_categorical': False, 'learning_rate': 0.018063876087523967, 'max_depth': 10, 'min_child_weight': 0.6028633586934382, 'ag_args': {'name_suffix': '_r33', 'priority': -8}}, {'colsample_bytree': 0.6628423832084077, 'enable_categorical': False, 'learning_rate': 0.08775715546881824, 'max_depth': 5, 'min_child_weight': 0.6294123374222513, 'ag_args': {'name_suffix': '_r89', 'priority': -16}}],
(_dystack pid=7331) 'FASTAI': [{}, {'bs': 256, 'emb_drop': 0.5411770367537934, 'epochs': 43, 'layers': [800, 400], 'lr': 0.01519848858318159, 'ps': 0.23782946566604385, 'ag_args': {'name_suffix': '_r191', 'priority': -4}}, {'bs': 2048, 'emb_drop': 0.05070411322605811, 'epochs': 29, 'layers': [200, 100], 'lr': 0.08974235041576624, 'ps': 0.10393466140748028, 'ag_args': {'name_suffix': '_r102', 'priority': -11}}],
(_dystack pid=7331) 'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
(_dystack pid=7331) 'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
(_dystack pid=7331) 'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}],
(_dystack pid=7331) }
(_dystack pid=7331) AutoGluon will fit 2 stack levels (L1 to L2) ...
(_dystack pid=7331) Fitting 110 L1 models, fit_strategy="sequential" ...
(_dystack pid=7331) Fitting model: KNeighborsUnif_BAG_L1 ... Training model for up to 8.20s of the 12.29s of remaining time.
(_dystack pid=7331) 0.5271 = Validation score (roc_auc)
(_dystack pid=7331) 0.0s = Training runtime
(_dystack pid=7331) 0.02s = Validation runtime
(_dystack pid=7331) Fitting model: KNeighborsDist_BAG_L1 ... Training model for up to 8.14s of the 12.23s of remaining time.
(_dystack pid=7331) 0.5389 = Validation score (roc_auc)
(_dystack pid=7331) 0.0s = Training runtime
(_dystack pid=7331) 0.01s = Validation runtime
(_dystack pid=7331) Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 8.11s of the 12.20s of remaining time.
(_dystack pid=7331) Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy (8 workers, per: cpus=1, gpus=0, memory=0.01%)
(_dystack pid=7331) 0.8895 = Validation score (roc_auc)
(_dystack pid=7331) 0.9s = Training runtime
(_dystack pid=7331) 0.06s = Validation runtime
(_dystack pid=7331) Fitting model: LightGBM_BAG_L1 ... Training model for up to 4.55s of the 8.65s of remaining time.
(_dystack pid=7331) Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy (8 workers, per: cpus=1, gpus=0, memory=0.01%)
(_dystack pid=7331) 0.8693 = Validation score (roc_auc)
(_dystack pid=7331) 0.72s = Training runtime
(_dystack pid=7331) 0.04s = Validation runtime
(_dystack pid=7331) Fitting model: RandomForestGini_BAG_L1 ... Training model for up to 1.10s of the 5.20s of remaining time.
(_dystack pid=7331) 0.8678 = Validation score (roc_auc)
(_dystack pid=7331) 1.02s = Training runtime
(_dystack pid=7331) 0.1s = Validation runtime
(_dystack pid=7331) Fitting model: WeightedEnsemble_L2 ... Training model for up to 12.30s of the 3.90s of remaining time.
(_dystack pid=7331) Ensemble Weights: {'LightGBMXT_BAG_L1': 0.87, 'RandomForestGini_BAG_L1': 0.13}
(_dystack pid=7331) 0.8904 = Validation score (roc_auc)
(_dystack pid=7331) 0.02s = Training runtime
(_dystack pid=7331) 0.0s = Validation runtime
(_dystack pid=7331) Fitting 108 L2 models, fit_strategy="sequential" ...
(_dystack pid=7331) Fitting model: LightGBMXT_BAG_L2 ... Training model for up to 3.85s of the 3.83s of remaining time.
(_dystack pid=7331) Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy (8 workers, per: cpus=1, gpus=0, memory=0.01%)
(_dystack pid=7331) 0.8789 = Validation score (roc_auc)
(_dystack pid=7331) 0.66s = Training runtime
(_dystack pid=7331) 0.05s = Validation runtime
(_dystack pid=7331) Fitting model: LightGBM_BAG_L2 ... Training model for up to 0.61s of the 0.59s of remaining time.
(_dystack pid=7331) Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy (8 workers, per: cpus=1, gpus=0, memory=0.01%)
(_dystack pid=7331) 0.8716 = Validation score (roc_auc)
(_dystack pid=7331) 0.9s = Training runtime
(_dystack pid=7331) 0.04s = Validation runtime
(_dystack pid=7331) Fitting model: WeightedEnsemble_L3 ... Training model for up to 12.30s of the -3.39s of remaining time.
(_dystack pid=7331) Ensemble Weights: {'LightGBMXT_BAG_L1': 0.87, 'RandomForestGini_BAG_L1': 0.13}
(_dystack pid=7331) 0.8904 = Validation score (roc_auc)
(_dystack pid=7331) 0.05s = Training runtime
(_dystack pid=7331) 0.0s = Validation runtime
(_dystack pid=7331) AutoGluon training complete, total runtime = 15.86s ... Best model: WeightedEnsemble_L2 | Estimated inference throughput: 778.4 rows/s (56 batch size)
(_dystack pid=7331) TabularPredictor saved. To load, use: predictor = TabularPredictor.load("/home/ci/autogluon/docs/tutorials/tabular/AutogluonModels/ag-20241127_095304/ds_sub_fit/sub_fit_ho")
(_dystack pid=7331) Deleting DyStack predictor artifacts (clean_up_fits=True) ...
Leaderboard on holdout data (DyStack):
model score_holdout score_val eval_metric pred_time_test pred_time_val fit_time pred_time_test_marginal pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 RandomForestGini_BAG_L1 0.928455 0.867773 roc_auc 0.064141 0.099894 1.021615 0.064141 0.099894 1.021615 1 True 5
1 LightGBMXT_BAG_L2 0.921951 0.878884 roc_auc 0.404826 0.219545 2.584296 0.044990 0.047225 0.656554 2 True 7
2 WeightedEnsemble_L3 0.921951 0.890355 roc_auc 0.345656 0.159769 1.977327 0.001909 0.000580 0.052645 3 True 9
3 WeightedEnsemble_L2 0.921951 0.890355 roc_auc 0.346192 0.159604 1.947119 0.002445 0.000415 0.022437 2 True 6
4 LightGBMXT_BAG_L1 0.918699 0.889480 roc_auc 0.279606 0.059295 0.903067 0.279606 0.059295 0.903067 1 True 3
5 LightGBM_BAG_L2 0.912195 0.871644 roc_auc 0.397977 0.209903 2.823818 0.038140 0.037583 0.896076 2 True 8
6 LightGBM_BAG_L1 0.897561 0.869264 roc_auc 0.039595 0.036889 0.722275 0.039595 0.036889 0.722275 1 True 4
7 KNeighborsUnif_BAG_L1 0.573171 0.527070 roc_auc 0.098947 0.016106 0.003523 0.098947 0.016106 0.003523 1 True 1
8 KNeighborsDist_BAG_L1 0.556098 0.538940 roc_auc 0.016090 0.013130 0.003059 0.016090 0.013130 0.003059 1 True 2
1 = Optimal num_stack_levels (Stacked Overfitting Occurred: False)
21s = DyStack runtime | 39s = Remaining runtime
Starting main fit with num_stack_levels=1.
For future fit calls on this dataset, you can skip DyStack to save time: `predictor.fit(..., dynamic_stacking=False, num_stack_levels=1)`
Beginning AutoGluon training ... Time limit = 39s
AutoGluon will save models to "/home/ci/autogluon/docs/tutorials/tabular/AutogluonModels/ag-20241127_095304"
Train Data Rows: 500
Train Data Columns: 14
Label Column: class
Problem Type: binary
Preprocessing data ...
Selected class <--> label mapping: class 1 = >50K, class 0 = <=50K
Note: For your binary classification, AutoGluon arbitrarily selected which label-value represents positive ( >50K) vs negative ( <=50K) class.
To explicitly set the positive_class, either rename classes to 1 and 0, or specify positive_class in Predictor init.
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 28407.82 MB
Train Data (Original) Memory Usage: 0.28 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Note: Converting 1 features to boolean dtype as they only contain 2 unique values.
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Stage 5 Generators:
Fitting DropDuplicatesFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('int', []) : 6 | ['age', 'fnlwgt', 'education-num', 'capital-gain', 'capital-loss', ...]
('object', []) : 8 | ['workclass', 'education', 'marital-status', 'occupation', 'relationship', ...]
Types of features in processed data (raw dtype, special dtypes):
('category', []) : 7 | ['workclass', 'education', 'marital-status', 'occupation', 'relationship', ...]
('int', []) : 6 | ['age', 'fnlwgt', 'education-num', 'capital-gain', 'capital-loss', ...]
('int', ['bool']) : 1 | ['sex']
0.1s = Fit runtime
14 features in original data used to generate 14 features in processed data.
Train Data (Processed) Memory Usage: 0.03 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.1s ...
AutoGluon will gauge predictive performance using evaluation metric: 'roc_auc'
This metric expects predicted probabilities rather than predicted class labels, so you'll need to use predict_proba() instead of predict()
To change this, specify the eval_metric parameter of Predictor()
Large model count detected (112 configs) ... Only displaying the first 3 models of each family. To see all, set `verbosity=3`.
User-specified model hyperparameters to be fit:
{
'NN_TORCH': [{}, {'activation': 'elu', 'dropout_prob': 0.10077639529843717, 'hidden_size': 108, 'learning_rate': 0.002735937344002146, 'num_layers': 4, 'use_batchnorm': True, 'weight_decay': 1.356433327634438e-12, 'ag_args': {'name_suffix': '_r79', 'priority': -2}}, {'activation': 'elu', 'dropout_prob': 0.11897478034205347, 'hidden_size': 213, 'learning_rate': 0.0010474382260641949, 'num_layers': 4, 'use_batchnorm': False, 'weight_decay': 5.594471067786272e-10, 'ag_args': {'name_suffix': '_r22', 'priority': -7}}],
'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, {'learning_rate': 0.03, 'num_leaves': 128, 'feature_fraction': 0.9, 'min_data_in_leaf': 3, 'ag_args': {'name_suffix': 'Large', 'priority': 0, 'hyperparameter_tune_kwargs': None}}],
'CAT': [{}, {'depth': 6, 'grow_policy': 'SymmetricTree', 'l2_leaf_reg': 2.1542798306067823, 'learning_rate': 0.06864209415792857, 'max_ctr_complexity': 4, 'one_hot_max_size': 10, 'ag_args': {'name_suffix': '_r177', 'priority': -1}}, {'depth': 8, 'grow_policy': 'Depthwise', 'l2_leaf_reg': 2.7997999596449104, 'learning_rate': 0.031375015734637225, 'max_ctr_complexity': 2, 'one_hot_max_size': 3, 'ag_args': {'name_suffix': '_r9', 'priority': -5}}],
'XGB': [{}, {'colsample_bytree': 0.6917311125174739, 'enable_categorical': False, 'learning_rate': 0.018063876087523967, 'max_depth': 10, 'min_child_weight': 0.6028633586934382, 'ag_args': {'name_suffix': '_r33', 'priority': -8}}, {'colsample_bytree': 0.6628423832084077, 'enable_categorical': False, 'learning_rate': 0.08775715546881824, 'max_depth': 5, 'min_child_weight': 0.6294123374222513, 'ag_args': {'name_suffix': '_r89', 'priority': -16}}],
'FASTAI': [{}, {'bs': 256, 'emb_drop': 0.5411770367537934, 'epochs': 43, 'layers': [800, 400], 'lr': 0.01519848858318159, 'ps': 0.23782946566604385, 'ag_args': {'name_suffix': '_r191', 'priority': -4}}, {'bs': 2048, 'emb_drop': 0.05070411322605811, 'epochs': 29, 'layers': [200, 100], 'lr': 0.08974235041576624, 'ps': 0.10393466140748028, 'ag_args': {'name_suffix': '_r102', 'priority': -11}}],
'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}],
}
AutoGluon will fit 2 stack levels (L1 to L2) ...
Fitting 110 L1 models, fit_strategy="sequential" ...
Fitting model: KNeighborsUnif_BAG_L1 ... Training model for up to 26.10s of the 39.15s of remaining time.
0.5196 = Validation score (roc_auc)
0.0s = Training runtime
0.02s = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ... Training model for up to 26.07s of the 39.12s of remaining time.
0.537 = Validation score (roc_auc)
0.0s = Training runtime
0.01s = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 26.05s of the 39.09s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy (8 workers, per: cpus=1, gpus=0, memory=0.01%)
0.8912 = Validation score (roc_auc)
0.76s = Training runtime
0.04s = Validation runtime
Fitting model: LightGBM_BAG_L1 ... Training model for up to 22.69s of the 35.74s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy (8 workers, per: cpus=1, gpus=0, memory=0.01%)
0.8799 = Validation score (roc_auc)
1.03s = Training runtime
0.04s = Validation runtime
Fitting model: RandomForestGini_BAG_L1 ... Training model for up to 18.74s of the 31.79s of remaining time.
0.8879 = Validation score (roc_auc)
0.79s = Training runtime
0.09s = Validation runtime
Fitting model: RandomForestEntr_BAG_L1 ... Training model for up to 17.83s of the 30.88s of remaining time.
0.8899 = Validation score (roc_auc)
0.55s = Training runtime
0.1s = Validation runtime
Fitting model: CatBoost_BAG_L1 ... Training model for up to 17.16s of the 30.21s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy (8 workers, per: cpus=1, gpus=0, memory=0.02%)
0.8902 = Validation score (roc_auc)
5.66s = Training runtime
0.04s = Validation runtime
Fitting model: ExtraTreesGini_BAG_L1 ... Training model for up to 8.90s of the 21.94s of remaining time.
0.8958 = Validation score (roc_auc)
0.59s = Training runtime
0.1s = Validation runtime
Fitting model: ExtraTreesEntr_BAG_L1 ... Training model for up to 8.19s of the 21.23s of remaining time.
0.8904 = Validation score (roc_auc)
0.54s = Training runtime
0.1s = Validation runtime
Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 7.52s of the 20.57s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy (8 workers, per: cpus=1, gpus=0, memory=0.00%)
0.8701 = Validation score (roc_auc)
4.76s = Training runtime
0.11s = Validation runtime
Fitting model: XGBoost_BAG_L1 ... Training model for up to 0.20s of the 13.25s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy (8 workers, per: cpus=1, gpus=0, memory=0.02%)
0.8497 = Validation score (roc_auc)
0.89s = Training runtime
0.06s = Validation runtime
Fitting model: WeightedEnsemble_L2 ... Training model for up to 39.16s of the 8.61s of remaining time.
Ensemble Weights: {'ExtraTreesGini_BAG_L1': 0.35, 'XGBoost_BAG_L1': 0.25, 'LightGBMXT_BAG_L1': 0.15, 'CatBoost_BAG_L1': 0.15, 'NeuralNetFastAI_BAG_L1': 0.1}
0.9044 = Validation score (roc_auc)
0.07s = Training runtime
0.0s = Validation runtime
Fitting 108 L2 models, fit_strategy="sequential" ...
Fitting model: LightGBMXT_BAG_L2 ... Training model for up to 8.52s of the 8.47s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy (8 workers, per: cpus=1, gpus=0, memory=0.02%)
0.8851 = Validation score (roc_auc)
0.74s = Training runtime
0.04s = Validation runtime
Fitting model: LightGBM_BAG_L2 ... Training model for up to 4.92s of the 4.87s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy (8 workers, per: cpus=1, gpus=0, memory=0.02%)
0.874 = Validation score (roc_auc)
1.0s = Training runtime
0.04s = Validation runtime
Fitting model: RandomForestGini_BAG_L2 ... Training model for up to 0.88s of the 0.83s of remaining time.
0.8701 = Validation score (roc_auc)
0.8s = Training runtime
0.1s = Validation runtime
Fitting model: WeightedEnsemble_L3 ... Training model for up to 39.16s of the -0.23s of remaining time.
Ensemble Weights: {'ExtraTreesGini_BAG_L1': 0.35, 'XGBoost_BAG_L1': 0.25, 'LightGBMXT_BAG_L1': 0.15, 'CatBoost_BAG_L1': 0.15, 'NeuralNetFastAI_BAG_L1': 0.1}
0.9044 = Validation score (roc_auc)
0.05s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 39.55s ... Best model: WeightedEnsemble_L2 | Estimated inference throughput: 244.1 rows/s (63 batch size)
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("/home/ci/autogluon/docs/tutorials/tabular/AutogluonModels/ag-20241127_095304")
predictor.leaderboard(test_data)
| | model | score_test | score_val | eval_metric | pred_time_test | pred_time_val | fit_time | pred_time_test_marginal | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | CatBoost_BAG_L1 | 0.902618 | 0.890228 | roc_auc | 0.048916 | 0.039893 | 5.664868 | 0.048916 | 0.039893 | 5.664868 | 1 | True | 7 |
| 1 | LightGBMXT_BAG_L1 | 0.900085 | 0.891223 | roc_auc | 0.233182 | 0.042220 | 0.760008 | 0.233182 | 0.042220 | 0.760008 | 1 | True | 3 |
| 2 | LightGBMXT_BAG_L2 | 0.899894 | 0.885114 | roc_auc | 2.182979 | 0.499714 | 13.958774 | 0.099028 | 0.040373 | 0.740772 | 2 | True | 13 |
| 3 | WeightedEnsemble_L3 | 0.898884 | 0.904373 | roc_auc | 1.950996 | 0.346836 | 12.708249 | 0.002881 | 0.000468 | 0.047544 | 3 | True | 16 |
| 4 | WeightedEnsemble_L2 | 0.898884 | 0.904373 | roc_auc | 1.951030 | 0.347147 | 12.733692 | 0.002915 | 0.000780 | 0.072987 | 2 | True | 12 |
| 5 | RandomForestGini_BAG_L2 | 0.893446 | 0.870096 | roc_auc | 2.181535 | 0.558684 | 14.013689 | 0.097584 | 0.099343 | 0.795686 | 2 | True | 15 |
| 6 | LightGBM_BAG_L2 | 0.890652 | 0.873962 | roc_auc | 2.166814 | 0.503379 | 14.220818 | 0.082863 | 0.044038 | 1.002816 | 2 | True | 14 |
| 7 | LightGBM_BAG_L1 | 0.889478 | 0.879878 | roc_auc | 0.125779 | 0.039593 | 1.027498 | 0.125779 | 0.039593 | 1.027498 | 1 | True | 4 |
| 8 | RandomForestEntr_BAG_L1 | 0.886981 | 0.889863 | roc_auc | 0.109879 | 0.099917 | 0.554269 | 0.109879 | 0.099917 | 0.554269 | 1 | True | 6 |
| 9 | RandomForestGini_BAG_L1 | 0.885163 | 0.887874 | roc_auc | 0.108916 | 0.094058 | 0.790370 | 0.108916 | 0.094058 | 0.790370 | 1 | True | 5 |
| 10 | NeuralNetFastAI_BAG_L1 | 0.883602 | 0.870056 | roc_auc | 1.084143 | 0.105842 | 4.759086 | 1.084143 | 0.105842 | 4.759086 | 1 | True | 10 |
| 11 | ExtraTreesEntr_BAG_L1 | 0.880342 | 0.890401 | roc_auc | 0.099039 | 0.100863 | 0.540545 | 0.099039 | 0.100863 | 0.540545 | 1 | True | 9 |
| 12 | ExtraTreesGini_BAG_L1 | 0.879143 | 0.895789 | roc_auc | 0.090939 | 0.101091 | 0.589453 | 0.090939 | 0.101091 | 0.589453 | 1 | True | 8 |
| 13 | XGBoost_BAG_L1 | 0.864328 | 0.849670 | roc_auc | 0.490935 | 0.057321 | 0.887290 | 0.490935 | 0.057321 | 0.887290 | 1 | True | 11 |
| 14 | KNeighborsDist_BAG_L1 | 0.525998 | 0.536956 | roc_auc | 0.025956 | 0.013057 | 0.003028 | 0.025956 | 0.013057 | 0.003028 | 1 | True | 2 |
| 15 | KNeighborsUnif_BAG_L1 | 0.514970 | 0.519604 | roc_auc | 0.025847 | 0.015011 | 0.003547 | 0.025847 | 0.015011 | 0.003547 | 1 | True | 1 |
This command implements the following strategy to maximize accuracy:
- Specify the argument presets='best_quality', which allows AutoGluon to automatically construct powerful model ensembles based on stacking/bagging, and will greatly improve the resulting predictions if granted sufficient training time. The default value of presets is 'medium_quality', which produces less accurate models but facilitates faster prototyping. With presets, you can flexibly prioritize predictive accuracy vs. training/inference speed. For example, if you care less about predictive performance and want to quickly deploy a basic model, consider using presets=['good_quality', 'optimize_for_deployment'] (see the sketch after this list).
- Provide the eval_metric argument to TabularPredictor() if you know what metric will be used to evaluate predictions in your application. Some other non-default metrics you might use include: 'f1' (for binary classification), 'roc_auc' (for binary classification), 'log_loss' (for classification), 'mean_absolute_error' (for regression), 'median_absolute_error' (for regression). You can also define your own custom metric function. For more information refer to Adding a custom metric to AutoGluon.
- Include all your data in train_data and do not provide tuning_data (AutoGluon will split the data more intelligently to suit its needs).
- Do not specify the hyperparameter_tune_kwargs argument (counterintuitively, hyperparameter tuning is not the best way to spend a limited training time budget, as model ensembling is often superior). We recommend using hyperparameter_tune_kwargs only if your goal is to deploy a single model rather than an ensemble.
- Do not specify the hyperparameters argument (allow AutoGluon to adaptively select which models/hyperparameters to use).
- Set time_limit to the longest amount of time (in seconds) that you are willing to wait. AutoGluon's predictive performance improves the longer fit() runs.
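A hedged sketch of the deployment-oriented preset combination quoted in the first item above (the 600-second time_limit is an arbitrary illustration):
# Trade some accuracy for a smaller, faster-to-deploy artifact.
predictor_deploy = TabularPredictor(label=label, eval_metric='roc_auc').fit(
    train_data,
    presets=['good_quality', 'optimize_for_deployment'],
    time_limit=600,
)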
Regression (predicting numeric table columns):¶
To demonstrate that fit() can also automatically handle regression tasks, we now try to predict the numeric age variable in the same table based on its other features:
age_column = 'age'
train_data[age_column].head()
6118 51
23204 58
29590 40
18116 37
33964 62
Name: age, dtype: int64
We again call fit(), this time imposing a time limit (in seconds), and also demonstrate a shorthand method to evaluate the resulting model on the test data (which contain labels):
predictor_age = TabularPredictor(label=age_column, path="agModels-predictAge").fit(train_data, time_limit=60)
Verbosity: 2 (Standard Logging)
=================== System Info ===================
AutoGluon Version: 1.2b20241127
Python Version: 3.11.9
Operating System: Linux
Platform Machine: x86_64
Platform Version: #1 SMP Tue Sep 24 10:00:37 UTC 2024
CPU Count: 8
Memory Avail: 27.78 GB / 30.95 GB (89.8%)
Disk Space Avail: 213.32 GB / 255.99 GB (83.3%)
===================================================
No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets. Defaulting to `'medium'`...
Recommended Presets (For more details refer to https://auto.gluon.ai/stable/tutorials/tabular/tabular-essentials.html#presets):
presets='experimental' : New in v1.2: Pre-trained foundation model + parallel fits. The absolute best accuracy without consideration for inference speed. Does not support GPU.
presets='best' : Maximize accuracy. Recommended for most users. Use in competitions and benchmarks.
presets='high' : Strong accuracy with fast inference speed.
presets='good' : Good accuracy with very fast inference speed.
presets='medium' : Fast training time, ideal for initial prototyping.
Beginning AutoGluon training ... Time limit = 60s
AutoGluon will save models to "/home/ci/autogluon/docs/tutorials/tabular/agModels-predictAge"
Train Data Rows: 500
Train Data Columns: 14
Label Column: age
AutoGluon infers your prediction problem is: 'regression' (because dtype of label-column == int and many unique label-values observed).
Label info (max, min, mean, stddev): (85, 17, 39.652, 13.52393)
If 'regression' is not the correct problem_type, please manually specify the problem_type parameter during Predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression', 'quantile'])
Problem Type: regression
Preprocessing data ...
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 28449.23 MB
Train Data (Original) Memory Usage: 0.31 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Note: Converting 2 features to boolean dtype as they only contain 2 unique values.
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Stage 5 Generators:
Fitting DropDuplicatesFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('int', []) : 5 | ['fnlwgt', 'education-num', 'capital-gain', 'capital-loss', 'hours-per-week']
('object', []) : 9 | ['workclass', 'education', 'marital-status', 'occupation', 'relationship', ...]
Types of features in processed data (raw dtype, special dtypes):
('category', []) : 7 | ['workclass', 'education', 'marital-status', 'occupation', 'relationship', ...]
('int', []) : 5 | ['fnlwgt', 'education-num', 'capital-gain', 'capital-loss', 'hours-per-week']
('int', ['bool']) : 2 | ['sex', 'class']
0.1s = Fit runtime
14 features in original data used to generate 14 features in processed data.
Train Data (Processed) Memory Usage: 0.03 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.08s ...
AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'
This metric's sign has been flipped to adhere to being higher_is_better. The metric score can be multiplied by -1 to get the metric value.
To change this, specify the eval_metric parameter of Predictor()
Automatically generating train/validation split with holdout_frac=0.2, Train Rows: 400, Val Rows: 100
User-specified model hyperparameters to be fit:
{
'NN_TORCH': [{}],
'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, {'learning_rate': 0.03, 'num_leaves': 128, 'feature_fraction': 0.9, 'min_data_in_leaf': 3, 'ag_args': {'name_suffix': 'Large', 'priority': 0, 'hyperparameter_tune_kwargs': None}}],
'CAT': [{}],
'XGB': [{}],
'FASTAI': [{}],
'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}],
}
Fitting 11 L1 models, fit_strategy="sequential" ...
Fitting model: KNeighborsUnif ... Training model for up to 59.92s of the 59.92s of remaining time.
-15.6869 = Validation score (-root_mean_squared_error)
0.0s = Training runtime
0.01s = Validation runtime
Fitting model: KNeighborsDist ... Training model for up to 59.90s of the 59.90s of remaining time.
-15.1801 = Validation score (-root_mean_squared_error)
0.0s = Training runtime
0.01s = Validation runtime
Fitting model: LightGBMXT ... Training model for up to 59.88s of the 59.88s of remaining time.
-11.7092 = Validation score (-root_mean_squared_error)
0.32s = Training runtime
0.0s = Validation runtime
Fitting model: LightGBM ... Training model for up to 59.55s of the 59.55s of remaining time.
-11.9295 = Validation score (-root_mean_squared_error)
0.3s = Training runtime
0.0s = Validation runtime
Fitting model: RandomForestMSE ... Training model for up to 59.25s of the 59.25s of remaining time.
-11.6624 = Validation score (-root_mean_squared_error)
0.46s = Training runtime
0.05s = Validation runtime
Fitting model: CatBoost ... Training model for up to 58.73s of the 58.73s of remaining time.
-11.7993 = Validation score (-root_mean_squared_error)
0.64s = Training runtime
0.0s = Validation runtime
Fitting model: ExtraTreesMSE ... Training model for up to 58.08s of the 58.08s of remaining time.
-11.3627 = Validation score (-root_mean_squared_error)
0.41s = Training runtime
0.05s = Validation runtime
Fitting model: NeuralNetFastAI ... Training model for up to 57.61s of the 57.61s of remaining time.
-11.9445 = Validation score (-root_mean_squared_error)
0.62s = Training runtime
0.01s = Validation runtime
Fitting model: XGBoost ... Training model for up to 56.97s of the 56.97s of remaining time.
-11.5274 = Validation score (-root_mean_squared_error)
0.24s = Training runtime
0.01s = Validation runtime
Fitting model: NeuralNetTorch ... Training model for up to 56.72s of the 56.72s of remaining time.
-11.9806 = Validation score (-root_mean_squared_error)
2.53s = Training runtime
0.01s = Validation runtime
Fitting model: LightGBMLarge ... Training model for up to 54.17s of the 54.17s of remaining time.
-12.6926 = Validation score (-root_mean_squared_error)
0.61s = Training runtime
0.0s = Validation runtime
Fitting model: WeightedEnsemble_L2 ... Training model for up to 59.92s of the 53.55s of remaining time.
Ensemble Weights: {'ExtraTreesMSE': 0.48, 'XGBoost': 0.2, 'NeuralNetFastAI': 0.16, 'NeuralNetTorch': 0.16}
-11.1781 = Validation score (-root_mean_squared_error)
0.01s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 6.48s ... Best model: WeightedEnsemble_L2 | Estimated inference throughput: 1397.7 rows/s (100 batch size)
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("/home/ci/autogluon/docs/tutorials/tabular/agModels-predictAge")
predictor_age.evaluate(test_data)
{'root_mean_squared_error': -10.474578513117216,
'mean_squared_error': -109.71679502745684,
'mean_absolute_error': -8.232829588391127,
'r2': 0.41354238986968994,
'pearsonr': 0.6459009870838579,
'median_absolute_error': -6.8876953125}
Note that we didn't need to tell AutoGluon that this is a regression problem; it automatically inferred this from the data and reported the appropriate performance metric (RMSE by default). To specify a particular evaluation metric other than the default, set the eval_metric parameter of TabularPredictor() and AutoGluon will tailor its models to optimize that metric (e.g., eval_metric = 'mean_absolute_error'). For evaluation metrics where higher values are worse (like RMSE), AutoGluon flips their sign and prints them as negative values during training (since it internally assumes higher values are better). You can even specify a custom metric by following the Custom Metric tutorial.
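For example, a hedged sketch of refitting the age regressor to optimize MAE instead of the default RMSE (the name predictor_age_mae is purely illustrative):
# Validation scoring and ensembling will now target mean absolute error.
predictor_age_mae = TabularPredictor(
    label=age_column, eval_metric='mean_absolute_error'
).fit(train_data, time_limit=60)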
We can call leaderboard to see the performance of each individual model:
predictor_age.leaderboard(test_data)
| | model | score_test | score_val | eval_metric | pred_time_test | pred_time_val | fit_time | pred_time_test_marginal | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | WeightedEnsemble_L2 | -10.474579 | -11.178103 | root_mean_squared_error | 0.355629 | 0.071548 | 3.806020 | 0.002689 | 0.000299 | 0.010767 | 2 | True | 12 |
| 1 | ExtraTreesMSE | -10.655482 | -11.362738 | root_mean_squared_error | 0.116864 | 0.046738 | 0.408615 | 0.116864 | 0.046738 | 0.408615 | 1 | True | 7 |
| 2 | RandomForestMSE | -10.746175 | -11.662354 | root_mean_squared_error | 0.132652 | 0.046742 | 0.455130 | 0.132652 | 0.046742 | 0.455130 | 1 | True | 5 |
| 3 | CatBoost | -10.780312 | -11.799279 | root_mean_squared_error | 0.009605 | 0.003969 | 0.641279 | 0.009605 | 0.003969 | 0.641279 | 1 | True | 6 |
| 4 | LightGBMXT | -10.837373 | -11.709228 | root_mean_squared_error | 0.060193 | 0.003458 | 0.316422 | 0.060193 | 0.003458 | 0.316422 | 1 | True | 3 |
| 5 | XGBoost | -10.903558 | -11.527441 | root_mean_squared_error | 0.051195 | 0.005982 | 0.241874 | 0.051195 | 0.005982 | 0.241874 | 1 | True | 9 |
| 6 | LightGBM | -10.972156 | -11.929546 | root_mean_squared_error | 0.020822 | 0.002979 | 0.296139 | 0.020822 | 0.002979 | 0.296139 | 1 | True | 4 |
| 7 | NeuralNetTorch | -11.141885 | -11.980640 | root_mean_squared_error | 0.048881 | 0.009617 | 2.527405 | 0.048881 | 0.009617 | 2.527405 | 1 | True | 10 |
| 8 | NeuralNetFastAI | -11.343937 | -11.944539 | root_mean_squared_error | 0.135999 | 0.008912 | 0.617358 | 0.135999 | 0.008912 | 0.617358 | 1 | True | 8 |
| 9 | LightGBMLarge | -11.832441 | -12.692643 | root_mean_squared_error | 0.034621 | 0.003274 | 0.609077 | 0.034621 | 0.003274 | 0.609077 | 1 | True | 11 |
| 10 | KNeighborsUnif | -14.902058 | -15.686937 | root_mean_squared_error | 0.031393 | 0.013270 | 0.003544 | 0.031393 | 0.013270 | 0.003544 | 1 | True | 1 |
| 11 | KNeighborsDist | -15.771259 | -15.180149 | root_mean_squared_error | 0.026006 | 0.013186 | 0.003233 | 0.026006 | 0.013186 | 0.003233 | 1 | True | 2 |
Data Formats: AutoGluon can currently operate on data tables already loaded into Python as pandas DataFrames, or those stored in files of CSV format or Parquet format. If your data lives in multiple tables, you will first need to join them into a single table whose rows correspond to statistically independent observations (datapoints) and whose columns correspond to different features (aka variables/covariates), as sketched below.
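A minimal sketch of that join step with pandas (the table and column names are hypothetical, purely for illustration):
import pandas as pd

# Two hypothetical tables keyed on `customer_id`, merged into the single
# table expected by TabularPredictor (one row per independent observation).
customers = pd.DataFrame({'customer_id': [1, 2, 3], 'age': [34, 52, 29]})
purchases = pd.DataFrame({'customer_id': [1, 2, 3], 'total_spend': [120.5, 80.0, 310.2]})
joined = customers.merge(purchases, on='customer_id', how='left')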
Refer to the TabularPredictor documentation to see all of the available methods/options.
Advanced Usage¶
For more advanced usage examples of AutoGluon, refer to the In Depth Tutorial.
If you are interested in deployment optimization, refer to the Deployment Optimization tutorial.