.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/linear_model/plot_lasso_dense_vs_sparse_data.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_linear_model_plot_lasso_dense_vs_sparse_data.py>`
        to download the full example code or to run this example in your browser via Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_linear_model_plot_lasso_dense_vs_sparse_data.py:


==============================
Lasso on dense and sparse data
==============================

We show that ``linear_model.Lasso`` produces the same results whether the data
is stored in dense or sparse format, and that fitting is faster with the sparse
format when the data itself is sparse.

.. GENERATED FROM PYTHON SOURCE LINES 9-17

.. code-block:: Python


    from time import time

    from scipy import linalg, sparse

    from sklearn.datasets import make_regression
    from sklearn.linear_model import Lasso

.. GENERATED FROM PYTHON SOURCE LINES 18-22

Comparing the two Lasso implementations on dense data
-----------------------------------------------------

We create a linear regression problem that is suitable for the Lasso, that is
to say, one with more features than samples. We then store the data matrix in
both dense (the usual) and sparse format, and train a Lasso on each. We measure
the runtime of both fits and check that they learned the same model by
computing the Euclidean norm of the difference between their coefficients.
Because the data is dense, we expect the dense data format to give the better
runtime.

.. GENERATED FROM PYTHON SOURCE LINES 22-43

.. code-block:: Python


    X, y = make_regression(n_samples=200, n_features=5000, random_state=0)
    # create a copy of X in sparse format
    X_sp = sparse.coo_matrix(X)

    alpha = 1
    sparse_lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=1000)
    dense_lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=1000)

    t0 = time()
    sparse_lasso.fit(X_sp, y)
    print(f"Sparse Lasso done in {(time() - t0):.3f}s")

    t0 = time()
    dense_lasso.fit(X, y)
    print(f"Dense Lasso done in {(time() - t0):.3f}s")

    # compare the regression coefficients
    coeff_diff = linalg.norm(sparse_lasso.coef_ - dense_lasso.coef_)
    print(f"Distance between coefficients : {coeff_diff:.2e}")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Sparse Lasso done in 0.287s
    Dense Lasso done in 0.028s
    Distance between coefficients : 1.01e-13


.. GENERATED FROM PYTHON SOURCE LINES 44-48

Comparing the two Lasso implementations on sparse data
------------------------------------------------------

We make the previous problem sparse by replacing all small values with 0 and
run the same comparison as above. Because the data is now sparse, we expect the
implementation that uses the sparse data format to be faster.

.. GENERATED FROM PYTHON SOURCE LINES 48-76

.. code-block:: Python


    # make a copy of the previous data
    Xs = X.copy()
    # make Xs sparse by replacing the values lower than 2.5 with 0s
    Xs[Xs < 2.5] = 0.0
    # create a copy of Xs in sparse format
    Xs_sp = sparse.coo_matrix(Xs)
    Xs_sp = Xs_sp.tocsc()

    # compute the proportion of non-zero coefficients in the data matrix
    print(f"Matrix density : {(Xs_sp.nnz / X.size * 100):.3f}%")

    alpha = 0.1
    sparse_lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=10000)
    dense_lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=10000)

    t0 = time()
    sparse_lasso.fit(Xs_sp, y)
    print(f"Sparse Lasso done in {(time() - t0):.3f}s")

    t0 = time()
    dense_lasso.fit(Xs, y)
    print(f"Dense Lasso done in {(time() - t0):.3f}s")

    # compare the regression coefficients
    coeff_diff = linalg.norm(sparse_lasso.coef_ - dense_lasso.coef_)
    print(f"Distance between coefficients : {coeff_diff:.2e}")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Matrix density : 0.626%
    Sparse Lasso done in 0.347s
    Dense Lasso done in 0.687s
    Distance between coefficients : 8.06e-12


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 1.384 seconds)


.. _sphx_glr_download_auto_examples_linear_model_plot_lasso_dense_vs_sparse_data.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/main?urlpath=lab/tree/notebooks/auto_examples/linear_model/plot_lasso_dense_vs_sparse_data.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_lasso_dense_vs_sparse_data.ipynb <plot_lasso_dense_vs_sparse_data.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_lasso_dense_vs_sparse_data.py <plot_lasso_dense_vs_sparse_data.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_lasso_dense_vs_sparse_data.zip <plot_lasso_dense_vs_sparse_data.zip>`


.. include:: plot_lasso_dense_vs_sparse_data.recommendations


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
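
The runtimes reported in this example each come from a single call timed with
``time()``, so they can vary noticeably from run to run. A minimal sketch of one
way to steady such a comparison is to repeat the call and keep the shortest
measurement. The helper name ``best_of`` and its ``repeats`` parameter are
assumptions made for illustration; they are not part of scikit-learn or of the
original script, and the workload below is a stand-in for a ``fit`` call.

```python
from time import perf_counter


def best_of(fit_once, repeats=3):
    """Return the shortest wall-clock time over `repeats` calls of `fit_once`.

    `best_of` is a hypothetical helper, not a scikit-learn function.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    timings = []
    for _ in range(repeats):
        start = perf_counter()
        fit_once()
        timings.append(perf_counter() - start)
    return min(timings)


# Stand-in workload so the sketch runs without scikit-learn installed.
# In the example it could be something like `lambda: dense_lasso.fit(X, y)`.
calls = []
elapsed = best_of(lambda: calls.append(sum(range(10_000))), repeats=5)
print(f"Best of {len(calls)} runs: {elapsed:.6f}s")
```

Taking the minimum rather than the mean is a design choice: it discards runs
slowed down by unrelated system activity, which tends to matter when the
measured times are only a fraction of a second, as they are here.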