sparse_encode

sklearn.decomposition.sparse_encode(X, dictionary, *, gram=None, cov=None, algorithm='lasso_lars', n_nonzero_coefs=None, alpha=None, copy_cov=True, init=None, max_iter=1000, n_jobs=None, check_input=True, verbose=0, positive=False)

Sparse coding.

Each row of the result is the solution to a sparse coding problem. The goal is to find a sparse array code such that:

X ~= code * dictionary

See the User Guide for more information.
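As a minimal sketch of the relation X ~= code * dictionary, the snippet below encodes two samples against a small hand-written dictionary and checks the reconstruction. The data values and the tiny `alpha` are illustrative choices, not recommended settings.

```python
import numpy as np
from sklearn.decomposition import sparse_encode

# Two samples with three features each.
X = np.array([[-1.0, -1.0, -1.0], [0.0, 0.0, 3.0]])

# Five atoms (rows) with three features each.
dictionary = np.array(
    [[0, 1, 0], [-1, -1, 2], [1, 1, 1], [0, 1, 1], [0, 2, 1]],
    dtype=np.float64,
)

# A very small alpha makes the L1 penalty negligible, so the
# reconstruction should be close to exact.
code = sparse_encode(X, dictionary, alpha=1e-10)

print(code.shape)  # (n_samples, n_components)
print(np.round(code, 3))
print(np.allclose(code @ dictionary, X, atol=1e-6))
```

Each row of `code` holds the coefficients for the matching row of `X`, so `code @ dictionary` has the same shape as `X`.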

Parameters:

X : array-like of shape (n_samples, n_features)

Data matrix.

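dictionary : array-like of shape (n_components, n_features)

The dictionary matrix against which to solve the sparse coding of the data. Some of the algorithms assume normalized rows for meaningful output.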

gram : array-like of shape (n_components, n_components), default=None

Precomputed Gram matrix, dictionary * dictionary' .

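cov : array-like of shape (n_components, n_samples), default=None

Precomputed covariance, dictionary * X' (the product that yields the stated (n_components, n_samples) shape).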
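The sketch below precomputes both matrices and passes them in. It assumes `cov` is `dictionary @ X.T`, which is the product that matches the stated (n_components, n_samples) shape. The random data and the choice of `algorithm="omp"` are only for illustration.

```python
import numpy as np
from sklearn.decomposition import sparse_encode

rng = np.random.RandomState(0)
X = rng.randn(6, 8)            # (n_samples, n_features)
dictionary = rng.randn(5, 8)   # (n_components, n_features)
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)

gram = dictionary @ dictionary.T   # (n_components, n_components)
cov = dictionary @ X.T             # (n_components, n_samples)

code_plain = sparse_encode(X, dictionary, algorithm="omp", n_nonzero_coefs=2)
code_pre = sparse_encode(
    X, dictionary, gram=gram, cov=cov, algorithm="omp", n_nonzero_coefs=2
)

# Both calls should produce the same codes.
print(np.allclose(code_plain, code_pre))
```

Precomputing is mainly useful when the same dictionary (and possibly the same data) is encoded repeatedly, since the products can then be reused.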

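algorithm : {'lasso_lars', 'lasso_cd', 'lars', 'omp', 'threshold'}, default='lasso_lars'

The algorithm used:

'lars' : uses the least angle regression method ( linear_model.lars_path );
'lasso_lars' : uses Lars to compute the Lasso solution;
'lasso_cd' : uses the coordinate descent method to compute the Lasso solution ( linear_model.Lasso ). lasso_lars will be faster if the estimated components are sparse;
'omp' : uses orthogonal matching pursuit to estimate the sparse solution;
'threshold' : squashes to zero all coefficients of the projection dictionary * data' that are smaller than the regularization.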
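To compare the options listed above, the following sketch runs every algorithm on the same random data and reports the number of nonzero coefficients per sample. The regularization values in `settings` are arbitrary illustrative choices.

```python
import numpy as np
from sklearn.decomposition import sparse_encode

rng = np.random.RandomState(0)
X = rng.randn(4, 10)
dictionary = rng.randn(6, 10)
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)

# Illustrative settings: 'lars' and 'omp' take a target number of nonzero
# coefficients, the other algorithms take alpha.
settings = {
    "lars": {"n_nonzero_coefs": 2},
    "lasso_lars": {"alpha": 0.5},
    "lasso_cd": {"alpha": 0.5},
    "omp": {"n_nonzero_coefs": 2},
    "threshold": {"alpha": 0.5},
}

codes = {}
for name, kwargs in settings.items():
    codes[name] = sparse_encode(X, dictionary, algorithm=name, **kwargs)
    print(name, codes[name].shape, np.count_nonzero(codes[name], axis=1))

# For 'threshold', projections whose magnitude is below alpha end up as zero.
projection = X @ dictionary.T
small = np.abs(projection) < 0.5
print(np.all(codes["threshold"][small] == 0))
```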

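n_nonzero_coefs : int, default=None

Number of nonzero coefficients to target in each column of the solution. This is only used by algorithm='lars' and algorithm='omp' , and is overridden by alpha in the omp case. If None , then n_nonzero_coefs=int(n_features / 10) .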
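A short sketch of how n_nonzero_coefs bounds the sparsity with algorithm='omp': for each target value k, it counts the nonzero coefficients per sample (per row of the returned code). The data and the values of k are illustrative.

```python
import numpy as np
from sklearn.decomposition import sparse_encode

rng = np.random.RandomState(42)
X = rng.randn(5, 20)
dictionary = rng.randn(12, 20)
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)

nnz = {}
for k in (1, 3, 5):
    code = sparse_encode(X, dictionary, algorithm="omp", n_nonzero_coefs=k)
    # Number of nonzero coefficients used for each sample.
    nnz[k] = np.count_nonzero(code, axis=1)
    print(k, nnz[k])
```

Each sample should use at most k atoms of the dictionary.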

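alpha : float, default=None

If algorithm='lasso_lars' or algorithm='lasso_cd' , alpha is the penalty applied to the L1 norm. If algorithm='threshold' , alpha is the absolute value of the threshold below which coefficients will be squashed to zero. If algorithm='omp' , alpha is the tolerance parameter: the value of the reconstruction error targeted. In this case, it overrides n_nonzero_coefs . If None , defaults to 1.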
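The effect of the L1 penalty can be sketched by encoding the same data with a few alpha values under the default algorithm='lasso_lars'. The specific values are arbitrary; the largest one is chosen to be far above any correlation between the data and the unit-norm atoms, so the penalty should dominate entirely.

```python
import numpy as np
from sklearn.decomposition import sparse_encode

rng = np.random.RandomState(0)
X = rng.randn(5, 20)
dictionary = rng.randn(12, 20)
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)

nnz = {}
for alpha in (0.01, 0.5, 100.0):
    code = sparse_encode(X, dictionary, alpha=alpha)
    # Total number of nonzero coefficients across all samples.
    nnz[alpha] = np.count_nonzero(code)
    print(alpha, nnz[alpha])
```

A small alpha tends to keep many coefficients active, while a sufficiently large alpha drives every coefficient to zero.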

copy_cov : bool, default=True

Whether to copy the precomputed covariance matrix; if False , it may be overwritten.

init : ndarray of shape (n_samples, n_components), default=None

Initialization value of the sparse codes. Only used if algorithm='lasso_cd' .

max_iter : int, default=1000

Maximum number of iterations to perform if algorithm='lasso_cd' or 'lasso_lars' .

n_jobs : int, default=None

Number of parallel jobs to run. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. See the Glossary for more details.

check_input : bool, default=True

If False , the input arrays X and dictionary will not be checked.

verbose : int, default=0

Controls the verbosity; the higher, the more messages.

positive : bool, default=False

Whether to enforce positivity when finding the encoding.

Added in version 0.20.
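A brief sketch of the positive option, using algorithm='lasso_cd' on random data (both are illustrative choices): the same samples are encoded with and without the constraint, and the smallest coefficient of each result is printed.

```python
import numpy as np
from sklearn.decomposition import sparse_encode

rng = np.random.RandomState(0)
X = rng.randn(5, 20)
dictionary = rng.randn(12, 20)
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)

code_free = sparse_encode(X, dictionary, algorithm="lasso_cd", alpha=0.1)
code_pos = sparse_encode(
    X, dictionary, algorithm="lasso_cd", alpha=0.1, positive=True
)

print(code_free.min())  # may be negative
print(code_pos.min())   # should not be negative
```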

Returns:

code : ndarray of shape (n_samples, n_components)

The sparse codes.