`scikit-learn`-compatible machine learning methods

Classes

`skmodel`
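`PortfolioSelection`	Construct a sparse portfolio using `skscope` with the `MinVar` or `MeanVar` measure.
`NonlinearSelection`	Select relevant features which may have a nonlinear dependence on the target.
`RobustRegression`	A robust regression procedure via sparsity-constrained exponential loss minimization.
`MultivariateFailure`	Multivariate failure time model.
`IsotonicRegression`	Isotonic regression.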

class skscope.skmodel.IsotonicRegression(sparsity=5)

Isotonic regression.

Parameters: sparsity (int, default=5) – The number of features to be selected, i.e., the sparsity level.

fit(X, y, sample_weight=None)

Fit the model using X, y as training data.

Parameters:

X (array-like of shape (n_samples,) or (n_samples, 1)) – Training data.
y (array-like of shape (n_samples,)) – Training target.
sample_weight (array-like of shape (n_samples,), default=None) – Weights. If set to None, all weights are set to 1 (equal weights).

Returns:

self – Returns an instance of self.

Return type:

object

predict(X)

Predict new data by linear interpolation.

Parameters: X (array-like of shape (n_samples,) or (n_samples, 1)) – Data to transform.
Returns: y_pred – The transformed data.
Return type: ndarray of shape (n_samples,)
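
The interpolation step can be illustrated with a small pure-Python sketch. It is not the library's implementation: `x_knots` and `y_knots` are hypothetical names for sorted training points and their fitted values, and clamping to the end values outside the training range is an assumption.

```python
from bisect import bisect_right

def interpolate(x_new, x_knots, y_knots):
    """Piecewise-linear interpolation through (x_knots, y_knots).

    x_knots must be sorted in strictly increasing order.
    Points outside the knot range are clamped to the end values (assumption).
    """
    out = []
    for x in x_new:
        if x <= x_knots[0]:
            out.append(y_knots[0])
        elif x >= x_knots[-1]:
            out.append(y_knots[-1])
        else:
            j = bisect_right(x_knots, x)  # x_knots[j - 1] <= x < x_knots[j]
            x0, x1 = x_knots[j - 1], x_knots[j]
            y0, y1 = y_knots[j - 1], y_knots[j]
            out.append(y0 + (y1 - y0) * (x - x0) / (x1 - x0))
    return out

# Hypothetical fitted values: non-decreasing, as isotonic regression requires.
x_knots = [0.0, 1.0, 3.0]
y_knots = [0.0, 2.0, 2.0]
y_pred = interpolate([-1.0, 0.5, 2.0, 4.0], x_knots, y_knots)
```

Because the fitted values are non-decreasing, the interpolated predictions are non-decreasing in the input as well.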

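score(X, y, sample_weight=None)

Return the coefficient of determination of the prediction.

The coefficient of determination \(R^2\) is defined as \((1 - \frac{u}{v})\), where \(u\) is the residual sum of squares ((y_true - y_pred) ** 2).sum() and \(v\) is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0, and the score can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an \(R^2\) score of 0.0.

Parameters:

X (array-like of shape (n_samples, n_features)) – Test samples. For some estimators this may instead be a precomputed kernel matrix or a list of generic objects, with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used to fit the estimator.
y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True values for X.
sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.

Returns:

score – \(R^2\) of self.predict(X) with respect to y.

Return type:

float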
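
The definition of \(R^2\) above can be checked by hand. The following sketch computes the unweighted score in pure Python; the helper name `r2_score` is chosen for illustration and sample weights are left out for brevity.

```python
def r2_score(y_true, y_pred):
    """Unweighted coefficient of determination, R^2 = 1 - u / v."""
    mean_y = sum(y_true) / len(y_true)
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    v = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    return 1 - u / v

y_true = [1.0, 2.0, 3.0, 4.0]
perfect = r2_score(y_true, y_true)                 # exact predictions
constant = r2_score(y_true, [2.5] * 4)             # always predicts the mean of y
worse = r2_score(y_true, [4.0, 3.0, 2.0, 1.0])     # reversed predictions
```

The three cases give 1.0, 0.0 and -3.0 respectively, matching the statements that the best score is 1.0, a constant mean predictor scores 0.0, and the score can be negative.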

transform(X)

Transform new data by linear interpolation.

Parameters: X (array-like of shape (n_samples,) or (n_samples, 1)) – Data to transform.
Returns: y_pred – The transformed data.
Return type: ndarray of shape (n_samples,)

class skscope.skmodel.MultivariateFailure(sparsity=5)

Multivariate failure time model.

Parameters: sparsity (int, default=5) – The number of features to be selected, i.e., the sparsity level.

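fit(X, y, delta, sample_weight=None)

Minimize the negative partial log-likelihood of the provided data under a sparsity constraint.

Parameters:

X (array-like, shape = (n_samples, n_features)) – Data matrix.
y (array-like, shape = (n_samples, n_events)) – Observed times of the multiple events.
delta (array-like, shape = (n_samples, n_events)) – Indicator matrix of censoring.
sample_weight (ignored) – Not used, present here for API consistency by convention.

Returns:

self – Fitted estimator.

Return type:

object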
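
To make the objective concrete, here is a rough sketch of a negative partial log-likelihood for a marginal proportional hazards model with a shared coefficient vector. It is written under several assumptions that the text above does not confirm: `delta[i][k] == 1` means event k was observed (not censored) for sample i, ties are handled in the simplest way, no normalization by sample size is applied, and the function name is hypothetical. The library's actual objective may differ in these details.

```python
import math

def neg_partial_loglik(X, y, delta, beta):
    """Negative partial log-likelihood summed over event types (illustrative).

    Assumes delta[i][k] == 1 marks an observed event and 0 marks censoring.
    """
    n, n_events = len(X), len(y[0])
    # Linear predictor beta^T x_i for every sample.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    total = 0.0
    for k in range(n_events):
        for i in range(n):
            if delta[i][k] == 1:
                # Risk set: samples still under observation at time y[i][k].
                risk_sum = sum(math.exp(eta[j]) for j in range(n) if y[j][k] >= y[i][k])
                total += eta[i] - math.log(risk_sum)
    return -total

X = [[1.0], [0.0], [-1.0]]
y = [[1.0, 2.0], [2.0, 3.0], [3.0, 1.0]]
delta = [[1, 1], [1, 0], [1, 1]]
baseline = neg_partial_loglik(X, y, delta, beta=[0.0])
```

With `beta = [0.0]` every sample has the same relative hazard, so each observed event contributes the logarithm of its risk-set size; for this toy data the total is log 36.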

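predict(X)

Given the features, predict the hazard function up to a constant that does not depend on the sample.

Parameters: X (array-like, shape (n_samples, n_features)) – Feature matrix.
Returns: hazard – The quantity \(e^{\beta^{\top}X_i}\), which is proportional to the hazard function up to a constant independent of the sample index \(i\), such that \(\lambda_k(t;X_{i})=\lambda_{0k}(t)e^{\beta^{\top}X_i}\).
Return type: array, shape = (n_samples,)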
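
The returned quantity \(e^{\beta^{\top}X_i}\) is straightforward to compute once a coefficient vector is available. The sketch below uses a made-up sparse `beta` and a hypothetical helper name; it only illustrates the formula and does not call the library.

```python
import math

def relative_hazard(X, beta):
    """Return exp(beta^T x_i) for each row x_i of X."""
    return [math.exp(sum(b * x for b, x in zip(beta, row))) for row in X]

# Made-up sparse coefficients: the second feature is not selected.
beta = [0.5, 0.0, -1.0]
X = [
    [0.0, 0.0, 0.0],  # reference sample, relative hazard 1
    [2.0, 5.0, 0.0],  # first feature raises the hazard
    [0.0, 1.0, 1.0],  # third feature lowers the hazard
]
risk = relative_hazard(X, beta)
```

Only ratios between these values are meaningful, since the baseline hazard \(\lambda_{0k}(t)\) is not part of the prediction.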

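score(X, y, delta, sample_weight=None)

Given test data, return the test score of this fitted model.

Parameters:

X (array-like, shape = (n_samples, n_features)) – Data matrix.
y (array-like, shape = (n_samples, n_events)) – Observed times of the multiple events.
delta (array-like, shape = (n_samples, n_events)) – Indicator matrix of censoring.
sample_weight (ignored) – Not used, present here for API consistency by convention.

Returns:

score – The log-likelihood of the given data.

Return type:

float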

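class skscope.skmodel.NonlinearSelection(sparsity=1, gamma_x=0.7, gamma_y=0.7)

Select relevant features which may have a nonlinear dependence on the target.

Parameters:

sparsity (int, default=1) – The number of features to be selected, i.e., the sparsity level.
gamma_x (float, default=0.7) – The width parameter of the Gaussian kernel for X.
gamma_y (float, default=0.7) – The width parameter of the Gaussian kernel for y.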

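fit(X, y, sample_weight=None)

The fit function computes the coefficient vector coef_; features with larger coefficients are considered to have a stronger dependence on the target.

Parameters:

X (array-like of shape (n_samples, n_features)) – Feature matrix.
y (array-like of shape (n_samples,)) – Target values.
sample_weight (ignored) – Not used, present here for API consistency by convention.

Returns:

self – Fitted estimator.

Return type:

object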

score(X, y, sample_weight=None)

Given test data, return the test score of this fitted model.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix.
y (array-like, shape (n_samples,)) – Target values.
sample_weight (ignored) – Not used, present here for API consistency by convention.

Returns:

score – The negative loss on the given data.

Return type:

float

class skscope.skmodel.PortfolioSelection(sparsity=1, obj='MinVar', alpha=0, cov_matrix='lw', random_state=None)

Construct a sparse portfolio using skscope with the MinVar or MeanVar measure.

Parameters:

sparsity (int, default=1) – The number of stocks to be selected, i.e., the sparsity level.
obj ({"MinVar", "MeanVar"}, default="MinVar") – The objective of the portfolio optimization.
alpha (float, default=0) – The penalty coefficient of the return.
cov_matrix ({"empirical", "lw"}, default="lw") – The estimator of the covariance matrix. If "empirical", the empirical estimator is used. If "lw", the LedoitWolf estimator is used.
random_state ({None, int, array_like[ints], SeedSequence, BitGenerator, Generator}, default=None) – The seed used to initialize the init_params parameter of ScopeSolver.

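fit(X, y=None, sample_weight=None)

The fit function computes the weights of the desired sparse portfolio for the chosen objective.

Parameters:

X (array-like of shape (n_periods, n_assets)) – Return data of n_assets assets spanning n_periods periods.
y (ignored) – Not used, present here for API consistency by convention.
sample_weight (ignored) – Not used, present here for API consistency by convention.

Returns:

self – Fitted estimator.

Return type:

object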
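
For orientation, the two objectives can be sketched as functions of a weight vector. This is an assumption about their textbook form, not a statement of what the library evaluates: MinVar is taken to be the portfolio variance \(w^{\top}\Sigma w\), and MeanVar is taken to be that variance minus `alpha` times the expected portfolio return. The helper name and the toy numbers are hypothetical.

```python
def portfolio_objective(weights, mean_returns, cov, obj="MinVar", alpha=0.0):
    """Textbook MinVar / MeanVar objective for a given weight vector (assumed form)."""
    n = len(weights)
    # Portfolio variance w^T Sigma w.
    variance = sum(weights[i] * cov[i][j] * weights[j] for i in range(n) for j in range(n))
    if obj == "MinVar":
        return variance
    if obj == "MeanVar":
        expected_return = sum(w * m for w, m in zip(weights, mean_returns))
        return variance - alpha * expected_return
    raise ValueError("obj must be 'MinVar' or 'MeanVar'")

# Toy inputs: two uncorrelated assets, equal weights.
cov = [[0.04, 0.0], [0.0, 0.01]]
mean_returns = [0.10, 0.05]
weights = [0.5, 0.5]
min_var = portfolio_objective(weights, mean_returns, cov, obj="MinVar")
mean_var = portfolio_objective(weights, mean_returns, cov, obj="MeanVar", alpha=1.0)
```

Under this form, `alpha=0` makes the two objectives coincide, and a larger `alpha` shifts the trade-off toward expected return.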

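score(X, y=None, sample_weight=None, measure='Sharpe')

Given data, return the Sharpe ratio of the portfolio constructed with weights self.coef_.

Parameters:

X (array-like of shape (n_periods, n_assets)) – Return data of n_assets assets spanning n_periods periods.
y (ignored) – Not used, present here for API consistency by convention.
sample_weight (ignored) – Not used, present here for API consistency by convention.
measure ({"Sharpe"}, default="Sharpe") – The measure of portfolio performance.

Returns:

score – The Sharpe ratio of the constructed portfolio.

Return type:

float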
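
A Sharpe ratio for a fixed weight vector can be computed as sketched below. The sketch assumes a zero risk-free rate and the population standard deviation, with no annualization; whether the library makes the same choices is not stated above. `weights` stands in for the fitted `coef_`.

```python
from statistics import mean, pstdev

def sharpe_ratio(X, weights):
    """Mean over standard deviation of per-period portfolio returns.

    Assumes a zero risk-free rate and the population standard deviation.
    """
    portfolio_returns = [sum(w * r for w, r in zip(weights, row)) for row in X]
    return mean(portfolio_returns) / pstdev(portfolio_returns)

# Toy return data: 4 periods, 2 assets.
X = [
    [0.01, 0.03],
    [0.03, 0.05],
    [0.00, 0.04],
    [0.05, 0.03],
]
weights = [0.5, 0.5]  # stand-in for a fitted coef_
score = sharpe_ratio(X, weights)
```

Here the per-period portfolio returns are 0.02, 0.04, 0.02 and 0.04, so the mean is 0.03, the standard deviation is 0.01, and the ratio is 3 up to floating-point rounding.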

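class skscope.skmodel.RobustRegression(sparsity=1, gamma=1)

A robust regression procedure via sparsity-constrained exponential loss minimization. Specifically, RobustRegression solves the following problem: \(\min_{\beta}-\sum_{i=1}^n\exp\{-(y_i-x_i^{\top}\beta)^2/\gamma\} \text{ s.t. } \|\beta\|_0 \leq s\), where \(\gamma\) is a hyperparameter controlling the degree of robustness and \(s\) is a hyperparameter controlling the sparsity level of \(\beta\).

Note: When \(\gamma\) is large, the exponential loss is approximately equal to \(|y_i-x_i^{\top}\beta|^2/\gamma\) up to an additive constant, so the estimator is similar to the least squares estimator. When \(\gamma\) is small, a sample \(i\) with a large error \(|y_i-x_i^{\top}\beta|\) has little influence on the estimate of \(\beta\), which limits the effect of outliers (i.e., robustness increases at the cost of efficiency). Therefore, \(\gamma\) should be chosen carefully, based on prior knowledge of the data or by a data-driven method such as cross-validation, to strike a suitable trade-off between the robustness and the efficiency of the estimator.

Parameters:

sparsity (int, default=1) – The number of features to be selected, i.e., the sparsity level.
gamma (float, default=1) – The parameter controlling the degree of robustness of the estimator.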
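
The role of \(\gamma\) can be seen numerically. The sketch below evaluates the objective \(-\sum_i \exp\{-r_i^2/\gamma\}\) on residuals \(r_i = y_i - x_i^{\top}\beta\) directly; the function name is chosen for illustration and the sparsity constraint is left out.

```python
import math

def exponential_loss(residuals, gamma):
    """Objective value -sum_i exp(-r_i^2 / gamma) for the given residuals."""
    return -sum(math.exp(-(r ** 2) / gamma) for r in residuals)

# Large gamma: each term is close to r^2 / gamma - 1 (first-order Taylor expansion),
# i.e. a scaled squared error shifted by a constant.
r, big_gamma = 2.0, 1000.0
exact_term = exponential_loss([r], big_gamma)
quadratic_term = r ** 2 / big_gamma - 1.0

# Small gamma: a gross outlier adds almost nothing beyond a moderate one,
# while the squared error between the two keeps growing.
small_gamma = 1.0
moderate = exponential_loss([10.0], small_gamma)
outlier = exponential_loss([100.0], small_gamma)
squared_gap = 100.0 ** 2 - 10.0 ** 2
```

Each term of the loss lies in [-1, 0), so no single sample can dominate the objective, which is the source of the robustness described in the note.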

fit(X, y=None, sample_weight=None)

The fit function computes the coefficient vector coef_.

Parameters:

X (array-like of shape (n_samples, n_features)) – Feature matrix.
y (array-like of shape (n_samples,)) – Target values.
sample_weight (ignored) – Not used, present here for API consistency by convention.

Returns:

self – Fitted estimator.

Return type:

object

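score(X, y, sample_weight=None)

Given test data, return the test score of this fitted model.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix.
y (array-like, shape (n_samples,)) – Target values.
sample_weight (ignored) – Not used, present here for API consistency by convention.

Returns:

score – The weighted exponential loss of the given data.

Return type:

float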
