概率预测

量化不确定性

概率预测是量化目标变量未来不确定性的自然答案。这个任务需要建模以下条件预测分布：

\[\mathbb{P}(\mathbf{y}_{t+1:t+H} \;|\; \mathbf{y}_{:t})\]

我们将向你展示如何通过结合经典的长短期记忆网络 (LSTM) 和神经层次插值 (NHITS) 以及多分位损失函数 (MQLoss) 来解决这个任务。

\[ \mathrm{MQLoss}(y_{\tau}, [\hat{y}^{(q1)}_{\tau},\hat{y}^{(q2)}_{\tau},\dots,\hat{y}^{(Q)}_{\tau}]) = \frac{1}{H} \sum_{q} \mathrm{QL}(y_{\tau}, \hat{y}^{(q)}_{\tau}) \]

在这个笔记本中我们将：
1. 安装 NeuralForecast 库
2. 探索 M4-每小时数据。
3. 训练 LSTM 和 NHITS
4. 可视化 LSTM/NHITS 预测区间。

您可以使用Google Colab通过GPU运行这些实验。

1. 安装 NeuralForecast

%%capture
!pip install neuralforecast

有用的函数

下面定义的 plot_grid 辅助函数将在绘制不同时间序列和不同模型的预测时非常有用。

import random
import warnings
warnings.filterwarnings("ignore")
from itertools import product
import matplotlib.pyplot as plt

def plot_grid(df_train, df_test=None, plot_random=True, model=None, level=None):
    fig, axes = plt.subplots(4, 2, figsize = (24, 14))

    unique_ids = df_train['unique_id'].unique()

    assert len(unique_ids) >= 8, "Must provide at least 8 ts"
    
    if plot_random:
        unique_ids = random.sample(list(unique_ids), k=8)
    else:
        unique_uids = unique_ids[:8]

    for uid, (idx, idy) in zip(unique_ids, product(range(4), range(2))):
        train_uid = df_train.query('unique_id == @uid')
        axes[idx, idy].plot(train_uid['ds'], train_uid['y'], label = 'y_train')
        if df_test is not None:
            max_ds = train_uid['ds'].max()
            test_uid = df_test.query('unique_id == @uid')
            for col in ['y', f'{model}-median', 'y_test']:
                if col in test_uid:
                    axes[idx, idy].plot(test_uid['ds'], test_uid[col], label=col)
            if level is not None:
                for l, alpha in zip(sorted(level), [0.5, .4, .35, .2]):
                    axes[idx, idy].fill_between(
                        test_uid['ds'], 
                        test_uid[f'{model}-lo-{l}'], 
                        test_uid[f'{model}-hi-{l}'],
                        alpha=alpha,
                        color='orange',
                        label=f'{model}_level_{l}',
                    )
        axes[idx, idy].set_title(f'M4 Hourly: {uid}')
        axes[idx, idy].set_xlabel('Timestamp [t]')
        axes[idx, idy].set_ylabel('Target')
        axes[idx, idy].legend(loc='upper left')
        axes[idx, idy].xaxis.set_major_locator(plt.MaxNLocator(20))
        axes[idx, idy].grid()
    fig.subplots_adjust(hspace=0.5)
    plt.show()

2. 加载 M4 数据

出于测试目的，我们将使用M4竞争中的每小时数据集。

%%capture
!wget https://auto-arima-results.s3.amazonaws.com/M4-Hourly.csv
!wget https://auto-arima-results.s3.amazonaws.com/M4-Hourly-test.csv

import pandas as pd

Y_train_df = pd.read_csv('M4-Hourly.csv')
Y_test_df = pd.read_csv('M4-Hourly-test.csv').rename(columns={'y': 'y_test'})

在这个例子中，我们将使用数据的一个子集以避免等待太久。如果需要，您可以修改序列的数量。

n_series = 8
uids = Y_train_df['unique_id'].unique()[:n_series]
Y_train_df = Y_train_df.query('unique_id in @uids')
Y_test_df = Y_test_df.query('unique_id in @uids')

plot_grid(Y_train_df, Y_test_df)

3. 模型训练

core.NeuralForecast 提供了一个高层次的接口，方便我们使用一系列PyTorch模型。 NeuralForecast 通过一个模型列表 models=[LSTM(...), NHITS(...)] 来实例化，这些模型经过配置用于预测任务。

horizon 参数控制预测的前进步数，在此示例中为48小时（2天）。
MQLoss 结合 levels=[80,90] 专门化网络的输出，形成80%和90%的预测区间。
max_steps=2000 控制网络训练的持续时间。

有关网络实例化的更多详细信息，请查看他们的文档。

from neuralforecast import NeuralForecast
from neuralforecast.losses.pytorch import MQLoss
from neuralforecast.models import LSTM, NHITS

horizon = 48
levels = [80, 90]
models = [LSTM(input_size=3*horizon, h=horizon,
               loss=MQLoss(level=levels), max_steps=1000),
          NHITS(input_size=7*horizon, h=horizon,
                n_freq_downsample=[24, 12, 1],
                loss=MQLoss(level=levels), max_steps=2000),]
nf = NeuralForecast(models=models, freq='H')

Global seed set to 1
Global seed set to 1

库中的所有模型都是全局的，这意味着在共享优化过程中，会使用Y_train_df中的所有时间序列来训练一个具有共享参数的单一模型。这是深度学习模型预测文献中最常见的做法，被称为“交叉学习”。

%%capture
nf.fit(df=Y_train_df)

Y_hat_df = nf.predict()
Y_hat_df = Y_hat_df.reset_index()
Y_hat_df.head()

Predicting DataLoader 0: 100%|██████████| 1/1 [00:00<00:00,  2.82it/s]
Predicting DataLoader 0: 100%|██████████| 1/1 [00:00<00:00, 136.22it/s]

	unique_id	ds	LSTM-median	LSTM-lo-90	LSTM-lo-80	LSTM-hi-80	LSTM-hi-90	NHITS-median	NHITS-lo-90	NHITS-lo-80	NHITS-hi-80	NHITS-hi-90
0	H1	701	603.491211	526.534119	544.686646	650.893799	673.805603	611.634888	575.999146	582.778687	677.277039	674.705872
1	H1	702	548.415710	438.868591	472.805237	608.017822	639.063293	569.997803	513.014282	518.707153	598.849609	616.793457
2	H1	703	502.010681	382.608643	411.710419	570.315308	608.669250	510.787628	454.184448	465.425232	538.964172	554.563354
3	H1	704	460.870483	339.368988	370.636719	544.232666	579.824402	478.482330	429.657104	452.395508	500.892090	502.507141
4	H1	705	436.451843	313.868744	343.514191	520.812988	559.734741	463.763611	432.906342	427.853577	486.854492	487.539062

Y_test_df = Y_test_df.merge(Y_hat_df, how='left', on=['unique_id', 'ds'])

4. 绘制预测结果

在这里，我们通过绘制预测区间来完成我们的分析，并验证 LSTM 和 NHITS 都给出了优秀的结果。

考虑输出 [NHITS-lo-90.0、NHITS-hi-90.0]，这代表 NHITS 网络的80%预测区间；其下限给出第5百分位数（或0.05分位数），而其上限给出第95百分位数（或0.95分位数）。对于训练良好的模型，我们期望目标值在90%的时间内位于该区间内。

长短期记忆网络 (LSTM)

plot_grid(Y_train_df, Y_test_df, level=levels, model='LSTM')

NHITS

plot_grid(Y_train_df, Y_test_df, level=levels, model='NHITS')

参考文献

Give us a ⭐ on Github