Multiple series forecasting

!pip install -Uqq nixtla

from nixtla.utils import in_colab

IN_COLAB = in_colab()

if not IN_COLAB:
    from nixtla.utils import colab_badge
    from dotenv import load_dotenv

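TimeGPT provides a robust solution for multi-series forecasting, which means analyzing many data series at once rather than a single one. The model can be fine-tuned on a broad collection of series, so you can tailor it to your specific needs or tasks.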

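Note that the forecasts are still univariate. Although TimeGPT is a global model, it does not consider inter-feature relationships within the target series. TimeGPT does, however, support exogenous variables such as categorical variables (e.g., category, brand), numerical variables (e.g., temperature, prices), or even special holidays.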

Let's see this in action.

if not IN_COLAB:
    load_dotenv()    
    colab_badge('docs/tutorials/05_multiple_series')

1. Import packages

First, we install and import the required packages and initialize the Nixtla client.

As always, we start by creating a NixtlaClient instance.

import pandas as pd
from nixtla import NixtlaClient

nixtla_client = NixtlaClient(
    # defaults to os.environ.get("NIXTLA_API_KEY")
    api_key = 'my_api_key_provided_by_nixtla'
)
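The comment in the cell above notes that api_key falls back to the NIXTLA_API_KEY environment variable. The sketch below mimics that fallback in a hypothetical helper, resolve_api_key, which is not part of the nixtla package and is not the client's actual implementation; it only illustrates the lookup order.

```python
import os


def resolve_api_key(explicit_key=None, env_var="NIXTLA_API_KEY"):
    """Hypothetical helper: prefer an explicit key, else read the environment."""
    key = explicit_key or os.environ.get(env_var)
    if not key:
        raise ValueError(f"No API key given and {env_var} is not set")
    return key


# Illustrative placeholder value, set here only so the example runs.
os.environ["NIXTLA_API_KEY"] = "key-from-environment"

from_env = resolve_api_key()
from_argument = resolve_api_key("key-passed-explicitly")
```

Keeping the key in the environment (or in a .env file loaded with load_dotenv) avoids hard-coding it in the notebook.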

👍 Use an Azure AI endpoint

To use an Azure AI endpoint, remember to also set the base_url argument:

nixtla_client = NixtlaClient(base_url="your Azure AI endpoint", api_key="your api_key")

if not IN_COLAB:
    nixtla_client = NixtlaClient()

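2. Load the data

The following dataset contains prices from different electricity markets in Europe.

In TimeGPT, multiple series are detected automatically through the unique_id column. This column holds a label for each series. If it contains more than one unique value, the data is treated as a multi-series scenario.

In this particular case, the unique_id column contains the values BE, DE, FR, JPM, and NP.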

df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/electricity-short.csv')
df.head()

	unique_id	ds	y
0	BE	2016-12-01 00:00:00	72.00
1	BE	2016-12-01 01:00:00	65.80
2	BE	2016-12-01 02:00:00	59.99
3	BE	2016-12-01 03:00:00	50.69
4	BE	2016-12-01 04:00:00	52.58
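To make the unique_id convention concrete, here is a minimal sketch on a small synthetic frame (the two labels and all values are made up for illustration and are not taken from the dataset). It counts the series and the observations per series with plain pandas, a quick sanity check worth running before forecasting.

```python
import pandas as pd

# Synthetic long-format data: one row per (series, timestamp) pair.
# Labels "A" and "B" and the y values are illustrative assumptions.
timestamps = pd.date_range("2016-12-01", periods=3, freq=pd.Timedelta(hours=1))
toy_df = pd.DataFrame({
    "unique_id": ["A"] * 3 + ["B"] * 3,
    "ds": list(timestamps) * 2,
    "y": [72.0, 65.8, 59.99, 30.1, 29.5, 28.7],
})

# More than one unique label means the frame holds multiple series.
n_series = toy_df["unique_id"].nunique()
rows_per_series = toy_df.groupby("unique_id").size().to_dict()

print(n_series)
print(rows_per_series)
```

The same two lines can be applied to the real df to confirm how many series it holds and whether they have equal length.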

Let's plot these series using NixtlaClient:

nixtla_client.plot(df)

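3. Forecasting multiple series

To forecast all series at once, we simply pass the dataframe to the df argument. TimeGPT will automatically forecast every series.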

timegpt_fcst_multiseries_df = nixtla_client.forecast(df=df, h=24, level=[80, 90])
timegpt_fcst_multiseries_df.head()

INFO:nixtlats.nixtla_client:Validating inputs...
INFO:nixtlats.nixtla_client:Preprocessing dataframes...
INFO:nixtlats.nixtla_client:Inferred freq: H
INFO:nixtlats.nixtla_client:Restricting input...
INFO:nixtlats.nixtla_client:Calling Forecast Endpoint...

	unique_id	ds	TimeGPT	TimeGPT-lo-90	TimeGPT-lo-80	TimeGPT-hi-80	TimeGPT-hi-90
0	BE	2016-12-31 00:00:00	46.151176	36.660478	38.337019	53.965334	55.641875
1	BE	2016-12-31 01:00:00	42.426598	31.602231	33.976724	50.876471	53.250964
2	BE	2016-12-31 02:00:00	40.242889	30.439970	33.634985	46.850794	50.045809
3	BE	2016-12-31 03:00:00	38.265339	26.841481	31.022093	45.508585	49.689197
4	BE	2016-12-31 04:00:00	36.618801	18.541384	27.981346	45.256256	54.696218
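One way to sanity-check a multi-series forecast is to confirm that every series received exactly h rows. The helper below, check_horizon, is a hypothetical convenience (it is not part of the nixtla package) and is exercised on a synthetic frame shaped like the output above, so no API call is needed.

```python
import pandas as pd


def check_horizon(fcst_df: pd.DataFrame, h: int, id_col: str = "unique_id") -> bool:
    """Return True if every series in fcst_df has exactly h forecast rows."""
    counts = fcst_df.groupby(id_col).size()
    return bool((counts == h).all())


# Synthetic stand-in for a forecast frame: 2 series, 24 hourly steps each.
h = 24
future = pd.date_range("2016-12-31", periods=h, freq=pd.Timedelta(hours=1))
toy_fcst = pd.concat(
    [pd.DataFrame({"unique_id": uid, "ds": future, "TimeGPT": 0.0}) for uid in ["BE", "DE"]],
    ignore_index=True,
)

ok = check_horizon(toy_fcst, h)
# Dropping one row leaves the last series one step short of the horizon.
truncated_ok = check_horizon(toy_fcst.iloc[:-1], h)
```

On the real result, check_horizon(timegpt_fcst_multiseries_df, 24) would be the equivalent call.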

📘 Available models in Azure AI

If you are using an Azure AI endpoint, be sure to set model="azureai":

nixtla_client.forecast(..., model="azureai")

For the public API, we support two models: timegpt-1 and timegpt-1-long-horizon.

By default, timegpt-1 is used. Please see this tutorial on how and when to use timegpt-1-long-horizon.

nixtla_client.plot(df, timegpt_fcst_multiseries_df, max_insample_length=365, level=[80, 90])

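From the plot above, we can see that the model produced forecasts for each unique series in the dataset.

Historical forecast

You can also compute prediction intervals for historical forecasts by adding add_history=True.

To specify the interval levels, we use the level argument. Here, we pass the list [80, 90], which computes the 80% and 90% prediction intervals.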

timegpt_fcst_multiseries_with_history_df = nixtla_client.forecast(df=df, h=24, level=[80, 90], add_history=True)
timegpt_fcst_multiseries_with_history_df.head()

INFO:nixtlats.nixtla_client:Validating inputs...
INFO:nixtlats.nixtla_client:Preprocessing dataframes...
INFO:nixtlats.nixtla_client:Inferred freq: H
INFO:nixtlats.nixtla_client:Calling Forecast Endpoint...
INFO:nixtlats.nixtla_client:Calling Historical Forecast Endpoint...

	unique_id	ds	TimeGPT	TimeGPT-lo-80	TimeGPT-lo-90	TimeGPT-hi-80	TimeGPT-hi-90
0	BE	2016-12-06 00:00:00	55.756332	42.066476	38.185593	69.446188	73.327072
1	BE	2016-12-06 01:00:00	52.820206	39.130350	35.249466	66.510062	70.390946
2	BE	2016-12-06 02:00:00	46.851070	33.161214	29.280331	60.540926	64.421810
3	BE	2016-12-06 03:00:00	50.640892	36.951036	33.070152	64.330748	68.211632
4	BE	2016-12-06 04:00:00	52.420410	38.730554	34.849670	66.110266	69.991150
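The level argument determines which interval columns appear in the result. Judging from the output above, each level L yields a pair named TimeGPT-lo-L and TimeGPT-hi-L. The sketch below builds those names with a hypothetical helper, interval_columns (not part of the nixtla package), and uses the first row of the output above, copied by hand, to check that the 90% interval contains the 80% one.

```python
import pandas as pd


def interval_columns(levels, model="TimeGPT"):
    """Map each level to its (lower, upper) column names, e.g. 'TimeGPT-lo-80'."""
    return {lv: (f"{model}-lo-{lv}", f"{model}-hi-{lv}") for lv in levels}


# First row of the historical forecast output above, copied by hand.
row = pd.Series({
    "TimeGPT": 55.756332,
    "TimeGPT-lo-80": 42.066476,
    "TimeGPT-lo-90": 38.185593,
    "TimeGPT-hi-80": 69.446188,
    "TimeGPT-hi-90": 73.327072,
})

cols = interval_columns([80, 90])
widths = {lv: row[hi] - row[lo] for lv, (lo, hi) in cols.items()}

# A higher level should give a wider interval that contains the narrower one.
nested = bool(
    row["TimeGPT-lo-90"] <= row["TimeGPT-lo-80"]
    <= row["TimeGPT"]
    <= row["TimeGPT-hi-80"] <= row["TimeGPT-hi-90"]
)
```

The same nesting check can be applied column-wise to the whole frame to catch malformed intervals.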

📘 Available models in Azure AI

If you are using an Azure AI endpoint, be sure to set model="azureai":

nixtla_client.forecast(..., model="azureai")

For the public API, we support two models: timegpt-1 and timegpt-1-long-horizon.

By default, timegpt-1 is used. Please see this tutorial on how and when to use timegpt-1-long-horizon.

nixtla_client.plot(
    df, 
    timegpt_fcst_multiseries_with_history_df.groupby('unique_id').tail(365 + 24), 
    max_insample_length=365, 
    level=[80, 90],
)

In the plot above, we can now see the historical forecasts TimeGPT made for each series, along with the 80% and 90% prediction intervals.
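The plotting call above trims the frame with groupby('unique_id').tail(365 + 24), which keeps the last 389 rows of each series. That is presumably the 24 forecast steps plus the 365 most recent historical steps, matching max_insample_length=365. A minimal sketch of that pandas idiom on synthetic data, using a smaller window of 4 rows:

```python
import pandas as pd

# Synthetic frame: 2 series with 10 rows each (values are placeholders).
toy = pd.DataFrame({
    "unique_id": ["BE"] * 10 + ["DE"] * 10,
    "step": list(range(10)) * 2,
})

# Keep only the last 4 rows of every series; the original row order is preserved.
last_rows = toy.groupby("unique_id").tail(4)

kept_per_series = last_rows.groupby("unique_id").size().to_dict()
first_kept_step = last_rows.groupby("unique_id")["step"].min().to_dict()
```

Because tail works per group, series of different lengths are each trimmed to their own most recent rows.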
