Prediction intervals

!pip install -Uqq nixtla

from nixtla.utils import in_colab

IN_COLAB = in_colab()

if not IN_COLAB:
    from nixtla.utils import colab_badge
    from itertools import product
    from fastcore.test import test_eq, test_fail, test_warns
    from dotenv import load_dotenv

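In forecasting, we are often interested in the distribution of predictions rather than only a point forecast, because we want a sense of the uncertainty around the forecast.

To this end, we can create _prediction intervals_.

Prediction intervals have an intuitive interpretation, as they present a specific range of the forecast distribution. For instance, a 95% prediction interval means that we expect the future value to fall within the estimated range about 95 times out of 100. A wider interval therefore indicates greater uncertainty about the forecast, while a narrower interval suggests higher confidence.

With TimeGPT, we can create a distribution of forecasts and extract the prediction intervals for a required level.

TimeGPT uses conformal prediction to produce the prediction intervals.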
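To give a feel for what conformal prediction does, here is a minimal sketch of the generic split conformal recipe: take the absolute errors from a held-out calibration window, pick a finite-sample corrected quantile of them, and use it as the half-width of the interval. This is only an illustration. The function name `split_conformal_interval` and the residual values are made up for this example, and the exact procedure TimeGPT runs internally is not described in this tutorial and may differ.

```python
import numpy as np


def split_conformal_interval(point_forecast, calibration_errors, level):
    """Return (lo, hi) around a point forecast from absolute calibration errors.

    `level` is a coverage percentage, e.g. 90 for a 90% interval.
    Hypothetical helper for illustration; not part of the nixtla package.
    """
    scores = np.sort(np.abs(np.asarray(calibration_errors, dtype=float)))
    n = scores.size
    # Rank of the conformal quantile with the usual (n + 1) finite-sample correction.
    k = int(np.ceil((n + 1) * level / 100))
    # With too few calibration points the requested level cannot be guaranteed.
    half_width = scores[k - 1] if k <= n else np.inf
    return point_forecast - half_width, point_forecast + half_width


# Made-up errors (actual minus forecast) from a hypothetical calibration window.
residuals = [-4.0, 2.5, -1.0, 6.0, 3.0, -2.0, 0.5, -5.5, 1.5, 4.5]

lo_80, hi_80 = split_conformal_interval(100.0, residuals, level=80)
lo_90, hi_90 = split_conformal_interval(100.0, residuals, level=90)
print(lo_80, hi_80)
print(lo_90, hi_90)
```

A higher level selects a larger error quantile, so the interval can only stay the same or get wider as the level increases.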

if not IN_COLAB:
    load_dotenv()    
    colab_badge('docs/tutorials/11_uncertainty_quantification_with_prediction_intervals')

1. Import packages

First, we import the required packages and initialize the Nixtla client.

import pandas as pd
from nixtla import NixtlaClient

nixtla_client = NixtlaClient(
    # defaults to os.environ.get("NIXTLA_API_KEY")
    api_key = 'my_api_key_provided_by_nixtla'
)

👍 Use an Azure AI endpoint

To use an Azure AI endpoint, set the base_url argument:

nixtla_client = NixtlaClient(base_url="your Azure AI endpoint", api_key="your api_key")

if not IN_COLAB:
    nixtla_client = NixtlaClient()

2. Load data

df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/air_passengers.csv')
df.head()

	timestamp	value
0	1949-01-01	112
1	1949-02-01	118
2	1949-03-01	132
3	1949-04-01	129
4	1949-05-01	121

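3. Forecast with prediction intervals

When using TimeGPT for time series forecasting, you can set the level (or levels) of the prediction intervals according to your requirements. Here is how you could do it: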

timegpt_fcst_pred_int_df = nixtla_client.forecast(
    df=df, h=12, level=[80, 90, 99.7], 
    time_col='timestamp', target_col='value',
)
timegpt_fcst_pred_int_df.head()

INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Inferred freq: MS
INFO:nixtla.nixtla_client:Restricting input...
INFO:nixtla.nixtla_client:Calling Forecast Endpoint...

	timestamp	TimeGPT	TimeGPT-lo-99.7	TimeGPT-lo-90	TimeGPT-lo-80	TimeGPT-hi-80	TimeGPT-hi-90	TimeGPT-hi-99.7
0	1961-01-01	437.837952	415.826484	423.783737	431.987091	443.688812	451.892166	459.849419
1	1961-02-01	426.062744	402.833553	407.694092	412.704956	439.420532	444.431396	449.291935
2	1961-03-01	463.116577	423.434092	430.316893	437.412564	488.820590	495.916261	502.799062
3	1961-04-01	478.244507	444.885193	446.776764	448.726837	507.762177	509.712250	511.603821
4	1961-05-01	505.646484	465.736694	471.976787	478.409872	532.883096	539.316182	545.556275
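The output columns follow the pattern `TimeGPT-lo-<level>` and `TimeGPT-hi-<level>`, so the bounds for any requested level can be looked up by name. The sketch below rebuilds the first row of the table above by hand and computes the width of each interval. The helper `interval_width` is a hypothetical name introduced for this example, not part of the nixtla package.

```python
import pandas as pd

# First forecast row, copied from the output table above.
first_row = pd.DataFrame(
    {
        "timestamp": ["1961-01-01"],
        "TimeGPT": [437.837952],
        "TimeGPT-lo-99.7": [415.826484],
        "TimeGPT-lo-90": [423.783737],
        "TimeGPT-lo-80": [431.987091],
        "TimeGPT-hi-80": [443.688812],
        "TimeGPT-hi-90": [451.892166],
        "TimeGPT-hi-99.7": [459.849419],
    }
)


def interval_width(fcst_df, level, model="TimeGPT"):
    """Width of the interval at `level`, using the <model>-lo/hi-<level> column names."""
    return fcst_df[f"{model}-hi-{level}"] - fcst_df[f"{model}-lo-{level}"]


# Width of each interval for the first forecast step.
widths = {
    level: float(interval_width(first_row, level).iloc[0])
    for level in [80, 90, 99.7]
}
print(widths)
```

For this row the 80% interval is the narrowest and the 99.7% interval the widest, which is the ordering you would expect from nested intervals.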

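📘 Available models in Azure AI

If you are using an Azure AI endpoint, please be sure to set model="azureai":

nixtla_client.forecast(..., model="azureai")

For the public API, we support two models: timegpt-1 and timegpt-1-long-horizon.

By default, timegpt-1 is used. Please see this tutorial on how and when to use timegpt-1-long-horizon.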

# test a shorter horizon
if not IN_COLAB:
    level_short_horizon_df = nixtla_client.forecast(
        df=df, h=6, level=[80, 90, 99.7], 
        time_col='timestamp', target_col='value',
    )
    test_eq(
        level_short_horizon_df.shape,
        (6, 8)
    )

INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Inferred freq: MS
INFO:nixtla.nixtla_client:Restricting input...
INFO:nixtla.nixtla_client:Calling Forecast Endpoint...

if not IN_COLAB:
    test_level = [80, 90.5]
    cols_fcst_df = nixtla_client.forecast(
        df=df, h=12, level=test_level, 
        time_col='timestamp', target_col='value',
    ).columns
    assert all(f'TimeGPT-{pos}-{lv}' in cols_fcst_df for pos, lv in product(['lo', 'hi'], test_level) )

INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Inferred freq: MS
INFO:nixtla.nixtla_client:Restricting input...
INFO:nixtla.nixtla_client:Calling Forecast Endpoint...

nixtla_client.plot(
    df, timegpt_fcst_pred_int_df, 
    time_col='timestamp', target_col='value',
    level=[80, 90],
)

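It is essential to note that the choice of prediction interval level depends on your specific use case. For high-stakes predictions, you might want a higher level, and therefore a wider interval, to account for more uncertainty. For less critical forecasts, a narrower interval might be acceptable.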

Historical forecast

You can also compute prediction intervals for historical forecasts by adding the add_history=True parameter, as follows:

timegpt_fcst_pred_int_historical_df = nixtla_client.forecast(
    df=df, h=12, level=[80, 90], 
    time_col='timestamp', target_col='value',
    add_history=True,
)
timegpt_fcst_pred_int_historical_df.head()

INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Inferred freq: MS
INFO:nixtla.nixtla_client:Calling Forecast Endpoint...
INFO:nixtla.nixtla_client:Calling Historical Forecast Endpoint...

	timestamp	TimeGPT	TimeGPT-lo-80	TimeGPT-lo-90	TimeGPT-hi-80	TimeGPT-hi-90
0	1951-01-01	135.483673	111.937767	105.262830	159.029579	165.704516
1	1951-02-01	144.442413	120.896508	114.221571	167.988319	174.663256
2	1951-03-01	157.191910	133.646004	126.971067	180.737815	187.412752
3	1951-04-01	148.769379	125.223473	118.548536	172.315284	178.990221
4	1951-05-01	140.472946	116.927041	110.252104	164.018852	170.693789

nixtla_client.plot(
    df, timegpt_fcst_pred_int_historical_df, 
    time_col='timestamp', target_col='value',
    level=[80, 90],
)
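One use for historical intervals is a rough sanity check of coverage: how often did the observed values actually fall inside the interval? The sketch below shows the idea on a tiny hand-made frame. All numbers in it are invented for illustration, and `empirical_coverage` is a hypothetical helper, not a nixtla function. With real data you would first align the observed values with the historical forecasts on the time column.

```python
import pandas as pd


def empirical_coverage(actuals, fcst_df, level, model="TimeGPT"):
    """Share of actual values inside the [lo, hi] interval at `level`.

    Assumes `actuals` and `fcst_df` share the same index and row order.
    """
    lo = fcst_df[f"{model}-lo-{level}"]
    hi = fcst_df[f"{model}-hi-{level}"]
    inside = (actuals >= lo) & (actuals <= hi)
    return float(inside.mean())


# Invented values, only to demonstrate the computation.
actuals = pd.Series([100.0, 110.0, 120.0, 130.0, 140.0])
toy_fcst = pd.DataFrame(
    {
        "TimeGPT-lo-80": [95.0, 100.0, 121.0, 120.0, 130.0],
        "TimeGPT-hi-80": [105.0, 115.0, 130.0, 135.0, 139.0],
        "TimeGPT-lo-90": [90.0, 95.0, 115.0, 115.0, 125.0],
        "TimeGPT-hi-90": [110.0, 120.0, 135.0, 140.0, 139.5],
    }
)

coverage_80 = empirical_coverage(actuals, toy_fcst, level=80)
coverage_90 = empirical_coverage(actuals, toy_fcst, level=90)
print(coverage_80, coverage_90)
```

On a real series, an empirical coverage far below the nominal level would suggest the intervals are too narrow for that data, although a handful of points is too few to draw firm conclusions.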
