Getting Started
Installation
If you are working on Windows, you need to first install PyTorch with
pip install torch -f https://download.pytorch.org/whl/torch_stable.html
Otherwise, you can proceed with
pip install pytorch-forecasting
Alternatively, you can install the package via conda:
conda install pytorch-forecasting pytorch>=1.7 -c pytorch -c conda-forge
PyTorch Forecasting is now installed from the conda-forge channel, while PyTorch is installed from the pytorch channel.
To use the MQF2 loss (multivariate quantile loss), also install
pip install pytorch-forecasting[mqf2]
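The multivariate quantile loss is provided by the MQF2DistributionLoss metric. A minimal sketch of constructing it, where the horizon of 6 is an assumption and must match the dataset's max_prediction_length:

from pytorch_forecasting.metrics import MQF2DistributionLoss

# the MQF2 loss needs to know the forecast horizon
loss = MQF2DistributionLoss(prediction_length=6)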
Usage
The library builds strongly on PyTorch Lightning, which makes it easy to train models, spot bugs quickly, and train on multiple GPUs out-of-the-box.
Further, we rely on Tensorboard for logging training progress.
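Once training has started, progress can be inspected by pointing Tensorboard at the log directory; lightning_logs is Lightning's default, so adjust the path if you configured a different logger:

tensorboard --logdir lightning_logs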
The general setup for training and testing a model is:

1. Create a training dataset using TimeSeriesDataSet.
2. Using the training dataset, create a validation dataset with from_dataset(). Similarly, a test dataset or a dataset for inference can be created later. If you do not wish to load the entire training dataset when running inference, you can store the dataset parameters directly (see the sketch after this list).
3. Instantiate a model using its .from_dataset() method.
4. Create a lightning.Trainer() object.
5. Find the optimal learning rate with its .tuner.lr_find() method.
6. Train the model with early stopping on the training dataset and use the tensorboard logs to understand whether it has converged with acceptable accuracy.
7. Tune the hyperparameters of the model with your favorite package.
8. Train the model on the entire dataset with the same learning rate schedule.
9. Load the model from the model checkpoint and apply it to new data.
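For step 2, a minimal sketch of storing dataset parameters and rebuilding a dataset for inference; get_parameters() and from_parameters() belong to TimeSeriesDataSet, while new_data is a hypothetical dataframe of fresh observations:

# extract the parameters of a fitted training dataset (a plain dict that can be pickled)
dataset_parameters = training.get_parameters()
# later, rebuild a dataset for inference without loading the original training data
inference_dataset = TimeSeriesDataSet.from_parameters(
    dataset_parameters, new_data, predict=True, stop_randomization=True
)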
The Tutorials section provides detailed guidance and examples on how to use models and implement new ones.
Example
import lightning.pytorch as pl
from lightning.pytorch.callbacks import EarlyStopping, LearningRateMonitor
from lightning.pytorch.tuner import Tuner
from pytorch_forecasting import TimeSeriesDataSet, TemporalFusionTransformer
from pytorch_forecasting.metrics import QuantileLoss
# load data
data = ...
# define dataset
max_encoder_length = 36
max_prediction_length = 6
training_cutoff = "YYYY-MM-DD" # day for cutoff
training = TimeSeriesDataSet(
    data[lambda x: x.date < training_cutoff],
    time_idx= ...,  # column name of time of observation
    target= ...,  # column name of target to predict
    # weight="weight",  # optionally, a column name for sample weights
    group_ids=[ ... ],  # column name(s) identifying each timeseries
    max_encoder_length=max_encoder_length,  # how much history to use
    max_prediction_length=max_prediction_length,  # how far to predict into the future
    # covariates static for a timeseries ID
    static_categoricals=[ ... ],
    static_reals=[ ... ],
    # covariates known and unknown in the future to inform prediction
    time_varying_known_categoricals=[ ... ],
    time_varying_known_reals=[ ... ],
    time_varying_unknown_categoricals=[ ... ],
    time_varying_unknown_reals=[ ... ],
)
# create validation dataset using the same parameters as the training dataset
validation = TimeSeriesDataSet.from_dataset(
    training, data, min_prediction_idx=training.index.time.max() + 1, stop_randomization=True
)
batch_size = 128
train_dataloader = training.to_dataloader(train=True, batch_size=batch_size, num_workers=2)
val_dataloader = validation.to_dataloader(train=False, batch_size=batch_size, num_workers=2)
# define trainer with early stopping
early_stop_callback = EarlyStopping(monitor="val_loss", min_delta=1e-4, patience=1, verbose=False, mode="min")
lr_logger = LearningRateMonitor()
trainer = pl.Trainer(
    max_epochs=100,
    accelerator="auto",
    gradient_clip_val=0.1,
    limit_train_batches=30,  # 30 batches per epoch
    callbacks=[lr_logger, early_stop_callback],
)
# create the model
tft = TemporalFusionTransformer.from_dataset(
    training,
    learning_rate=0.03,
    hidden_size=32,
    attention_head_size=1,
    dropout=0.1,
    hidden_continuous_size=16,
    output_size=7,  # QuantileLoss has 7 quantiles by default
    loss=QuantileLoss(),
    log_interval=2,  # log example every 2 batches
    reduce_on_plateau_patience=4,  # reduce learning rate if no improvement in validation loss
)
print(f"Number of parameters in network: {tft.size()/1e3:.1f}k")
# find optimal learning rate (set limit_train_batches to 1.0 and log_interval = -1)
res = Tuner(trainer).lr_find(
    tft,
    train_dataloaders=train_dataloader,
    val_dataloaders=val_dataloader,
    early_stop_threshold=1000.0,
    max_lr=0.3,
)
print(f"suggested learning rate: {res.suggestion()}")
fig = res.plot(show=True, suggest=True)
fig.show()
# fit the model
trainer.fit(
    tft,
    train_dataloaders=train_dataloader,
    val_dataloaders=val_dataloader,
)
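For step 9 of the workflow above, a short sketch of loading the best checkpoint after training and applying it to data; it assumes Lightning checkpointed a best model during trainer.fit():

# load the best model according to the validation loss
best_model_path = trainer.checkpoint_callback.best_model_path
best_tft = TemporalFusionTransformer.load_from_checkpoint(best_model_path)
# apply to new data, e.g. a dataloader built from a TimeSeriesDataSet for inference
predictions = best_tft.predict(val_dataloader)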