Cheng Feng, Long Huang, Denis Krompass
We present General Time Transformer (GTT), an encoder-only style foundation model for zero-shot multivariate time series forecasting. GTT is pretrained on a large dataset of 200M high-quality time series samples spanning diverse domains. In our proposed framework, the task of multivariate time series forecasting is formulated as a channel-wise next curve shape prediction problem, where each time series sample is represented as a sequence of non-overlapping curve shapes with a unified numerical magnitude. GTT is trained to predict the next curve shape based on a window of past curve shapes in a channel-wise manner. Experimental results demonstrate that GTT exhibits superior zero-shot multivariate forecasting capabilities on unseen time series datasets, even surpassing state-of-the-art supervised baselines. Additionally, we investigate the impact of varying GTT model parameters and training dataset scales, observing that the scaling law also holds in the context of zero-shot multivariate time series forecasting.
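
To make the channel-wise next curve shape formulation concrete, the following is a minimal NumPy sketch of how one training example could be built from a multivariate series. It is an illustration under stated assumptions, not the authors' implementation: the function name `make_curve_shape_example`, the patch length of 64, the context of 4 patches, and the use of per-channel z-score normalization over the context window as the "unified numerical magnitude" are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def make_curve_shape_example(series, patch_len=64, context_patches=4, eps=1e-8):
    """Build one (context, target) pair of curve shapes per channel.

    series: array of shape (time, channels).
    Returns:
      context: (channels, context_patches, patch_len) past curve shapes
      target:  (channels, patch_len) the next curve shape to predict
    All sizes and the normalization scheme are illustrative assumptions.
    """
    series = np.asarray(series, dtype=float)
    if series.ndim != 2:
        raise ValueError("series must have shape (time, channels)")

    need = (context_patches + 1) * patch_len
    if series.shape[0] < need:
        raise ValueError(f"need at least {need} time steps")

    # Keep the most recent window: context patches followed by one target patch.
    window = series[-need:]
    channels = window.shape[1]

    # Assumption: normalize each channel with statistics of the context only,
    # so the target patch does not leak into the scaling.
    context_raw = window[:-patch_len]
    mean = context_raw.mean(axis=0, keepdims=True)
    std = context_raw.std(axis=0, keepdims=True)
    normed = (window - mean) / (std + eps)

    # Channel-wise view: each channel becomes its own sequence of
    # non-overlapping patches ("curve shapes").
    patches = normed.T.reshape(channels, context_patches + 1, patch_len)
    return patches[:, :-1], patches[:, -1]


# Usage on a synthetic two-channel series.
t = np.arange(320)
demo = np.stack([np.sin(t / 10.0), 0.01 * t + 5.0], axis=1)
context, target = make_curve_shape_example(demo)
print(context.shape, target.shape)
```

Because every channel is handled as an independent sequence of patches, a model trained on such pairs can be applied to series with any number of channels, which is one plausible reading of why the channel-wise formulation suits zero-shot multivariate forecasting.
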
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Time Series Forecasting | ETTh1 (336) Multivariate | MAE | 0.419 | GTT-Large |
| Time Series Forecasting | ETTh1 (336) Multivariate | MSE | 0.424 | GTT-Large |
| Time Series Forecasting | ETTh1 (336) Multivariate | MAE | 0.418 | GTT-Large(Fine-tune) |
| Time Series Forecasting | ETTh1 (336) Multivariate | MSE | 0.433 | GTT-Large(Fine-tune) |
| Time Series Forecasting | ETTh1 (336) Multivariate | MAE | 0.427 | GTT-Small |
| Time Series Forecasting | ETTh1 (336) Multivariate | MSE | 0.459 | GTT-Small |
| Time Series Forecasting | ETTh1 (336) Multivariate | MAE | 0.436 | GTT-Tiny |
| Time Series Forecasting | ETTh1 (336) Multivariate | MSE | 0.466 | GTT-Tiny |
| Time Series Forecasting | ETTh1 (336) Multivariate | MAE | 0.432 | GTT-Large(100M training samples) |
| Time Series Forecasting | ETTh1 (336) Multivariate | MSE | 0.468 | GTT-Large(100M training samples) |
| Time Series Forecasting | ETTh1 (336) Multivariate | MAE | 0.444 | GTT-Large(50M training samples) |
| Time Series Forecasting | ETTh1 (336) Multivariate | MSE | 0.475 | GTT-Large(50M training samples) |
| Time Series Analysis | ETTh1 (336) Multivariate | MAE | 0.419 | GTT-Large |
| Time Series Analysis | ETTh1 (336) Multivariate | MSE | 0.424 | GTT-Large |
| Time Series Analysis | ETTh1 (336) Multivariate | MAE | 0.418 | GTT-Large(Fine-tune) |
| Time Series Analysis | ETTh1 (336) Multivariate | MSE | 0.433 | GTT-Large(Fine-tune) |
| Time Series Analysis | ETTh1 (336) Multivariate | MAE | 0.427 | GTT-Small |
| Time Series Analysis | ETTh1 (336) Multivariate | MSE | 0.459 | GTT-Small |
| Time Series Analysis | ETTh1 (336) Multivariate | MAE | 0.436 | GTT-Tiny |
| Time Series Analysis | ETTh1 (336) Multivariate | MSE | 0.466 | GTT-Tiny |
| Time Series Analysis | ETTh1 (336) Multivariate | MAE | 0.432 | GTT-Large(100M training samples) |
| Time Series Analysis | ETTh1 (336) Multivariate | MSE | 0.468 | GTT-Large(100M training samples) |
| Time Series Analysis | ETTh1 (336) Multivariate | MAE | 0.444 | GTT-Large(50M training samples) |
| Time Series Analysis | ETTh1 (336) Multivariate | MSE | 0.475 | GTT-Large(50M training samples) |