Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Generative Pretrained Hierarchical Transformer for Time Series Forecasting

Zhiding Liu, Jiqian Yang, Mingyue Cheng, Yucong Luo, Zhi Li

2024-02-26 | Few-Shot Learning | Time Series Forecasting | Time Series

Abstract

Recent efforts have been dedicated to enhancing time series forecasting accuracy by introducing advanced network architectures and self-supervised pretraining strategies. Nevertheless, existing approaches still exhibit two critical drawbacks. First, these methods often rely on a single dataset for training, which limits the model's generalizability because of the restricted scale of the training data. Second, the widely followed one-step generation schema necessitates a customized forecasting head, overlooks the temporal dependencies in the output series, and leads to increased training costs under different horizon length settings. To address these issues, we propose a novel generative pretrained hierarchical transformer architecture for forecasting, named GPHT. GPHT has two key designs. On the one hand, we advocate constructing a mixed dataset under the channel-independent assumption for pretraining our model, comprising various datasets from diverse data scenarios. This approach significantly expands the scale of training data, allowing our model to uncover commonalities in time series data and facilitating improved transfer to specific datasets. On the other hand, GPHT employs an auto-regressive forecasting approach, effectively modeling temporal dependencies in the output series. Importantly, no customized forecasting head is required, enabling a single model to forecast at arbitrary horizon settings. We conduct extensive experiments on eight datasets against mainstream self-supervised pretraining models and supervised models. The results demonstrate that GPHT surpasses the baseline models across various fine-tuning and zero/few-shot learning settings in the traditional long-term forecasting task. We make our code publicly available at https://github.com/icantnamemyself/GPHT.
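The two ideas named in the abstract, channel-independent pooling of several datasets and auto-regressive forecasting at an arbitrary horizon, can be illustrated with a minimal NumPy sketch. This is not the GPHT implementation: the function names, the patch-wise rollout, and the toy `repeat_last_patch` predictor are all assumptions made for illustration, and a trained model would take the place of the stand-in predictor.

```python
import numpy as np


def to_channel_independent(datasets):
    """Pool several (time, channels) arrays into one list of univariate series.

    Each channel is treated as its own series, so datasets with different
    channel counts and lengths can be mixed (hypothetical helper).
    """
    pooled = []
    for data in datasets:
        data = np.asarray(data, dtype=float)
        for c in range(data.shape[1]):
            pooled.append(data[:, c])
    return pooled


def autoregressive_forecast(predict_next_patch, history, horizon, patch_len):
    """Roll a next-patch predictor forward until `horizon` steps exist.

    `predict_next_patch` is any callable mapping a 1-D context array to the
    next values; its output is truncated to `patch_len` and fed back in.
    """
    if patch_len <= 0:
        raise ValueError("patch_len must be positive")
    context = np.asarray(history, dtype=float)
    outputs = []
    produced = 0
    while produced < horizon:
        patch = np.asarray(predict_next_patch(context), dtype=float)[:patch_len]
        if patch.size == 0:
            raise ValueError("predictor returned an empty patch")
        outputs.append(patch)
        # Feed the prediction back in as context for the next step.
        context = np.concatenate([context, patch])
        produced += patch.size
    # The same predictor serves any horizon; surplus steps are trimmed.
    return np.concatenate(outputs)[:horizon]


def repeat_last_patch(context, patch_len=4):
    """Toy stand-in for a trained model: repeats the last `patch_len` values."""
    return context[-patch_len:]


# Two toy "datasets" with 3 and 2 channels yield 5 univariate series.
mixed = to_channel_independent([np.zeros((10, 3)), np.ones((20, 2))])

# One predictor, two different horizons, no horizon-specific output head.
short = autoregressive_forecast(repeat_last_patch, np.arange(8.0), horizon=6, patch_len=4)
longer = autoregressive_forecast(repeat_last_patch, np.arange(8.0), horizon=10, patch_len=4)
```

Because the rollout loop only depends on the requested number of steps, changing the horizon does not require retraining or swapping an output layer, which is the property the abstract highlights.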

Results

Task                     Dataset                   Metric  Value  Model
Time Series Forecasting  ETTh1 (336) Multivariate  MAE     0.423  GPHT
Time Series Forecasting  ETTh1 (336) Multivariate  MSE     0.43   GPHT
Time Series Forecasting  ETTh1 (336) Multivariate  MAE     0.432  GPHT*
Time Series Forecasting  ETTh1 (336) Multivariate  MSE     0.456  GPHT*
Time Series Analysis     ETTh1 (336) Multivariate  MAE     0.423  GPHT
Time Series Analysis     ETTh1 (336) Multivariate  MSE     0.43   GPHT
Time Series Analysis     ETTh1 (336) Multivariate  MAE     0.432  GPHT*
Time Series Analysis     ETTh1 (336) Multivariate  MSE     0.456  GPHT*
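
For readers unfamiliar with the two metrics in the table, the following sketch shows how MSE and MAE are computed for a multivariate forecast. The helper names are illustrative, and averaging over all time steps and channels is a common benchmark convention assumed here, not something this page states. This page also does not say whether the reported values are computed on standardized data.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error, averaged over every time step and channel."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def mae(y_true, y_pred):
    """Mean absolute error, averaged over every time step and channel."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


# Toy example: a 3-step, 2-channel target and forecast (made-up numbers).
target = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
forecast = np.array([[1.0, 2.0], [3.0, 6.0], [4.0, 6.0]])

toy_mse = mse(target, forecast)  # errors are 0, 0, 0, 2, 1, 0
toy_mae = mae(target, forecast)
```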

Related Papers

GLAD: Generalizable Tuning for Vision-Language Models (2025-07-17)
The Power of Architecture: Deep Dive into Transformer Architectures for Long-Term Time Series Forecasting (2025-07-17)
MoTM: Towards a Foundation Model for Time Series Imputation based on Continuous Modeling (2025-07-17)
Data Augmentation in Time Series Forecasting through Inverted Framework (2025-07-15)
D3FL: Data Distribution and Detrending for Robust Federated Learning in Non-linear Time-series Data (2025-07-15)
Doodle Your Keypoints: Sketch-Based Few-Shot Keypoint Detection (2025-07-10)
An Enhanced Privacy-preserving Federated Few-shot Learning Framework for Respiratory Disease Diagnosis (2025-07-10)
Towards Interpretable Time Series Foundation Models (2025-07-10)