Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


PatchMixer: A Patch-Mixing Architecture for Long-Term Time Series Forecasting

Zeying Gong, Yujin Tang, Junwei Liang

2023-10-01 · Time Series Forecasting · Time Series

Abstract

Although the Transformer has been the dominant architecture for time series forecasting tasks in recent years, a fundamental challenge remains: the permutation-invariant self-attention mechanism within Transformers leads to a loss of temporal information. To tackle this challenge, we propose PatchMixer, a novel CNN-based model. It introduces a permutation-variant convolutional structure to preserve temporal information. Unlike conventional CNNs in this field, which often employ multiple scales or numerous branches, our method relies exclusively on depthwise separable convolutions. This allows us to extract both local features and global correlations using a single-scale architecture. Furthermore, we employ dual forecasting heads, one linear and one nonlinear, to better model future curve trends and details. Our experimental results on seven time-series forecasting benchmarks indicate that, compared with the state-of-the-art method and the best-performing CNN, PatchMixer yields $3.9\%$ and $21.2\%$ relative improvements, respectively, while being 2-3x faster than the most advanced method.
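To make the idea of patching followed by a depthwise separable convolution concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the function names (`make_patches`, `depthwise_conv1d`, `pointwise_conv1d`), the choice to treat each patch as one channel, the absence of padding, and all kernel values are assumptions made only for illustration.

```python
from typing import List

Matrix = List[List[float]]


def make_patches(series: List[float], patch_len: int, stride: int) -> Matrix:
    """Split a 1-D series into sliding-window patches (hypothetical helper)."""
    return [series[i:i + patch_len]
            for i in range(0, len(series) - patch_len + 1, stride)]


def depthwise_conv1d(x: Matrix, kernels: Matrix) -> Matrix:
    """Depthwise step: each channel is filtered by its own kernel.

    x is [channels][length], kernels is [channels][k]; no padding is used,
    so each output row has length - k + 1 entries.
    """
    out = []
    for channel, kernel in zip(x, kernels):
        k = len(kernel)
        out.append([sum(channel[t + j] * kernel[j] for j in range(k))
                    for t in range(len(channel) - k + 1)])
    return out


def pointwise_conv1d(x: Matrix, weights: Matrix) -> Matrix:
    """Pointwise (1x1) step: mix channels independently at every position.

    x is [c_in][length], weights is [c_out][c_in].
    """
    length = len(x[0])
    return [[sum(w[c] * x[c][t] for c in range(len(x))) for t in range(length)]
            for w in weights]


# Toy usage: 8 time steps, patches of length 4 with stride 2 (3 patches).
patches = make_patches([1, 2, 3, 4, 5, 6, 7, 8], patch_len=4, stride=2)
local = depthwise_conv1d(patches, [[1, 1]] * 3)            # within-patch features
mixed = pointwise_conv1d(local, [[1, 0, 0], [0, 1, 1]])    # cross-patch mixing
```

In this toy setup the depthwise step only looks inside each patch, while the pointwise step combines information across patches, which is one way to read the abstract's "local features and global correlations". Because the kernels are applied in a fixed order along the series, shuffling the input changes the output, unlike a permutation-invariant operation.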

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Time Series Forecasting | ETTh2 (336) Univariate | MAE | 0.332 | PatchMixer |
| Time Series Forecasting | ETTh2 (336) Univariate | MSE | 0.166 | PatchMixer |
| Time Series Forecasting | ETTh2 (720) Multivariate | MAE | 0.426 | PatchMixer |
| Time Series Forecasting | ETTh2 (720) Multivariate | MSE | 0.393 | PatchMixer |
| Time Series Forecasting | ETTh1 (720) Multivariate | MAE | 0.463 | PatchMixer |
| Time Series Forecasting | ETTh1 (720) Multivariate | MSE | 0.445 | PatchMixer |
| Time Series Forecasting | ETTh2 (336) Multivariate | MAE | 0.368 | PatchMixer |
| Time Series Forecasting | ETTh2 (336) Multivariate | MSE | 0.317 | PatchMixer |
| Time Series Forecasting | ETTh1 (720) Univariate | MAE | 0.243 | PatchMixer |
| Time Series Forecasting | ETTh1 (720) Univariate | MSE | 0.093 | PatchMixer |
| Time Series Forecasting | ETTh1 (96) Univariate | MAE | 0.179 | PatchMixer |
| Time Series Forecasting | ETTh1 (96) Univariate | MSE | 0.054 | PatchMixer |
| Time Series Forecasting | ETTh1 (192) Multivariate | MAE | 0.394 | PatchMixer |
| Time Series Forecasting | ETTh1 (192) Multivariate | MSE | 0.373 | PatchMixer |
| Time Series Forecasting | ETTh2 (192) Univariate | MAE | 0.305 | PatchMixer |
| Time Series Forecasting | ETTh2 (192) Univariate | MSE | 0.147 | PatchMixer |
| Time Series Forecasting | ETTh1 (192) Univariate | MAE | 0.198 | PatchMixer |
| Time Series Forecasting | ETTh1 (192) Univariate | MSE | 0.066 | PatchMixer |
| Time Series Forecasting | ETTh1 (336) Multivariate | MAE | 0.414 | PatchMixer |
| Time Series Forecasting | ETTh1 (336) Multivariate | MSE | 0.392 | PatchMixer |
| Time Series Forecasting | ETTh2 (96) Multivariate | MAE | 0.3 | PatchMixer |
| Time Series Forecasting | ETTh2 (96) Multivariate | MSE | 0.225 | PatchMixer |
| Time Series Forecasting | ETTh2 (720) Univariate | MAE | 0.374 | PatchMixer |
| Time Series Forecasting | ETTh2 (720) Univariate | MSE | 0.217 | PatchMixer |
| Time Series Forecasting | ETTh1 (96) Multivariate | MAE | 0.381 | PatchMixer |
| Time Series Forecasting | ETTh1 (96) Multivariate | MSE | 0.353 | PatchMixer |
| Time Series Forecasting | ETTh1 (336) Univariate | MAE | 0.22 | PatchMixer |
| Time Series Forecasting | ETTh1 (336) Univariate | MSE | 0.078 | PatchMixer |
| Time Series Forecasting | ETTh2 (192) Multivariate | MAE | 0.334 | PatchMixer |
| Time Series Forecasting | ETTh2 (192) Multivariate | MSE | 0.274 | PatchMixer |
| Time Series Forecasting | ETTh2 (96) Univariate | MAE | 0.268 | PatchMixer |
| Time Series Forecasting | ETTh2 (96) Univariate | MSE | 0.119 | PatchMixer |
| Time Series Analysis | ETTh2 (336) Univariate | MAE | 0.332 | PatchMixer |
| Time Series Analysis | ETTh2 (336) Univariate | MSE | 0.166 | PatchMixer |
| Time Series Analysis | ETTh2 (720) Multivariate | MAE | 0.426 | PatchMixer |
| Time Series Analysis | ETTh2 (720) Multivariate | MSE | 0.393 | PatchMixer |
| Time Series Analysis | ETTh1 (720) Multivariate | MAE | 0.463 | PatchMixer |
| Time Series Analysis | ETTh1 (720) Multivariate | MSE | 0.445 | PatchMixer |
| Time Series Analysis | ETTh2 (336) Multivariate | MAE | 0.368 | PatchMixer |
| Time Series Analysis | ETTh2 (336) Multivariate | MSE | 0.317 | PatchMixer |
| Time Series Analysis | ETTh1 (720) Univariate | MAE | 0.243 | PatchMixer |
| Time Series Analysis | ETTh1 (720) Univariate | MSE | 0.093 | PatchMixer |
| Time Series Analysis | ETTh1 (96) Univariate | MAE | 0.179 | PatchMixer |
| Time Series Analysis | ETTh1 (96) Univariate | MSE | 0.054 | PatchMixer |
| Time Series Analysis | ETTh1 (192) Multivariate | MAE | 0.394 | PatchMixer |
| Time Series Analysis | ETTh1 (192) Multivariate | MSE | 0.373 | PatchMixer |
| Time Series Analysis | ETTh2 (192) Univariate | MAE | 0.305 | PatchMixer |
| Time Series Analysis | ETTh2 (192) Univariate | MSE | 0.147 | PatchMixer |
| Time Series Analysis | ETTh1 (192) Univariate | MAE | 0.198 | PatchMixer |
| Time Series Analysis | ETTh1 (192) Univariate | MSE | 0.066 | PatchMixer |
| Time Series Analysis | ETTh1 (336) Multivariate | MAE | 0.414 | PatchMixer |
| Time Series Analysis | ETTh1 (336) Multivariate | MSE | 0.392 | PatchMixer |
| Time Series Analysis | ETTh2 (96) Multivariate | MAE | 0.3 | PatchMixer |
| Time Series Analysis | ETTh2 (96) Multivariate | MSE | 0.225 | PatchMixer |
| Time Series Analysis | ETTh2 (720) Univariate | MAE | 0.374 | PatchMixer |
| Time Series Analysis | ETTh2 (720) Univariate | MSE | 0.217 | PatchMixer |
| Time Series Analysis | ETTh1 (96) Multivariate | MAE | 0.381 | PatchMixer |
| Time Series Analysis | ETTh1 (96) Multivariate | MSE | 0.353 | PatchMixer |
| Time Series Analysis | ETTh1 (336) Univariate | MAE | 0.22 | PatchMixer |
| Time Series Analysis | ETTh1 (336) Univariate | MSE | 0.078 | PatchMixer |
| Time Series Analysis | ETTh2 (192) Multivariate | MAE | 0.334 | PatchMixer |
| Time Series Analysis | ETTh2 (192) Multivariate | MSE | 0.274 | PatchMixer |
| Time Series Analysis | ETTh2 (96) Univariate | MAE | 0.268 | PatchMixer |
| Time Series Analysis | ETTh2 (96) Univariate | MSE | 0.119 | PatchMixer |
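The results are reported as MAE and MSE. For readers who want the definitions spelled out, the sketch below computes both metrics for a forecast in plain Python. The helper names `mae` and `mse` and the sample values are illustrative assumptions; the page does not state how the benchmark data were normalized or how errors were averaged across variables and windows, so this shows only the standard textbook definitions.

```python
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error: average of |truth - prediction|."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error: average of (truth - prediction)^2."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up values over a 4-step horizon (not taken from any benchmark).
truth = [1.0, 2.0, 3.0, 4.0]
forecast = [1.5, 2.0, 2.0, 4.5]
print(mae(truth, forecast))  # 0.5
print(mse(truth, forecast))  # 0.375
```

Lower is better for both metrics. MSE penalizes large individual errors more heavily than MAE, which is why the two are usually reported together.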

Related Papers

- The Power of Architecture: Deep Dive into Transformer Architectures for Long-Term Time Series Forecasting (2025-07-17)
- MoTM: Towards a Foundation Model for Time Series Imputation based on Continuous Modeling (2025-07-17)
- Data Augmentation in Time Series Forecasting through Inverted Framework (2025-07-15)
- D3FL: Data Distribution and Detrending for Robust Federated Learning in Non-linear Time-series Data (2025-07-15)
- Towards Interpretable Time Series Foundation Models (2025-07-10)
- MoFE-Time: Mixture of Frequency Domain Experts for Time-Series Forecasting Models (2025-07-09)
- Foundation models for time series forecasting: Application in conformal prediction (2025-07-09)
- Bridging the Last Mile of Prediction: Enhancing Time Series Forecasting with Conditional Guided Flow Matching (2025-07-09)