
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


TSMixer: Lightweight MLP-Mixer Model for Multivariate Time Series Forecasting

Vijay Ekambaram, Arindam Jati, Nam Nguyen, Phanwadee Sinthong, Jayant Kalagnanam

2023-06-14 · Representation Learning, Self-Supervised Learning, Time Series Forecasting, Time Series, Multivariate Time Series Forecasting

Abstract

Transformers have gained popularity in time series forecasting for their ability to capture long-sequence interactions. However, their high memory and computing requirements pose a critical bottleneck for long-term forecasting. To address this, we propose TSMixer, a lightweight neural architecture composed exclusively of multi-layer perceptron (MLP) modules for multivariate forecasting and representation learning on patched time series. Inspired by MLP-Mixer's success in computer vision, we adapt it for time series, addressing challenges and introducing validated components for enhanced accuracy. This includes a novel design paradigm of attaching online reconciliation heads to the MLP-Mixer backbone to explicitly model time-series properties such as hierarchy and channel correlations. We also propose a novel hybrid channel modeling approach and infuse a simple gating mechanism to handle noisy channel interactions effectively and to generalize across diverse datasets. By incorporating these lightweight components, we significantly enhance the learning capability of simple MLP structures, outperforming complex Transformer models with minimal computing usage. Moreover, TSMixer's modular design enables compatibility with both supervised and masked self-supervised learning methods, making it a promising building block for time-series foundation models. TSMixer outperforms state-of-the-art MLP and Transformer models in forecasting by a considerable margin of 8-60%. It also outperforms the latest strong benchmarks of Patch-Transformer models (by 1-2%) with a significant reduction in memory and runtime (2-3X). The source code of our model is officially released as PatchTSMixer on Hugging Face.

Model: https://huggingface.co/docs/transformers/main/en/model_doc/patchtsmixer
Examples: https://github.com/ibm/tsfm/#notebooks-links
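To make two ideas from the abstract concrete, the sketch below shows what "patched time series" and a "simple gating approach" could look like in plain Python. It is an illustration only, not the authors' implementation: the helper names `make_patches` and `gate`, the patch length and stride, and the softmax form of the gate are all assumptions chosen for this example.

```python
import math


def make_patches(series, patch_len, stride):
    """Split a 1-D sequence into fixed-length patches (hypothetical helper).

    With stride == patch_len the patches do not overlap; a smaller stride
    produces overlapping patches. Trailing values that do not fill a whole
    patch are dropped.
    """
    if patch_len <= 0 or stride <= 0:
        raise ValueError("patch_len and stride must be positive")
    last_start = len(series) - patch_len
    return [series[s:s + patch_len] for s in range(0, last_start + 1, stride)]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def gate(features, weights, bias):
    """Assumed gating form: scale each feature by a softmax weight.

    scores = weights @ features + bias, output = features * softmax(scores).
    The idea is that features with low scores are down-weighted.
    """
    scores = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]
    attention = softmax(scores)
    return [f * a for f, a in zip(features, attention)]


# Toy input: 12 time steps split into three non-overlapping patches of length 4.
series = [float(t) for t in range(12)]
patches = make_patches(series, patch_len=4, stride=4)

# With all-zero weights the gate is uniform, so every feature is scaled by 1/4.
zero_weights = [[0.0] * 4 for _ in range(4)]
gated = gate(patches[0], zero_weights, bias=[0.0] * 4)
```

In a trained model the gate weights would be learned, so the scaling would differ per feature rather than being uniform as in this toy call.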

Results

| Dataset | Horizon | MSE | MAE | Model |
|---|---|---|---|---|
| ETTh1 Multivariate | 96 | 0.368 | 0.398 | TSMixer |
| ETTh1 Multivariate | 192 | 0.399 | 0.418 | TSMixer |
| ETTh1 Multivariate | 336 | 0.421 | 0.436 | TSMixer |
| ETTh1 Multivariate | 720 | 0.444 | 0.467 | TSMixer |
| ETTh2 Multivariate | 96 | 0.276 | 0.337 | TSMixer |
| ETTh2 Multivariate | 192 | 0.33 | 0.374 | TSMixer |
| ETTh2 Multivariate | 336 | 0.357 | 0.401 | TSMixer |
| ETTh2 Multivariate | 720 | 0.395 | 0.436 | TSMixer |
| ETTm1 Multivariate | 96 | 0.291 | 0.346 | TSMixer |
| ETTm1 Multivariate | 192 | 0.333 | 0.369 | TSMixer |
| ETTm1 Multivariate | 336 | 0.365 | 0.385 | TSMixer |
| ETTm1 Multivariate | 720 | 0.416 | 0.413 | TSMixer |
| ETTm2 Multivariate | 96 | 0.164 | 0.255 | TSMixer |
| ETTm2 Multivariate | 192 | 0.219 | 0.293 | TSMixer |
| ETTm2 Multivariate | 336 | 0.273 | 0.329 | TSMixer |
| ETTm2 Multivariate | 720 | 0.358 | 0.38 | TSMixer |
| Weather | 96 | 0.146 | 0.197 | TSMixer |
| Weather | 192 | 0.191 | 0.24 | TSMixer |
| Weather | 336 | 0.243 | 0.279 | TSMixer |
| Weather | 720 | 0.316 | 0.333 | TSMixer |
| Traffic | 96 | 0.356 | - | TSMixer |
| Traffic | 192 | 0.377 | - | TSMixer |
| Traffic | 336 | 0.385 | - | TSMixer |
| Traffic | 720 | 0.424 | - | TSMixer |
| Electricity | 96 | 0.129 | - | TSMixer |
| Electricity | 336 | 0.158 | - | TSMixer |

Horizon is the number in parentheses after each dataset name on the original page. A dash marks a metric that is not listed. The same results are listed under two tasks, Time Series Forecasting and Time Series Analysis. The ETTh1 results (MSE at all four horizons, MAE at 96 and 192) are also listed under Multivariate Time Series Forecasting.
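For reference, the two metrics in the results above, and a relative improvement of the kind the abstract reports as a percentage, can be computed as in the following minimal sketch. The function names are illustrative, and the sample values and the baseline error of 0.5 are made up for the example; they are not taken from the paper or from any benchmark.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error: average of absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_improvement(baseline_error, candidate_error):
    """Percent reduction in error relative to a baseline.

    Positive means the candidate has lower error than the baseline.
    """
    return 100.0 * (baseline_error - candidate_error) / baseline_error


# Made-up targets and forecasts, purely for illustration.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.0, 4.5]

example_mse = mse(y_true, y_pred)  # (0.25 + 0 + 1 + 0.25) / 4
example_mae = mae(y_true, y_pred)  # (0.5 + 0 + 1 + 0.5) / 4

# Hypothetical baseline error of 0.5 against a candidate error of 0.4.
gain = relative_improvement(0.5, 0.4)
```

Lower is better for both metrics. MSE penalizes large errors more heavily than MAE, which is one reason benchmark tables often report the two side by side.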

Related Papers

Touch in the Wild: Learning Fine-Grained Manipulation with a Portable Visuo-Tactile Gripper (2025-07-20)
Spectral Bellman Method: Unifying Representation and Exploration in RL (2025-07-17)
Boosting Team Modeling through Tempo-Relational Representation Learning (2025-07-17)
A Semi-Supervised Learning Method for the Identification of Bad Exposures in Large Imaging Surveys (2025-07-17)
The Power of Architecture: Deep Dive into Transformer Architectures for Long-Term Time Series Forecasting (2025-07-17)
MoTM: Towards a Foundation Model for Time Series Imputation based on Continuous Modeling (2025-07-17)
Similarity-Guided Diffusion for Contrastive Sequential Recommendation (2025-07-16)
Are encoders able to learn landmarkers for warm-starting of Hyperparameter Optimization? (2025-07-16)