Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Graph-Guided MLP-Mixer for Skeleton-Based Human Motion Prediction

Xinshun Wang, Qiongjie Cui, Chen Chen, Shen Zhao, Mengyuan Liu

Published: 2023-04-07 · Tasks: Human Pose Forecasting, Human Motion Prediction, Motion Prediction
Paper · PDF

Abstract

In recent years, Graph Convolutional Networks (GCNs) have been widely used for human motion prediction, but their performance remains unsatisfactory. Recently, MLP-Mixer, originally developed for vision tasks, has been adapted to human motion prediction as a promising alternative to GCNs, achieving both better performance and better efficiency. Unlike GCNs, which explicitly capture the bone-joint structure of the human skeleton by representing it as a graph of nodes and edges, MLP-Mixer relies on fully connected layers and therefore cannot explicitly model this graph structure. To overcome this limitation, we propose \textit{Graph-Guided Mixer}, a novel approach that equips the original MLP-Mixer architecture with the ability to model graph structure. By incorporating graph guidance, our \textit{Graph-Guided Mixer} can effectively capture and exploit the connectivity patterns of the skeleton's graph representation. In this paper, we first uncover a theoretical connection between MLP-Mixer and GCN that is unexplored in existing research. Building on this connection, we then present our proposed \textit{Graph-Guided Mixer}, explaining how the original MLP-Mixer architecture is redesigned to incorporate guidance from the graph structure. Finally, we conduct an extensive evaluation on the Human3.6M, AMASS, and 3DPW datasets, which shows that our method achieves state-of-the-art performance.
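The core idea of injecting graph guidance into a token-mixing MLP can be illustrated with a minimal sketch. The paper does not publish this exact formulation here; the code below is an assumption-laden illustration in which a learned joint-mixing weight matrix is gated by a normalized skeleton adjacency, so that only physically connected joints exchange information directly (the names `normalized_adjacency` and `graph_guided_token_mixing` are hypothetical, not the authors' API):

```python
import numpy as np

def normalized_adjacency(edges, num_joints):
    """Symmetrically normalized adjacency with self-loops,
    as commonly used in GCNs: D^{-1/2} (A + I) D^{-1/2}."""
    A = np.eye(num_joints)
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
    return d_inv_sqrt @ A @ d_inv_sqrt

def graph_guided_token_mixing(X, W, A_hat):
    """Token-mixing step where the learned joint-mixing weights W
    are elementwise gated by the skeleton adjacency A_hat, so joints
    with no skeletal connection get zero direct mixing weight.
    X: (num_joints, channels), W: (num_joints, num_joints)."""
    return (A_hat * W) @ X
```

Stacking several such layers (with channel-mixing MLPs in between, as in the original MLP-Mixer) would let information propagate along multi-hop skeletal paths, which is one way to read the MLP-Mixer/GCN connection the abstract describes.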

Results

Task             Dataset    Metric                        Value   Model
Pose Estimation  Human3.6M  Average MPJPE (mm) @ 400 ms   56.7    GraphMixer
Pose Estimation  Human3.6M  Average MPJPE (mm) @ 1000 ms  108.6   GraphMixer
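The reported metric, Mean Per-Joint Position Error (MPJPE), averages the Euclidean distance between predicted and ground-truth 3D joint positions over joints and frames at a given prediction horizon. A minimal sketch of the standard definition (not code from the paper):

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean Per-Joint Position Error, in the same units as the
    inputs (typically millimetres).
    pred, gt: arrays of shape (frames, joints, 3)."""
    return np.linalg.norm(pred - gt, axis=-1).mean()
```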

Related Papers

- Stochastic Human Motion Prediction with Memory of Action Transition and Action Characteristic (2025-07-05)
- Temporal Continual Learning with Prior Compensation for Human Motion Prediction (2025-07-05)
- AMPLIFY: Actionless Motion Priors for Robot Learning from Videos (2025-06-17)
- FocalAD: Local Motion Planning for End-to-End Autonomous Driving (2025-06-13)
- TrajFlow: Multi-modal Motion Prediction via Flow Matching (2025-06-10)
- HUMOF: Human Motion Forecasting in Interactive Social Scenes (2025-06-04)
- Rodrigues Network for Learning Robot Actions (2025-06-03)
- Autoregression-free video prediction using diffusion model for mitigating error propagation (2025-05-28)