Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


MotionBERT: A Unified Perspective on Learning Human Motion Representations

Wentao Zhu, Xiaoxuan Ma, Zhaoyang Liu, Libin Liu, Wayne Wu, Yizhou Wang

2022-10-12 · ICCV 2023

Tasks: 3D Human Pose Estimation · One-Shot 3D Action Recognition · Monocular 3D Human Pose Estimation · Skeleton Based Action Recognition · Pose Estimation · Action Recognition · Classification · 3D Pose Estimation

Paper · PDF · Code (official)

Abstract

We present a unified perspective on tackling various human-centric video tasks by learning human motion representations from large-scale and heterogeneous data resources. Specifically, we propose a pretraining stage in which a motion encoder is trained to recover the underlying 3D motion from noisy partial 2D observations. The motion representations acquired in this way incorporate geometric, kinematic, and physical knowledge about human motion, which can be easily transferred to multiple downstream tasks. We implement the motion encoder with a Dual-stream Spatio-temporal Transformer (DSTformer) neural network. It can capture long-range spatio-temporal relationships among the skeletal joints comprehensively and adaptively, exemplified by the lowest 3D pose estimation error so far when trained from scratch. Furthermore, our proposed framework achieves state-of-the-art performance on all three downstream tasks by simply finetuning the pretrained motion encoder with a simple regression head (1-2 layers), which demonstrates the versatility of the learned motion representations. Code and models are available at https://motionbert.github.io/
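The pretraining objective described above (recover 3D motion from noisy, partial 2D observations) hinges on how the 2D inputs are synthesized from 3D motion. A minimal numpy sketch of that corruption step, assuming a simple orthographic projection, Gaussian detector noise, and random joint masking (all names and parameter values here are illustrative, not the authors' exact pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy motion clip: T frames, J skeletal joints, 3D coordinates.
T, J = 16, 17
motion_3d = rng.standard_normal((T, J, 3))

def corrupt_to_2d(motion_3d, noise_std=0.02, mask_prob=0.15, rng=rng):
    """Produce the noisy, partial 2D input the motion encoder must invert.

    1. Orthographic projection: keep only the x/y axes.
    2. Additive Gaussian noise models 2D keypoint-detector error.
    3. Random joint masking models occlusion / missing detections.
    """
    pose_2d = motion_3d[..., :2].copy()                 # (T, J, 2)
    pose_2d += noise_std * rng.standard_normal(pose_2d.shape)
    mask = rng.random((T, J)) < mask_prob               # True = joint dropped
    pose_2d[mask] = 0.0                                 # zero out masked joints
    return pose_2d, mask

pose_2d, mask = corrupt_to_2d(motion_3d)
# The encoder is then trained so that decode(encode(pose_2d)) ≈ motion_3d,
# forcing the representation to carry geometric and kinematic structure.
```

Because the target is the clean 3D sequence, the encoder cannot solve the task by memorizing inputs; it must learn the depth and temporal regularities of human motion.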

Results

Task | Dataset | Metric | Value | Model
3D Human Pose Estimation | 3DPW | MPJPE | 68.8 | MotionBERT-HybrIK
3D Human Pose Estimation | 3DPW | MPVPE | 79.4 | MotionBERT-HybrIK
3D Human Pose Estimation | 3DPW | PA-MPJPE | 40.6 | MotionBERT-HybrIK
3D Human Pose Estimation | 3DPW | MPJPE | 76.9 | MotionBERT (Finetune)
3D Human Pose Estimation | 3DPW | MPVPE | 88.1 | MotionBERT (Finetune)
3D Human Pose Estimation | 3DPW | PA-MPJPE | 47.2 | MotionBERT (Finetune)
3D Human Pose Estimation | Human3.6M | Average MPJPE (mm) | 37.5 | MotionBERT (Finetune)
3D Human Pose Estimation | Human3.6M | Frames Needed | 243 | MotionBERT (Finetune)
3D Human Pose Estimation | Human3.6M | Average MPJPE (mm) | 39.2 | MotionBERT (Scratch)
3D Human Pose Estimation | Human3.6M | Frames Needed | 243 | MotionBERT (Scratch)
Skeleton Based Action Recognition | NTU RGB+D | Accuracy (CS) | 93.0 | MotionBERT (Finetune)
Skeleton Based Action Recognition | NTU RGB+D | Accuracy (CV) | 97.2 | MotionBERT (Finetune)
Classification | Full-body Parkinson's disease dataset | F1-score (weighted) | 0.47 | MotionBERT
Classification | Full-body Parkinson's disease dataset | F1-score (weighted) | 0.43 | MotionBERT-LITE
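The pose-estimation rows report MPJPE (mean per-joint position error, in mm on Human3.6M) and PA-MPJPE, which first rigidly aligns the prediction to the ground truth with a Procrustes fit so that only residual pose error is measured. A minimal numpy sketch of both metrics, using a common similarity-transform formulation rather than the authors' exact evaluation code:

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean Per-Joint Position Error: average Euclidean distance
    between predicted and ground-truth joints (same units as input)."""
    return np.linalg.norm(pred - gt, axis=-1).mean()

def pa_mpjpe(pred, gt):
    """Procrustes-Aligned MPJPE: fit a similarity transform
    (scale, rotation, translation) from pred to gt, then score."""
    mu_p, mu_g = pred.mean(0), gt.mean(0)
    p, g = pred - mu_p, gt - mu_g
    # Optimal rotation via SVD of the cross-covariance matrix (Kabsch).
    U, s, Vt = np.linalg.svd(p.T @ g)
    if np.linalg.det(U @ Vt) < 0:       # avoid an improper reflection
        Vt[-1] *= -1
        s[-1] *= -1
    R = U @ Vt
    scale = s.sum() / (p ** 2).sum()    # optimal isotropic scale
    aligned = scale * p @ R + mu_g
    return mpjpe(aligned, gt)

# Toy example: a prediction offset from ground truth by 0.1 on every axis.
gt = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
pred = gt + 0.1
```

On this toy pair, MPJPE reflects the raw offset, while PA-MPJPE is (numerically) zero, since the alignment absorbs the pure translation; this is why PA-MPJPE values in the table are always at most the corresponding MPJPE.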
