Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


LiftFormer: 3D Human Pose Estimation using attention models

Adrian Llopart

2020-09-01 | 3D Human Pose Estimation | Pose Estimation | 3D Pose Estimation

Abstract

Estimating the 3D position of human joints has become a widely researched topic in recent years. Special emphasis has gone into defining novel methods that extrapolate 2-dimensional data (keypoints) into 3D, namely predicting the root-relative coordinates of the joints of a human skeleton. Recent research has shown that Transformer encoder blocks aggregate temporal information significantly better than previous approaches. We therefore propose using these models to obtain more accurate 3D predictions by leveraging temporal information through attention mechanisms over ordered sequences of human poses in videos. Our method consistently outperforms the previous best results in the literature on Human3.6M, both with 2D keypoint-predictor inputs (44.8 mm MPJPE, a 0.3 mm / 0.7% improvement) and with ground-truth inputs (31.9 mm MPJPE, a 2 mm / 8.4% improvement). It also achieves state-of-the-art performance on the HumanEva-I dataset with 10.5 mm P-MPJPE (a 22.2% reduction). The number of parameters in our model is easily tunable and smaller (9.5M) than in current methodologies (16.95M and 11.25M), while still delivering better performance. Our 3D lifting model's accuracy thus exceeds that of other end-to-end or SMPL approaches and is comparable to many multi-view methods.
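The core idea in the abstract, attention over an ordered window of 2D poses so that every frame aggregates temporal context, can be sketched with a single-head self-attention step in NumPy. This is a minimal illustration only; the function names, the flat 34-dimensional per-frame embedding (17 joints × 2 coordinates), and the single unprojected head are assumptions, not the paper's actual LiftFormer architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_self_attention(frames):
    """One scaled dot-product self-attention pass over a pose sequence.

    frames: (T, d) array, one flattened 2D pose per video frame
            (e.g. d = 17 joints * 2 coordinates = 34).
    Returns a (T, d) array in which every frame is an attention-weighted
    mix of all frames in the window, i.e. temporal context is aggregated.
    """
    d = frames.shape[-1]
    scores = frames @ frames.T / np.sqrt(d)   # (T, T) frame-to-frame affinities
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ frames

# A receptive field of 9 frames, each holding 17 2D keypoints:
window = np.random.randn(9, 34)
context = temporal_self_attention(window)     # shape (9, 34)
```

In the full model, learned query/key/value projections, multiple heads, and feed-forward sublayers would wrap this step inside stacked Transformer encoder blocks, and a final linear layer would regress the root-relative 3D coordinates of the centre frame's joints.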

Results

Task | Dataset | Metric | Value | Model
Pose Estimation | Human3.6M | Average MPJPE (mm) | 44.8 | LiftFormer (n=243 CPN)
Pose Estimation | Human3.6M | Average MPJPE (mm) | 46.0 | LiftFormer (n=81 CPN)
Pose Estimation | Human3.6M | Average MPJPE (mm) | 48.6 | LiftFormer (n=27 CPN)
3D | Human3.6M | Average MPJPE (mm) | 44.8 | LiftFormer (n=243 CPN)
3D | Human3.6M | Average MPJPE (mm) | 46.0 | LiftFormer (n=81 CPN)
3D | Human3.6M | Average MPJPE (mm) | 48.6 | LiftFormer (n=27 CPN)
3D Pose Estimation | Human3.6M | Average MPJPE (mm) | 44.8 | LiftFormer (n=243 CPN)
3D Pose Estimation | Human3.6M | Average MPJPE (mm) | 46.0 | LiftFormer (n=81 CPN)
3D Pose Estimation | Human3.6M | Average MPJPE (mm) | 48.6 | LiftFormer (n=27 CPN)
1 Image, 2*2 Stitchi | Human3.6M | Average MPJPE (mm) | 44.8 | LiftFormer (n=243 CPN)
1 Image, 2*2 Stitchi | Human3.6M | Average MPJPE (mm) | 46.0 | LiftFormer (n=81 CPN)
1 Image, 2*2 Stitchi | Human3.6M | Average MPJPE (mm) | 48.6 | LiftFormer (n=27 CPN)
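The metric in the table, MPJPE, is the mean Euclidean distance between predicted and ground-truth 3D joints after both poses are expressed relative to the root joint; P-MPJPE (reported for HumanEva-I) additionally rigid-aligns the prediction to the ground truth with a Procrustes fit before measuring. A minimal NumPy sketch of the plain metric (function names are illustrative, not from the paper):

```python
import numpy as np

def root_relative(joints, root=0):
    """Express each pose relative to its root joint (e.g. the pelvis)."""
    return joints - joints[:, root:root + 1, :]

def mpjpe(pred, gt):
    """Mean Per Joint Position Error.

    pred, gt: (N, J, 3) arrays of 3D joint coordinates in the same
    units (Human3.6M results are conventionally reported in mm).
    """
    pred = root_relative(pred)
    gt = root_relative(gt)
    # Per-joint Euclidean distance, then mean over joints and frames.
    return np.linalg.norm(pred - gt, axis=-1).mean()
```

Because both inputs are root-centred first, a constant translation of either pose leaves the score unchanged, which is why the paper's predictions are described as root-relative coordinates.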

Related Papers

$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation (2025-07-17)
AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability (2025-07-17)
SpatialTrackerV2: 3D Point Tracking Made Easy (2025-07-16)
SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation (2025-07-16)
Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation (2025-07-16)