

Diffusion-Based 3D Human Pose Estimation with Multi-Hypothesis Aggregation

Wenkang Shan, Zhenhua Liu, Xinfeng Zhang, Zhao Wang, Kai Han, Shanshe Wang, Siwei Ma, Wen Gao

Date: 2023-03-21
Venue: ICCV 2023
Tasks: 3D Human Pose Estimation, Monocular 3D Human Pose Estimation, Multi-Hypotheses 3D Human Pose Estimation, Pose Estimation, 3D Pose Estimation

Abstract

In this paper, a novel Diffusion-based 3D Pose estimation (D3DP) method with Joint-wise reProjection-based Multi-hypothesis Aggregation (JPMA) is proposed for probabilistic 3D human pose estimation. On the one hand, D3DP generates multiple possible 3D pose hypotheses for a single 2D observation. It gradually diffuses the ground-truth 3D poses to a random distribution and learns a denoiser, conditioned on 2D keypoints, to recover the uncontaminated 3D poses. D3DP is compatible with existing 3D pose estimators and lets users balance efficiency and accuracy during inference through two customizable parameters. On the other hand, JPMA is proposed to assemble the multiple hypotheses generated by D3DP into a single 3D pose for practical use. It reprojects the 3D pose hypotheses onto the 2D camera plane, selects the best hypothesis joint by joint based on the reprojection errors, and combines the selected joints into the final pose. JPMA thus aggregates at the joint level and makes use of 2D prior information, both of which have been overlooked by previous approaches. Extensive experiments on the Human3.6M and MPI-INF-3DHP datasets show that our method outperforms the state-of-the-art deterministic and probabilistic approaches by 1.5% and 8.9%, respectively. Code is available at https://github.com/paTRICK-swk/D3DP.
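
The joint-wise aggregation described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes an ideal pinhole camera without distortion, 3D joints given in absolute camera coordinates, and hypothetical names (`project_pinhole`, `joint_wise_aggregate`, `focal`, `center`) chosen only for this example.

```python
import numpy as np


def project_pinhole(points_3d, focal, center):
    """Project camera-space 3D points (..., 3) to pixel coordinates (..., 2).

    Assumption: ideal pinhole camera, no lens distortion.
    """
    xy = points_3d[..., :2] / points_3d[..., 2:3]  # perspective divide by depth
    return xy * focal + center


def joint_wise_aggregate(hypotheses_3d, keypoints_2d, focal, center):
    """Pick, for every joint, the hypothesis with the lowest reprojection error.

    hypotheses_3d: (H, J, 3) array of H pose hypotheses with J joints each.
    keypoints_2d:  (J, 2) array with the observed 2D keypoints.
    Returns a single (J, 3) pose assembled from the selected joints.
    """
    reprojected = project_pinhole(hypotheses_3d, focal, center)          # (H, J, 2)
    errors = np.linalg.norm(reprojected - keypoints_2d[None], axis=-1)   # (H, J)
    best = errors.argmin(axis=0)                                         # (J,)
    joint_ids = np.arange(hypotheses_3d.shape[1])
    return hypotheses_3d[best, joint_ids]                                # (J, 3)
```

Because the selection runs independently per joint, the final pose may combine joints from different hypotheses, which is what distinguishes joint-level aggregation from choosing or averaging whole poses.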

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| 3D Human Pose Estimation | MPI-INF-3DHP | AUC | 78.2 | D3DP (N=243, H=20, K=20, J-Agg) |
| 3D Human Pose Estimation | MPI-INF-3DHP | MPJPE | 29.7 | D3DP (N=243, H=20, K=20, J-Agg) |
| 3D Human Pose Estimation | MPI-INF-3DHP | PCK | 97.7 | D3DP (N=243, H=20, K=20, J-Agg) |
| 3D Human Pose Estimation | Human3.6M | Average MPJPE (mm) | 39.5 | D3DP |
| 3D Human Pose Estimation | Human3.6M | Frames Needed | 243 | D3DP |
| 3D Human Pose Estimation | Human3.6M | Average MPJPE (mm) | 35.4 | D3DP |

The same six rows are also listed under the task labels "Pose Estimation", "3D", and "1 Image, 2*2 Stitchi".
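
For readers unfamiliar with the metrics in the results, here is a minimal sketch of how MPJPE and PCK are commonly computed. It assumes predictions and ground truth are already expressed in the same (for example root-relative) coordinate frame in millimetres. The 150 mm default threshold is an assumption for illustration, not something stated on this page, and the function names are hypothetical.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error: average Euclidean distance over all joints.

    pred, gt: arrays of shape (..., J, 3) in the same units (e.g. mm).
    """
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def pck(pred, gt, threshold=150.0):
    """Percentage of joints whose error is within `threshold` (assumed default)."""
    dists = np.linalg.norm(pred - gt, axis=-1)
    return float((dists <= threshold).mean() * 100.0)
```

Lower is better for MPJPE, while higher is better for PCK.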
