Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Back to Optimization: Diffusion-based Zero-Shot 3D Human Pose Estimation

Zhongyu Jiang, Zhuoran Zhou, Lei LI, Wenhao Chai, Cheng-Yen Yang, Jenq-Neng Hwang

2023-07-07 | 3D Human Pose Estimation | Image to 3D | Pose Estimation

Abstract

Learning-based methods have dominated 3D human pose estimation (HPE) tasks, with significantly better performance on most benchmarks than traditional optimization-based methods. Nonetheless, 3D HPE in the wild is still the biggest challenge for learning-based models, whether with 2D-3D lifting, image-to-3D, or diffusion-based methods, since the trained networks implicitly learn camera intrinsic parameters and domain-based 3D human pose distributions and estimate poses by statistical average. On the other hand, optimization-based methods estimate results case by case, which can predict more diverse and sophisticated human poses in the wild. By combining the advantages of optimization-based and learning-based methods, we propose the Zero-shot Diffusion-based Optimization (ZeDO) pipeline for 3D HPE to solve the problem of cross-domain and in-the-wild 3D HPE. Our multi-hypothesis ZeDO achieves state-of-the-art (SOTA) performance on Human3.6M, with a minMPJPE of 51.4 mm, without training on any 2D-3D or image-3D pairs. Moreover, our single-hypothesis ZeDO achieves SOTA performance on the 3DPW dataset, with a PA-MPJPE of 40.3 mm in cross-dataset evaluation, which even outperforms learning-based methods trained on 3DPW.
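The abstract reports MPJPE-style metrics without defining them. The sketch below shows how MPJPE, PA-MPJPE, and minMPJPE are commonly computed, assuming poses are NumPy arrays of shape (J, 3) in millimetres. The function names are illustrative and are not taken from the authors' code, and the exact evaluation protocol used in the paper (joint set, root alignment) may differ.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error: mean Euclidean distance over joints."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

def procrustes_align(pred, gt):
    """Align pred (J, 3) to gt (J, 3) with a similarity transform
    (rotation, uniform scale, translation) in the least-squares sense."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    # SVD of the 3x3 cross-covariance gives the optimal rotation.
    u, s, vt = np.linalg.svd(p.T @ g)
    # Flip the last axis if needed so the result is a proper rotation.
    d = np.sign(np.linalg.det(u @ vt))
    flip = np.array([1.0, 1.0, d])
    rot = u @ np.diag(flip) @ vt
    scale = (s * flip).sum() / (p ** 2).sum()
    return scale * p @ rot + mu_g

def pa_mpjpe(pred, gt):
    """Procrustes-aligned MPJPE: error after removing global pose and scale."""
    return mpjpe(procrustes_align(pred, gt), gt)

def min_mpjpe(hypotheses, gt):
    """Multi-hypothesis metric: error of the best of several candidate poses."""
    return min(mpjpe(h, gt) for h in hypotheses)
```

PA-MPJPE ignores global rotation, translation, and scale, which is why it is typically lower than MPJPE for the same prediction. minMPJPE rewards a method if any one of its hypotheses is close to the ground truth.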

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| 3D Human Pose Estimation | MPI-INF-3DHP | AUC | 65.6 | ZeDO (S=50) |
| 3D Human Pose Estimation | MPI-INF-3DHP | MPJPE (mm) | 55.2 | ZeDO (S=50) |
| 3D Human Pose Estimation | MPI-INF-3DHP | PCK | 93 | ZeDO (S=50) |
| 3D Human Pose Estimation | 3DPW | MPJPE (mm) | 69.7 | ZeDO (S=1,J=17) |
| 3D Human Pose Estimation | 3DPW | PA-MPJPE (mm) | 40.3 | ZeDO (S=1,J=17) |
| 3D Human Pose Estimation | 3DPW | MPJPE (mm) | 80.9 | ZeDO (Cross Dataset) |
| 3D Human Pose Estimation | 3DPW | PA-MPJPE (mm) | 42.6 | ZeDO (Cross Dataset) |
| Pose Estimation | MPI-INF-3DHP | AUC | 65.6 | ZeDO (S=50) |
| Pose Estimation | MPI-INF-3DHP | MPJPE (mm) | 55.2 | ZeDO (S=50) |
| Pose Estimation | MPI-INF-3DHP | PCK | 93 | ZeDO (S=50) |
| Pose Estimation | 3DPW | MPJPE (mm) | 69.7 | ZeDO (S=1,J=17) |
| Pose Estimation | 3DPW | PA-MPJPE (mm) | 40.3 | ZeDO (S=1,J=17) |
| Pose Estimation | 3DPW | MPJPE (mm) | 80.9 | ZeDO (Cross Dataset) |
| Pose Estimation | 3DPW | PA-MPJPE (mm) | 42.6 | ZeDO (Cross Dataset) |
| 3D | MPI-INF-3DHP | AUC | 65.6 | ZeDO (S=50) |
| 3D | MPI-INF-3DHP | MPJPE (mm) | 55.2 | ZeDO (S=50) |
| 3D | MPI-INF-3DHP | PCK | 93 | ZeDO (S=50) |
| 3D | 3DPW | MPJPE (mm) | 69.7 | ZeDO (S=1,J=17) |
| 3D | 3DPW | PA-MPJPE (mm) | 40.3 | ZeDO (S=1,J=17) |
| 3D | 3DPW | MPJPE (mm) | 80.9 | ZeDO (Cross Dataset) |
| 3D | 3DPW | PA-MPJPE (mm) | 42.6 | ZeDO (Cross Dataset) |
| 1 Image, 2*2 Stitchi | MPI-INF-3DHP | AUC | 65.6 | ZeDO (S=50) |
| 1 Image, 2*2 Stitchi | MPI-INF-3DHP | MPJPE (mm) | 55.2 | ZeDO (S=50) |
| 1 Image, 2*2 Stitchi | MPI-INF-3DHP | PCK | 93 | ZeDO (S=50) |
| 1 Image, 2*2 Stitchi | 3DPW | MPJPE (mm) | 69.7 | ZeDO (S=1,J=17) |
| 1 Image, 2*2 Stitchi | 3DPW | PA-MPJPE (mm) | 40.3 | ZeDO (S=1,J=17) |
| 1 Image, 2*2 Stitchi | 3DPW | MPJPE (mm) | 80.9 | ZeDO (Cross Dataset) |
| 1 Image, 2*2 Stitchi | 3DPW | PA-MPJPE (mm) | 42.6 | ZeDO (Cross Dataset) |
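The MPI-INF-3DHP rows above report PCK and AUC alongside MPJPE. As a rough illustration of how these two scores relate, the sketch below computes PCK as the percentage of joints within a distance threshold, and AUC as the mean PCK over a range of thresholds. The 150 mm threshold and the 0 to 150 mm range in 5 mm steps are assumptions based on common practice for this benchmark; this page does not state them, and the function names are hypothetical.

```python
import numpy as np

def pck(pred, gt, threshold_mm=150.0):
    """Percentage of joints whose error is within threshold_mm.
    The 150 mm default is an assumed convention, not taken from this page."""
    dist = np.linalg.norm(pred - gt, axis=-1)
    return float((dist <= threshold_mm).mean() * 100.0)

def auc(pred, gt, max_threshold_mm=150.0, num_thresholds=31):
    """Mean PCK over evenly spaced thresholds from 0 to max_threshold_mm.
    31 thresholds over 0..150 mm gives 5 mm steps (assumed convention)."""
    thresholds = np.linspace(0.0, max_threshold_mm, num_thresholds)
    return float(np.mean([pck(pred, gt, t) for t in thresholds]))
```

Both functions accept arrays of shape (..., 3), so they can be applied to a single pose or to a whole batch of poses at once.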

Related Papers

$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation (2025-07-17)
AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability (2025-07-17)
PhysX: Physical-Grounded 3D Asset Generation (2025-07-16)
SpatialTrackerV2: 3D Point Tracking Made Easy (2025-07-16)
SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation (2025-07-16)