Back to Optimization: Diffusion-based Zero-Shot 3D Human Pose Estimation

Zhongyu Jiang, Zhuoran Zhou, Lei LI, Wenhao Chai, Cheng-Yen Yang, Jenq-Neng Hwang

2023-07-07

3D Human Pose Estimation Image to 3D Pose Estimation

Abstract

Learning-based methods have dominated 3D human pose estimation (HPE), performing significantly better than traditional optimization-based methods on most benchmarks. Nonetheless, 3D HPE in the wild remains the biggest challenge for learning-based models, whether they use 2D-3D lifting, image-to-3D, or diffusion-based methods, because the trained networks implicitly learn camera intrinsic parameters and domain-specific 3D human pose distributions and estimate poses by statistical average. Optimization-based methods, on the other hand, estimate results case by case and can therefore predict more diverse and sophisticated human poses in the wild. By combining the advantages of optimization-based and learning-based methods, we propose the Zero-shot Diffusion-based Optimization (ZeDO) pipeline for 3D HPE to address cross-domain and in-the-wild 3D HPE. Our multi-hypothesis ZeDO achieves state-of-the-art (SOTA) performance on Human3.6M, with a minMPJPE of 51.4 mm, without training on any 2D-3D or image-3D pairs. Moreover, our single-hypothesis ZeDO achieves SOTA performance on the 3DPW dataset, with a PA-MPJPE of 40.3 mm in cross-dataset evaluation, which even outperforms learning-based methods trained on 3DPW.
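The abstract reports three error metrics: MPJPE, PA-MPJPE, and minMPJPE. As a rough illustration of how these are commonly defined, the sketch below uses NumPy. The function names are hypothetical and are not taken from the authors' code. It assumes each pose is a (J, 3) array in millimetres and that any root-joint centering required by a benchmark has already been applied. Benchmark protocols differ in joint sets and alignment details, so the numbers it produces are not directly comparable to the reported ones.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error: mean Euclidean distance over joints."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def procrustes_align(pred, gt):
    """Align pred (J, 3) to gt (J, 3) with the best similarity transform
    (scale, rotation, translation) in the least-squares sense."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    # Optimal rotation from the SVD of the 3x3 cross-covariance matrix.
    u, s, vt = np.linalg.svd(p.T @ g)
    # Flip the last axis if needed so the result is a proper rotation.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    flip = np.diag([1.0, 1.0, d])
    rot = vt.T @ flip @ u.T
    scale = (s * np.diag(flip)).sum() / (p ** 2).sum()
    return scale * p @ rot.T + mu_g


def pa_mpjpe(pred, gt):
    """Procrustes-aligned MPJPE: error after similarity alignment."""
    return mpjpe(procrustes_align(pred, gt), gt)


def min_mpjpe(hypotheses, gt):
    """Multi-hypothesis metric: MPJPE of the best of several candidates."""
    return min(mpjpe(h, gt) for h in hypotheses)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gt = rng.normal(size=(17, 3)) * 100.0    # synthetic 17-joint pose, in mm
    shifted = gt + np.array([3.0, 4.0, 0.0])  # every joint is 5 mm off
    print(mpjpe(shifted, gt))                 # 5.0
    print(pa_mpjpe(shifted, gt))              # close to 0: alignment removes the shift
    print(min_mpjpe([shifted, gt], gt))       # 0.0: best hypothesis is exact
```

PA-MPJPE discounts global scale, rotation, and translation, which is why it is lower than MPJPE for the same model in the results. minMPJPE picks the best of S hypotheses using the ground truth, so it is only meaningful when comparing multi-hypothesis settings.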

Results

Task	Dataset	Metric	Value	Model
3D Human Pose Estimation	MPI-INF-3DHP	AUC	65.6	ZeDO (S=50)
3D Human Pose Estimation	MPI-INF-3DHP	MPJPE	55.2	ZeDO (S=50)
3D Human Pose Estimation	MPI-INF-3DHP	PCK	93	ZeDO (S=50)
3D Human Pose Estimation	3DPW	MPJPE	69.7	ZeDO (S=1,J=17)
3D Human Pose Estimation	3DPW	PA-MPJPE	40.3	ZeDO (S=1,J=17)
3D Human Pose Estimation	3DPW	MPJPE	80.9	ZeDO (Cross Dataset)
3D Human Pose Estimation	3DPW	PA-MPJPE	42.6	ZeDO (Cross Dataset)
