A generic diffusion-based approach for 3D human pose prediction in the wild

Saeed Saadatnejad, Ali Rasekh, Mohammadreza Mofayezi, Yasamin Medghalchi, Sara Rajabzadeh, Taylor Mordan, Alexandre Alahi

2022-10-11

Denoising, Human Pose Forecasting, Prediction, Pose Prediction, Missing Elements


Abstract

Predicting 3D human poses in real-world scenarios, also known as human pose forecasting, is inevitably subject to noisy inputs arising from inaccurate 3D pose estimations and occlusions. To address these challenges, we propose a diffusion-based approach that can make predictions from noisy observations. We frame the prediction task as a denoising problem, where both observation and prediction are considered as a single sequence containing missing elements (whether in the observation or prediction horizon). All missing elements are treated as noise and denoised with our conditional diffusion model. To better handle long-term forecasting horizons, we present a temporal cascaded diffusion model. We demonstrate the benefits of our approach on four publicly available datasets (Human3.6M, HumanEva-I, AMASS, and 3DPW), outperforming the state of the art. Additionally, we show that our framework is generic enough to improve any 3D pose prediction model, serving as a pre-processing step to repair its inputs and a post-processing step to refine its outputs. The code is available online: https://github.com/vita-epfl/DePOSit.
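
The framing in the abstract, where observation and prediction form one sequence whose missing elements are treated as noise, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `build_masked_sequence`, the array layout `(frames, joints, 3)`, and the use of standard Gaussian noise for missing entries are all assumptions made for illustration. The conditional diffusion model that would denoise the masked entries is not shown.

```python
import numpy as np


def build_masked_sequence(observation, horizon, observed_mask=None, rng=None):
    """Join observed frames and an empty prediction horizon into one sequence.

    observation:   array of shape (T_obs, J, 3) with 3D joint positions.
    horizon:       number of future frames to predict.
    observed_mask: optional bool array (T_obs, J); False marks joints that are
                   missing in the observation (for example, occluded).
    Returns (sequence, mask): sequence has shape (T_obs + horizon, J, 3) and
    mask has shape (T_obs + horizon, J), True where the value is known.
    """
    rng = np.random.default_rng() if rng is None else rng
    t_obs, n_joints, _ = observation.shape
    if observed_mask is None:
        observed_mask = np.ones((t_obs, n_joints), dtype=bool)

    # Every future frame is missing by definition.
    future_mask = np.zeros((horizon, n_joints), dtype=bool)
    mask = np.concatenate([observed_mask, future_mask], axis=0)

    # Pad the observation with placeholder frames for the horizon.
    padded = np.concatenate(
        [observation, np.zeros((horizon, n_joints, 3))], axis=0
    )

    # Hypothetical choice: fill all missing entries with Gaussian noise,
    # which a conditional denoiser would then replace with pose estimates.
    noise = rng.standard_normal(padded.shape)
    sequence = np.where(mask[..., None], padded, noise)
    return sequence, mask
```

Under this layout, an occluded joint in the observation and an unobserved future frame are handled identically: both are entries where the mask is False, so one denoising model can in principle repair inputs and forecast outputs.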

Results

Task	Dataset	Metric	Value	Model
Pose Estimation	AMASS	FDE@1000ms (mm)	66.7	TCD
Pose Estimation	AMASS	FDE@560ms (mm)	49.8	TCD
Pose Estimation	AMASS	FDE@720ms (mm)	54.5	TCD
Pose Estimation	AMASS	FDE@880ms (mm)	60.1	TCD
Pose Estimation	Human3.6M	ADE	356	TCD
Pose Estimation	Human3.6M	APD	19466	TCD
Pose Estimation	Human3.6M	FDE	396	TCD
Pose Estimation	Human3.6M	MMADE	463	TCD
Pose Estimation	Human3.6M	MMFDE	445	TCD
Pose Estimation	HumanEva-I	ADE@2000ms	199	TCD
Pose Estimation	HumanEva-I	APD@2000ms	6764	TCD
Pose Estimation	HumanEva-I	FDE@2000ms	215	TCD
Pose Estimation	3DPW	FDE@1000ms (mm)	73.4	TCD
Pose Estimation	3DPW	FDE@560ms (mm)	55.4	TCD
Pose Estimation	3DPW	FDE@720ms (mm)	61.6	TCD
Pose Estimation	3DPW	FDE@880ms (mm)	67.9	TCD
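
The ADE and FDE entries in the results table are commonly read as average and final displacement errors. The sketch below shows one common way to compute them for a single predicted sequence. It is an assumption, not the paper's evaluation code: the function name `displacement_errors` is hypothetical, and averaging the Euclidean error over joints is only one convention (some protocols take the norm over the whole flattened pose, or report the best of several sampled predictions).

```python
import numpy as np


def displacement_errors(pred, target):
    """Return (ADE, FDE) for one predicted pose sequence.

    pred, target: arrays of shape (T, J, 3) in the same units (e.g. mm).
    ADE averages the per-frame error over all T frames;
    FDE is the per-frame error at the final frame only.
    """
    # Euclidean distance per joint, then mean over joints: shape (T,).
    per_frame = np.linalg.norm(pred - target, axis=-1).mean(axis=-1)
    return float(per_frame.mean()), float(per_frame[-1])
```

A metric written as FDE@1000ms would then correspond to calling this on a prediction truncated at the frame 1000 ms after the last observation, assuming the same convention.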
