3D Human Reconstruction in the Wild with Synthetic Data Using Generative Models

Yongtao Ge, Wenjia Wang, Yongfan Chen, Hao Chen, Chunhua Shen

2024-03-17

3D Human Pose Estimation, 3D human pose and shape estimation, 3D Human Reconstruction

Abstract

In this work, we show that synthetic data created by generative models is complementary to computer graphics (CG) rendered data and that combining the two yields strong generalization on diverse real-world scenes for 3D human pose and shape estimation (HPS). Specifically, we propose an approach based on recent diffusion models, termed HumanWild, which generates human images together with corresponding 3D mesh annotations. We first collect a large-scale human-centric dataset with comprehensive annotations, e.g., text captions and surface normal images. Then, we train a customized ControlNet model on this dataset to generate diverse human images and initial ground-truth labels. The key to this step is that surface normal images are easy to obtain in large numbers by rendering the mesh of a 3D human parametric model, e.g., SMPL-X, onto the image plane. Because the initial labels inevitably contain noise, we then apply an off-the-shelf foundation segmentation model, i.e., SAM, to filter out negative data samples. Our data generation pipeline is flexible and can be customized for different real-world tasks, e.g., ego-centric scenes and perspective-distortion scenes. The generated dataset comprises 0.79M images with corresponding 3D annotations, covering varied viewpoints, scenes, and human identities. We train various HPS regressors on the generated data and evaluate them on a wide range of benchmarks (3DPW, RICH, EgoBody, AGORA, SSP-3D) to verify its effectiveness. By exclusively employing generative models, we generate large-scale in-the-wild human images and high-quality annotations, eliminating the need for real-world data collection.
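The abstract does not state the exact criterion used in the SAM-based filtering step. As a minimal sketch, one plausible reading is that a generated image is kept only when the person mask predicted by the segmentation model agrees with the silhouette rendered from the SMPL-X mesh. The function names `mask_iou` and `keep_sample` and the threshold of 0.8 are assumptions made for illustration; they are not taken from the paper, and the masks are assumed to be precomputed boolean arrays rather than obtained through any particular SAM API.

```python
import numpy as np


def mask_iou(mask_a, mask_b):
    """Intersection-over-union of two binary masks with the same shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        # Both masks are empty: treat as no agreement rather than dividing by zero.
        return 0.0
    return float(np.logical_and(a, b).sum() / union)


def keep_sample(rendered_mask, segmented_mask, iou_threshold=0.8):
    """Keep a generated sample only if the two masks overlap sufficiently.

    rendered_mask:  silhouette rendered from the 3D mesh (assumed input).
    segmented_mask: person mask from a segmentation model (assumed input).
    iou_threshold:  hypothetical cut-off, not a value from the paper.
    """
    return mask_iou(rendered_mask, segmented_mask) >= iou_threshold
```

With a filter of this kind, a sample whose generated person drifts away from the conditioning mesh would yield a low IoU and be discarded, while well-aligned samples would pass.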

Results

Task	Dataset	Metric	Value (mm)	Model
3D Human Pose Estimation	3DPW	MPJPE	65.2	CLIFF (3DPW+HumanWild+BEDLAM+AGORA)
3D Human Pose Estimation	3DPW	MPVPE	76.8	CLIFF (3DPW+HumanWild+BEDLAM+AGORA)
3D Human Pose Estimation	3DPW	PA-MPJPE	41.9	CLIFF (3DPW+HumanWild+BEDLAM+AGORA)
3D Human Pose Estimation	3DPW	MPJPE	87.3	CLIFF
3D Human Pose Estimation	3DPW	MPVPE	102.1	CLIFF
3D Human Pose Estimation	3DPW	PA-MPJPE	52.7	CLIFF
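
The three metrics in the table are standard position errors, where lower is better. The sketch below shows how they are commonly computed, assuming predictions and ground truth are NumPy arrays of shape (N, 3) in the same units: MPJPE averages the Euclidean distance over joints, MPVPE does the same over mesh vertices, and PA-MPJPE first applies a Procrustes (similarity) alignment of the prediction to the ground truth. The function names are illustrative, and the root-joint alignment that benchmarks usually apply before MPJPE and MPVPE is omitted for brevity.

```python
import numpy as np


def mean_per_point_error(pred, gt):
    """Mean Euclidean distance between corresponding 3D points, shape (N, 3).

    Applied to joints this gives MPJPE; applied to mesh vertices, MPVPE.
    """
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def procrustes_align(pred, gt):
    """Align pred to gt with the best scale, rotation, and translation."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    # SVD of the 3x3 cross-covariance gives the optimal rotation.
    u, s, vt = np.linalg.svd(p.T @ g)
    d = np.ones(3)
    if np.linalg.det(u @ vt) < 0:
        # Flip the last axis so the result is a rotation, not a reflection.
        d[-1] = -1.0
    rot = u @ np.diag(d) @ vt
    scale = (s * d).sum() / (p ** 2).sum()
    return scale * p @ rot + mu_g


def pa_mpjpe(pred, gt):
    """MPJPE after Procrustes alignment of the prediction."""
    return mean_per_point_error(procrustes_align(pred, gt), gt)
```

Because PA-MPJPE removes global scale, rotation, and translation, it isolates errors in the articulated pose, which is why it is lower than MPJPE for the same model in the table.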
