Monocular, One-stage, Regression of Multiple 3D People

Yu Sun, Qian Bao, Wu Liu, Yili Fu, Michael J. Black, Tao Mei

2020-08-27 | ICCV 2021

Tasks: 3D Human Pose Estimation, regression, Multi-Person Pose Estimation, 3D Depth Estimation, 3D Multi-Person Pose Estimation, 3D Multi-Person Mesh Recovery

Abstract

This paper focuses on the regression of multiple 3D people from a single RGB image. Existing approaches predominantly follow a multi-stage pipeline that first detects people in bounding boxes and then independently regresses their 3D body meshes. In contrast, we propose to Regress all meshes in a One-stage fashion for Multiple 3D People (termed ROMP). The approach is conceptually simple, bounding box-free, and able to learn a per-pixel representation in an end-to-end manner. Our method simultaneously predicts a Body Center heatmap and a Mesh Parameter map, which can jointly describe the 3D body mesh on the pixel level. Through a body-center-guided sampling process, the body mesh parameters of all people in the image are easily extracted from the Mesh Parameter map. Equipped with such a fine-grained representation, our one-stage framework is free of the complex multi-stage process and more robust to occlusion. Compared with state-of-the-art methods, ROMP achieves superior performance on the challenging multi-person benchmarks, including 3DPW and CMU Panoptic. Experiments on crowded/occluded datasets demonstrate the robustness under various types of occlusion. The released code is the first real-time implementation of monocular multi-person 3D mesh regression.
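The body-center-guided sampling described in the abstract can be illustrated with a minimal NumPy sketch. It is not the released ROMP code: the function name `sample_mesh_params`, the 3x3 local-maximum test, the `threshold` and `max_people` values, and the array layouts are all assumptions made for illustration. The sketch only shows the general idea of finding peaks in a Body Center heatmap and reading the parameter vector stored at each peak of a Mesh Parameter map.

```python
import numpy as np

def sample_mesh_params(center_heatmap, param_map, threshold=0.25, max_people=64):
    """Read per-person parameter vectors at body-center peaks.

    center_heatmap: (H, W) float array of center confidences (assumed layout).
    param_map:      (C, H, W) float array, one C-dim parameter vector per pixel.
    Returns (centers, params): (N, 2) integer (row, col) peaks and (N, C) vectors.
    """
    H, W = center_heatmap.shape
    # Pad with -inf so border pixels also have a full 3x3 neighbourhood.
    padded = np.pad(center_heatmap, 1, mode="constant", constant_values=-np.inf)
    # Nine shifted views of the heatmap; their maximum is a 3x3 max filter.
    neighbours = np.stack(
        [padded[dy:dy + H, dx:dx + W] for dy in range(3) for dx in range(3)]
    )
    # A pixel is a peak if it equals its neighbourhood maximum and is confident enough.
    peaks = (center_heatmap >= neighbours.max(axis=0)) & (center_heatmap > threshold)
    ys, xs = np.nonzero(peaks)
    # Keep the most confident peaks first, capped at max_people.
    order = np.argsort(-center_heatmap[ys, xs])[:max_people]
    ys, xs = ys[order], xs[order]
    params = param_map[:, ys, xs].T       # (N, C): one vector per detected center
    centers = np.stack([ys, xs], axis=1)  # (N, 2): pixel coordinates of each center
    return centers, params

# Toy usage: two synthetic "people" on an 8x8 grid with 5 parameters per pixel.
heatmap = np.zeros((8, 8))
heatmap[2, 3], heatmap[6, 1] = 0.9, 0.6
param_map = np.random.default_rng(0).normal(size=(5, 8, 8))
centers, params = sample_mesh_params(heatmap, param_map)
print(centers.tolist(), params.shape)
```

Because every person is read from the same pair of maps in one pass, no bounding-box detection or per-person cropping is needed, which is the property the abstract refers to as bounding box-free.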

Results

Task	Dataset	Metric	Value	Model
Depth Estimation	Relative Human	PCDR	54.84	ROMP
Depth Estimation	Relative Human	PCDR-Adult	55.34	ROMP
Depth Estimation	Relative Human	PCDR-Baby	30.08	ROMP
Depth Estimation	Relative Human	PCDR-Kid	48.41	ROMP
Depth Estimation	Relative Human	PCDR-Teen	51.12	ROMP
Depth Estimation	Relative Human	mPCDK	0.866	ROMP
3D Human Pose Estimation	EMDB	Average MPJAE (deg)	26.5975	ROMP
3D Human Pose Estimation	EMDB	Average MPJAE-PA (deg)	23.9901	ROMP
3D Human Pose Estimation	EMDB	Average MPJPE (mm)	112.652	ROMP
3D Human Pose Estimation	EMDB	Average MPJPE-PA (mm)	75.1869	ROMP
3D Human Pose Estimation	EMDB	Average MVE (mm)	134.863	ROMP
3D Human Pose Estimation	EMDB	Average MVE-PA (mm)	90.648	ROMP
3D Human Pose Estimation	EMDB	Jitter (10m/s^3)	71.2556	ROMP
3D Human Pose Estimation	Panoptic	Average MPJPE (mm)	127.6	ROMP (ResNet-50)
3D Human Pose Estimation	3D Poses in the Wild Challenge	MPJPE	81.76	ROMP
3D Human Pose Estimation	Relative Human	PCDR	68.27	ROMP
Multi-Person Pose Estimation	CrowdPose	mAP @0.5:0.95	58.6	ROMP+CAR
Multi-Person Pose Estimation	CrowdPose	mAP @0.5:0.95	55.6	ROMP
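
For readers unfamiliar with the joint-error metrics in the table, the sketch below shows how MPJPE and its Procrustes-aligned variant (listed above as MPJPE-PA) are commonly computed. It is a generic illustration, not the evaluation code of any benchmark listed here: the function names are hypothetical, and details such as the joint set, root alignment, and units differ between datasets and are not specified on this page.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error: average Euclidean distance over joints.

    pred, gt: (J, 3) arrays of 3D joints in the same units (e.g. mm).
    """
    return float(np.linalg.norm(pred - gt, axis=1).mean())

def procrustes_align(pred, gt):
    """Best similarity transform (scale, rotation, translation) of pred onto gt."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    # Optimal rotation from the SVD of the 3x3 cross-covariance matrix.
    U, S, Vt = np.linalg.svd(p.T @ g)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = np.sign(np.linalg.det(U @ Vt))
    D = np.array([1.0, 1.0, d])
    R = U @ np.diag(D) @ Vt
    # Least-squares optimal isotropic scale for the chosen rotation.
    scale = (S * D).sum() / (p ** 2).sum()
    return scale * p @ R + mu_g

def pa_mpjpe(pred, gt):
    """MPJPE after rigidly aligning the prediction to the ground truth."""
    return mpjpe(procrustes_align(pred, gt), gt)

# Toy usage: a prediction that is a scaled, rotated, shifted copy of the ground truth
# has a large MPJPE but a Procrustes-aligned error of (numerically) zero.
gt = np.array([[0., 0., 0.], [1., 0., 0.], [0., 2., 0.], [0., 0., 3.], [1., 1., 1.]])
c, s = np.cos(0.7), np.sin(0.7)
Rz = np.array([[c, -s, 0.], [s, c, 0.], [0., 0., 1.]])
pred = 2.0 * gt @ Rz + np.array([10., -5., 2.])
print(mpjpe(pred, gt), pa_mpjpe(pred, gt))
```

The aligned variant removes global scale, rotation, and translation, so it measures the quality of the articulated pose alone, which is why its values in the table are lower than the unaligned ones.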
