Jianan Zhen, Qi Fang, Jiaming Sun, Wentao Liu, Wei Jiang, Hujun Bao, Xiaowei Zhou
Recovering multi-person 3D poses with absolute scales from a single RGB image is a challenging problem due to the inherent depth and scale ambiguity from a single view. Addressing this ambiguity requires aggregating various cues over the entire image, such as body sizes, scene layouts, and inter-person relationships. However, most previous methods adopt a top-down scheme that first performs 2D pose detection and then regresses the 3D pose and scale for each detected person individually, ignoring global contextual cues. In this paper, we propose a novel system that first regresses a set of 2.5D representations of body parts and then reconstructs the 3D absolute poses from these 2.5D representations with a depth-aware part association algorithm. Such a single-shot bottom-up scheme allows the system to better learn and reason about inter-person depth relationships, improving both 3D and 2D pose estimation. Experiments demonstrate that the proposed approach achieves state-of-the-art performance on the CMU Panoptic and MuPoTS-3D datasets and is applicable to in-the-wild videos.
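
As a rough illustration of the reconstruction stage, the sketch below lifts 2.5D joints to absolute 3D camera coordinates with a pinhole camera model. It assumes that a 2.5D joint is given as pixel coordinates plus a depth relative to the person's root joint, and that the absolute root depth and the camera intrinsics are known. The function name `lift_to_3d` and all of its parameters are hypothetical, the depth-aware part association step is not shown, and this is not the authors' implementation.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def lift_to_3d(
    pixels: Sequence[Tuple[float, float]],
    root_depth: float,
    relative_depths: Sequence[float],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point3D]:
    """Back-project 2.5D joints (pixel position + root-relative depth) to 3D.

    Assumption: a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy); depths are measured along the optical axis.
    """
    if len(pixels) != len(relative_depths):
        raise ValueError("pixels and relative_depths must have the same length")

    joints: List[Point3D] = []
    for (u, v), dz in zip(pixels, relative_depths):
        z = root_depth + dz        # absolute depth of this joint
        x = (u - cx) * z / fx      # invert the projection u = fx * x / z + cx
        y = (v - cy) * z / fy      # invert the projection v = fy * y / z + cy
        joints.append((x, y, z))
    return joints
```

Under these assumptions, a joint at the principal point maps to a point on the optical axis, and the same pixel offset corresponds to a larger metric offset as depth grows, which is the scale ambiguity the abstract refers to.
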
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| 3D Multi-Person Pose Estimation (root-relative) | MuPoTS-3D | 3DPCK | 73.5 | SMAP |
| 3D Human Pose Estimation | Panoptic | Average MPJPE (mm) | 61.8 | SMAP |
| 3D Human Pose Estimation | MuPoTS-3D | 3DPCK (absolute) | 35.4 | SMAP |
| 3D Human Pose Estimation | MuPoTS-3D | 3DPCK (root-relative) | 73.5 | SMAP |
| 3D Multi-Person Pose Estimation (absolute) | MuPoTS-3D | 3DPCK | 35.4 | SMAP |
| Pose Estimation | Panoptic | Average MPJPE (mm) | 61.8 | SMAP |
| Pose Estimation | MuPoTS-3D | 3DPCK (absolute) | 35.4 | SMAP |
| Pose Estimation | MuPoTS-3D | 3DPCK (root-relative) | 73.5 | SMAP |
| 3D | Panoptic | Average MPJPE (mm) | 61.8 | SMAP |
| 3D | MuPoTS-3D | 3DPCK (absolute) | 35.4 | SMAP |
| 3D | MuPoTS-3D | 3DPCK (root-relative) | 73.5 | SMAP |
| 3D Multi-Person Pose Estimation | Panoptic | Average MPJPE (mm) | 61.8 | SMAP |
| 3D Multi-Person Pose Estimation | MuPoTS-3D | 3DPCK (absolute) | 35.4 | SMAP |
| 3D Multi-Person Pose Estimation | MuPoTS-3D | 3DPCK (root-relative) | 73.5 | SMAP |
| 1 Image, 2*2 Stitchi | Panoptic | Average MPJPE (mm) | 61.8 | SMAP |
| 1 Image, 2*2 Stitchi | MuPoTS-3D | 3DPCK (absolute) | 35.4 | SMAP |
| 1 Image, 2*2 Stitchi | MuPoTS-3D | 3DPCK (root-relative) | 73.5 | SMAP |
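
The two metrics in the table can be sketched as follows. This is a minimal illustration rather than the official evaluation code of either benchmark: the function names, the 150 mm default threshold, and the choice of joint 0 as the root are assumptions, and the matching of predicted to ground-truth people that a multi-person evaluation needs is omitted.

```python
import math
from typing import List, Sequence, Tuple

Joint = Tuple[float, float, float]  # (x, y, z) in millimetres


def mpjpe(pred: Sequence[Joint], gt: Sequence[Joint]) -> float:
    """Mean per-joint position error: average Euclidean distance in mm."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


def pck_3d(
    pred: Sequence[Joint], gt: Sequence[Joint], threshold_mm: float = 150.0
) -> float:
    """Percentage of joints whose 3D error is within threshold_mm.

    Assumption: 150 mm is used here as an illustrative default threshold.
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    hits = sum(1 for p, g in zip(pred, gt) if math.dist(p, g) <= threshold_mm)
    return 100.0 * hits / len(pred)


def root_relative(joints: Sequence[Joint], root_index: int = 0) -> List[Joint]:
    """Express joints relative to a root joint (assumed to be index 0)."""
    rx, ry, rz = joints[root_index]
    return [(x - rx, y - ry, z - rz) for x, y, z in joints]
```

Calling `pck_3d(pred, gt)` directly scores absolute positions, while `pck_3d(root_relative(pred), root_relative(gt))` scores root-relative poses. A prediction with the correct pose but a 400 mm error in overall depth would fail every joint under the first and pass every joint under the second, which is why the absolute score is the stricter of the two MuPoTS-3D 3DPCK figures.
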