Coherent Reconstruction of Multiple Humans from a Single Image

Wen Jiang, Nikos Kolotouros, Georgios Pavlakos, Xiaowei Zhou, Kostas Daniilidis

2020-06-15 | CVPR 2020 6
Tasks: Semantic Segmentation, Pose Estimation, 3D Depth Estimation, 3D Reconstruction, Instance Segmentation, 3D Human Reconstruction, 3D Pose Estimation

Abstract

In this work, we address the problem of multi-person 3D pose estimation from a single image. A typical regression approach in the top-down setting of this problem would first detect all humans and then reconstruct each one of them independently. However, this type of prediction suffers from incoherent results, e.g., interpenetration and inconsistent depth ordering between the people in the scene. Our goal is to train a single network that learns to avoid these problems and generate a coherent 3D reconstruction of all the humans in the scene. To this end, a key design choice is the incorporation of the SMPL parametric body model in our top-down framework, which enables the use of two novel losses. First, a distance field-based collision loss penalizes interpenetration among the reconstructed people. Second, a depth ordering-aware loss reasons about occlusions and promotes a depth ordering of people that leads to a rendering which is consistent with the annotated instance segmentation. This provides depth supervision signals to the network, even if the image has no explicit 3D annotations. The experiments show that our approach outperforms previous methods on standard 3D pose benchmarks, while our proposed losses enable more coherent reconstruction in natural images. The project website with videos, results, and code can be found at: https://jiangwenpl.github.io/multiperson
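
The two losses described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation, and several simplifications are assumptions made for illustration: each person's volume is approximated by a sphere rather than a distance field built from the SMPL mesh, the depth maps are given directly rather than rendered, and the log(1 + exp(.)) penalty is one plausible form for an ordering loss. The function names `sphere_penetration`, `collision_loss`, and `depth_ordering_loss` are hypothetical.

```python
import numpy as np

def sphere_penetration(points, center, radius):
    """Penetration depth of each point into a sphere (0 for points outside).
    Assumption: a sphere stands in for a person's distance field."""
    dist = np.linalg.norm(points - center, axis=1)
    return np.maximum(radius - dist, 0.0)

def collision_loss(vertices, centers, radii):
    """Sum, over ordered pairs (i, j) with i != j, of how deeply the
    vertices of person j penetrate the proxy volume of person i."""
    total = 0.0
    for i in range(len(vertices)):
        for j in range(len(vertices)):
            if i != j:
                total += sphere_penetration(vertices[j], centers[i], radii[i]).sum()
    return float(total)

def depth_ordering_loss(depth_maps, gt_instance):
    """Penalize pixels where the front-most predicted person disagrees
    with the annotated instance mask.

    depth_maps:  (N, H, W) per-person depth, np.inf where a person
                 does not cover the pixel.
    gt_instance: (H, W) integer person index, -1 for background.
    """
    rendered = np.argmin(depth_maps, axis=0)          # front-most person per pixel
    covered = np.isfinite(depth_maps.min(axis=0))     # at least one person rendered
    wrong = covered & (gt_instance >= 0) & (rendered != gt_instance)
    rows, cols = np.nonzero(wrong)
    d_gt = depth_maps[gt_instance[rows, cols], rows, cols]
    d_pred = depth_maps[rendered[rows, cols], rows, cols]
    valid = np.isfinite(d_gt)                         # annotated person must cover the pixel
    # logaddexp(0, x) == log(1 + exp(x)), computed stably
    return float(np.logaddexp(0.0, d_gt[valid] - d_pred[valid]).sum())
```

In this sketch the collision term is zero when no vertex lies inside another person's proxy volume, and the ordering term is zero when the front-most person at every annotated pixel matches the instance mask. Because the ordering term compares depths only at mislabeled pixels, it needs an instance segmentation but no 3D annotation, which matches the supervision signal the abstract describes.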

Results

Task | Dataset | Metric | Value | Model
Depth Estimation | Relative Human | PCDR | 54.83 | CRMH
Depth Estimation | Relative Human | PCDR-Adult | 55.47 | CRMH
Depth Estimation | Relative Human | PCDR-Baby | 34.74 | CRMH
Depth Estimation | Relative Human | PCDR-Kid | 48.37 | CRMH
Depth Estimation | Relative Human | PCDR-Teen | 59.11 | CRMH
Depth Estimation | Relative Human | mPCDK | 0.781 | CRMH
3D | Relative Human | PCDR | 54.83 | CRMH
3D | Relative Human | PCDR-Adult | 55.47 | CRMH
3D | Relative Human | PCDR-Baby | 34.74 | CRMH
3D | Relative Human | PCDR-Kid | 48.37 | CRMH
3D | Relative Human | PCDR-Teen | 59.11 | CRMH
3D | Relative Human | mPCDK | 0.781 | CRMH
3D Depth Estimation | Relative Human | PCDR | 54.83 | CRMH
3D Depth Estimation | Relative Human | PCDR-Adult | 55.47 | CRMH
3D Depth Estimation | Relative Human | PCDR-Baby | 34.74 | CRMH
3D Depth Estimation | Relative Human | PCDR-Kid | 48.37 | CRMH
3D Depth Estimation | Relative Human | PCDR-Teen | 59.11 | CRMH
3D Depth Estimation | Relative Human | mPCDK | 0.781 | CRMH

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)
$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)