Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Putting People in their Place: Monocular Regression of 3D People in Depth

Yu Sun, Wu Liu, Qian Bao, Yili Fu, Tao Mei, Michael J. Black

2021-12-15 | CVPR 2022 | regression | 3D Depth Estimation

Abstract

Given an image with multiple people, our goal is to directly regress the pose and shape of all the people as well as their relative depth. Inferring the depth of a person in an image, however, is fundamentally ambiguous without knowing their height. This is particularly problematic when the scene contains people of very different sizes, e.g. from infants to adults. To solve this, we need several things. First, we develop a novel method to infer the poses and depth of multiple people in a single image. While previous work that estimates multiple people does so by reasoning in the image plane, our method, called BEV, adds an imaginary Bird's-Eye-View representation to explicitly reason about depth. BEV reasons simultaneously about body centers in the image and in depth and, by combining these, estimates 3D body position. Unlike prior work, BEV is a single-shot method that is end-to-end differentiable. Second, height varies with age, making it impossible to resolve depth without also estimating the age of people in the image. To do so, we exploit a 3D body model space that lets BEV infer shapes from infants to adults. Third, to train BEV, we need a new dataset. Specifically, we create a "Relative Human" (RH) dataset that includes age labels and relative depth relationships between the people in the images. Extensive experiments on RH and AGORA demonstrate the effectiveness of the model and training scheme. BEV outperforms existing methods on depth reasoning, child shape estimation, and robustness to occlusion. The code and dataset are released for research purposes.
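The abstract's idea of combining body-center evidence from the image plane with evidence from a bird's-eye view can be illustrated with a toy example. The sketch below is not the authors' implementation: the function names, the tiny hand-written maps, and the choice of a simple element-wise product are all assumptions made for illustration. It only shows how a front-view map indexed by (y, x) and a bird's-eye-view map indexed by (depth, x) could jointly select a single 3D body-center cell.

```python
# Hypothetical sketch: the names and the multiplicative combination are
# assumptions for illustration, not the method described in the paper.

def combine_center_maps(front, bev):
    """Combine two 2D confidence maps into a 3D volume.

    front[y][x]: confidence of a body center at image pixel (x, y).
    bev[d][x]:   confidence of a body center at depth bin d, image column x.
    Returns vol[d][y][x] = front[y][x] * bev[d][x].
    """
    return [[[front_row[x] * bev_row[x] for x in range(len(front_row))]
             for front_row in front]
            for bev_row in bev]


def argmax_3d(vol):
    """Return the (d, y, x) index of the largest value in a nested-list volume."""
    best = (0, 0, 0)
    best_val = vol[0][0][0]
    for d, plane in enumerate(vol):
        for y, row in enumerate(plane):
            for x, value in enumerate(row):
                if value > best_val:
                    best_val, best = value, (d, y, x)
    return best


# Toy maps: height 2, width 3, and 2 depth bins.
front = [[0.1, 0.2, 0.1],
         [0.1, 0.9, 0.2]]   # strongest response at y=1, x=1
bev = [[0.1, 0.3, 0.1],
       [0.2, 0.8, 0.1]]     # strongest response at d=1, x=1

vol = combine_center_maps(front, bev)
center = argmax_3d(vol)
print(center)  # (depth bin, image row, image column)
```

Both maps share the image column axis, so a cell scores highly only when the image-plane evidence and the depth evidence agree on the same column.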

Results

Task                  Dataset          Metric       Value   Model
Depth Estimation      Relative Human   PCDR         68.27   BEV
Depth Estimation      Relative Human   PCDR-Adult   69.71   BEV
Depth Estimation      Relative Human   PCDR-Baby    60.77   BEV
Depth Estimation      Relative Human   PCDR-Kid     67.09   BEV
Depth Estimation      Relative Human   PCDR-Teen    66.07   BEV
Depth Estimation      Relative Human   mPCDK        0.884   BEV
3D                    Relative Human   PCDR         68.27   BEV
3D                    Relative Human   PCDR-Adult   69.71   BEV
3D                    Relative Human   PCDR-Baby    60.77   BEV
3D                    Relative Human   PCDR-Kid     67.09   BEV
3D                    Relative Human   PCDR-Teen    66.07   BEV
3D                    Relative Human   mPCDK        0.884   BEV
3D Depth Estimation   Relative Human   PCDR         68.27   BEV
3D Depth Estimation   Relative Human   PCDR-Adult   69.71   BEV
3D Depth Estimation   Relative Human   PCDR-Baby    60.77   BEV
3D Depth Estimation   Relative Human   PCDR-Kid     67.09   BEV
3D Depth Estimation   Relative Human   PCDR-Teen    66.07   BEV
3D Depth Estimation   Relative Human   mPCDK        0.884   BEV
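This page does not define the PCDR metric. Assuming it stands for a percentage of correct depth relations between pairs of people, which matches the "relative depth relationships" the abstract says are labeled in RH, a score of this kind could be computed roughly as follows. The function names, the three relation labels, and the 0.2 tolerance are hypothetical choices for this sketch, and the depths are made-up numbers, not data from the paper.

```python
# Hypothetical sketch of a pairwise depth-relation accuracy.
# The labels ("closer", "equal", "farther") and the tolerance are assumptions.

def depth_relation(depth_a, depth_b, tol):
    """Classify the depth of person a relative to person b."""
    if abs(depth_a - depth_b) < tol:
        return "equal"
    return "closer" if depth_a < depth_b else "farther"


def pairwise_depth_accuracy(pred_depths, gt_relations, tol=0.2):
    """Percentage of labeled pairs whose predicted relation matches the label.

    pred_depths:  dict mapping person id -> predicted depth.
    gt_relations: dict mapping (id_a, id_b) -> labeled relation of a to b.
    """
    if not gt_relations:
        raise ValueError("no labeled pairs to evaluate")
    correct = sum(
        depth_relation(pred_depths[a], pred_depths[b], tol) == relation
        for (a, b), relation in gt_relations.items()
    )
    return 100.0 * correct / len(gt_relations)


# Made-up example with three people and three labeled pairs.
pred = {"p1": 2.0, "p2": 2.1, "p3": 4.0}
labels = {
    ("p1", "p2"): "equal",    # predicted: equal (difference under tol)
    ("p1", "p3"): "closer",   # predicted: closer
    ("p2", "p3"): "farther",  # predicted: closer, so this pair is wrong
}
score = pairwise_depth_accuracy(pred, labels)
print(round(score, 2))
```

A per-group variant, like the Adult, Baby, Kid, and Teen rows above, would apply the same function to the subset of pairs belonging to each age group.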

Related Papers

Language Integration in Fine-Tuning Multimodal Large Language Models for Image-Based Regression (2025-07-20)
Neural Network-Guided Symbolic Regression for Interpretable Descriptor Discovery in Perovskite Catalysts (2025-07-16)
Imbalanced Regression Pipeline Recommendation (2025-07-16)
Second-Order Bounds for [0,1]-Valued Regression via Betting Loss (2025-07-16)
Sparse Regression Codes exploit Multi-User Diversity without CSI (2025-07-15)
Bradley-Terry and Multi-Objective Reward Modeling Are Complementary (2025-07-10)
Active Learning for Manifold Gaussian Process Regression (2025-06-26)
A Survey of Predictive Maintenance Methods: An Analysis of Prognostics via Classification and Regression (2025-06-25)