SPEC: Seeing People in the Wild with an Estimated Camera

Muhammed Kocabas, Chun-Hao P. Huang, Joachim Tesch, Lea Müller, Otmar Hilliges, Michael J. Black

Published: 2021-10-01 (ICCV 2021)
Tasks: 3D Human Pose Estimation, Camera Calibration, 3D Multi-Person Pose Estimation

Abstract

Due to the lack of camera parameter information for in-the-wild images, existing 3D human pose and shape (HPS) estimation methods make several simplifying assumptions: weak-perspective projection, large constant focal length, and zero camera rotation. These assumptions often do not hold and we show, quantitatively and qualitatively, that they cause errors in the reconstructed 3D shape and pose. To address this, we introduce SPEC, the first in-the-wild 3D HPS method that estimates the perspective camera from a single image and employs this to reconstruct 3D human bodies more accurately. First, we train a neural network to estimate the field of view, camera pitch, and roll given an input image. We employ novel losses that improve the calibration accuracy over previous work. We then train a novel network that concatenates the camera calibration to the image features and uses these together to regress 3D body shape and pose. SPEC is more accurate than the prior art on the standard benchmark (3DPW) as well as two new datasets with more challenging camera views and varying focal lengths. Specifically, we create a new photorealistic synthetic dataset (SPEC-SYN) with ground truth 3D bodies and a novel in-the-wild dataset (SPEC-MTP) with calibration and high-quality reference bodies. Both qualitative and quantitative analysis confirm that knowing camera parameters during inference regresses better human bodies. Code and datasets are available for research purposes at https://spec.is.tue.mpg.de.
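To make the camera assumptions named in the abstract concrete, the following Python sketch contrasts a perspective projection parameterized by field of view, pitch, and roll with the weak-perspective approximation. It is a minimal illustration under stated assumptions, not the SPEC implementation: the function names, the choice of vertical field of view, and the axis conventions (pitch about the x-axis, roll about the z-axis) are hypothetical choices made for this example.

```python
import numpy as np


def focal_length_from_vfov(vfov_rad: float, image_height: float) -> float:
    """Pinhole relation f = (H / 2) / tan(vfov / 2), with f in pixels.

    Assumption: the field of view is vertical and the principal point is
    at the image center.
    """
    return 0.5 * image_height / np.tan(0.5 * vfov_rad)


def camera_rotation(pitch_rad: float, roll_rad: float) -> np.ndarray:
    """Rotation built from pitch (about x) followed by roll (about z).

    Assumption: this axis convention is illustrative only.
    """
    cp, sp = np.cos(pitch_rad), np.sin(pitch_rad)
    cr, sr = np.cos(roll_rad), np.sin(roll_rad)
    rot_x = np.array([[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]])
    rot_z = np.array([[cr, -sr, 0.0], [sr, cr, 0.0], [0.0, 0.0, 1.0]])
    return rot_z @ rot_x


def perspective_project(points, rotation, translation, focal, center):
    """Full perspective projection: rotate, translate, divide by depth."""
    cam = points @ rotation.T + translation   # (N, 3) camera-frame points
    return focal * cam[:, :2] / cam[:, 2:3] + center


def weak_perspective_project(points, scale, offset):
    """Weak perspective: one scale for all points, depth is ignored."""
    return scale * points[:, :2] + offset
```

With a 90-degree vertical field of view and a 1000-pixel-high image, the focal length comes out at about 500 pixels. A weak-perspective scale tuned for depth 5 (scale = 500 / 5 = 100) matches the perspective projection at that depth but misplaces points at any other depth, which is the kind of error the abstract attributes to the simplifying assumptions.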

Results

Task                             Dataset   Metric    Value  Model
3D Human Pose Estimation         AGORA     B-MPJPE   112.3  SPEC
3D Human Pose Estimation         AGORA     B-MVE     106.5  SPEC
3D Human Pose Estimation         AGORA     B-NMJE    133.7  SPEC
3D Human Pose Estimation         AGORA     B-NMVE    126.8  SPEC
3D Human Pose Estimation         SPEC-MTP  W-MPJPE   124.3  SPEC
3D Human Pose Estimation         SPEC-MTP  W-PVE     147.1  SPEC
3D Human Pose Estimation         3DPW      PA-MPJPE   53.2  SPEC
3D Multi-Person Pose Estimation  AGORA     B-MPJPE   112.3  SPEC
3D Multi-Person Pose Estimation  AGORA     B-MVE     106.5  SPEC
3D Multi-Person Pose Estimation  AGORA     B-NMJE    133.7  SPEC
3D Multi-Person Pose Estimation  AGORA     B-NMVE    126.8  SPEC

The seven 3D Human Pose Estimation results are also indexed under the task labels "Pose Estimation", "3D", and "1 Image, 2*2 Stitchi".
