Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Generalizing Monocular 3D Human Pose Estimation in the Wild

Luyang Wang, Yan Chen, Zhenhua Guo, Keyuan Qian, Mude Lin, Hongsheng Li, Jimmy S. Ren

2019-04-11
Tasks: 3D Human Pose Estimation, Monocular 3D Human Pose Estimation, Pose Estimation, 3D Pose Estimation

Abstract

The availability of the large-scale labeled 3D poses in the Human3.6M dataset plays an important role in advancing the algorithms for 3D human pose estimation from a still image. We observe that recent innovation in this area mainly focuses on new techniques that explicitly address the generalization issue when using this dataset, because this database is constructed in a highly controlled environment with limited human subjects and background variations. Despite such efforts, we can show that the results of the current methods are still error-prone especially when tested against the images taken in-the-wild. In this paper, we aim to tackle this problem from a different perspective. We propose a principled approach to generate high quality 3D pose ground truth given any in-the-wild image with a person inside. We achieve this by first devising a novel stereo inspired neural network to directly map any 2D pose to high quality 3D counterpart. We then perform a carefully designed geometric searching scheme to further refine the joints. Based on this scheme, we build a large-scale dataset with 400,000 in-the-wild images and their corresponding 3D pose ground truth. This enables the training of a high quality neural network model, without specialized training scheme and auxiliary loss function, which performs favorably against the state-of-the-art 3D pose estimation methods. We also evaluate the generalization ability of our model both quantitatively and qualitatively. Results show that our approach convincingly outperforms the previous methods. We make our dataset and code publicly available.

Results

Task | Dataset | Metric | Value | Model
3D Human Pose Estimation | MPI-INF-3DHP | AUC | 33.8 | Stereoscopic View Synthesis Subnetwork
3D Human Pose Estimation | MPI-INF-3DHP | PCK | 71.2 | Stereoscopic View Synthesis Subnetwork
3D Human Pose Estimation | Human3.6M | Average MPJPE (mm) | 58 | Stereoscopic View Synthesis Subnetwork
Pose Estimation | MPI-INF-3DHP | AUC | 33.8 | Stereoscopic View Synthesis Subnetwork
Pose Estimation | MPI-INF-3DHP | PCK | 71.2 | Stereoscopic View Synthesis Subnetwork
Pose Estimation | Human3.6M | Average MPJPE (mm) | 58 | Stereoscopic View Synthesis Subnetwork
3D | MPI-INF-3DHP | AUC | 33.8 | Stereoscopic View Synthesis Subnetwork
3D | MPI-INF-3DHP | PCK | 71.2 | Stereoscopic View Synthesis Subnetwork
3D | Human3.6M | Average MPJPE (mm) | 58 | Stereoscopic View Synthesis Subnetwork
1 Image, 2*2 Stitchi | MPI-INF-3DHP | AUC | 33.8 | Stereoscopic View Synthesis Subnetwork
1 Image, 2*2 Stitchi | MPI-INF-3DHP | PCK | 71.2 | Stereoscopic View Synthesis Subnetwork
1 Image, 2*2 Stitchi | Human3.6M | Average MPJPE (mm) | 58 | Stereoscopic View Synthesis Subnetwork
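
The three metrics in the results above (MPJPE, PCK, AUC) are all derived from per-joint Euclidean distances between predicted and ground-truth 3D joints. The following is a minimal sketch of how they are commonly computed for a single pose, not the evaluation code used by the paper. The function names are hypothetical, and the 150 mm PCK threshold and the 0 to 150 mm threshold grid in 5 mm steps are assumptions based on a common MPI-INF-3DHP convention; this page does not state them. Any alignment applied before scoring (for example root-relative coordinates) is also not specified here and is left out.

```python
import math
from typing import List, Sequence, Tuple

Joint = Tuple[float, float, float]  # (x, y, z) in millimetres


def joint_errors(pred: Sequence[Joint], gt: Sequence[Joint]) -> List[float]:
    """Euclidean distance (mm) between each predicted and ground-truth joint."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and the same length")
    return [math.dist(p, g) for p, g in zip(pred, gt)]


def mpjpe(pred: Sequence[Joint], gt: Sequence[Joint]) -> float:
    """Mean per-joint position error in mm (lower is better)."""
    errs = joint_errors(pred, gt)
    return sum(errs) / len(errs)


def pck(pred: Sequence[Joint], gt: Sequence[Joint],
        threshold_mm: float = 150.0) -> float:
    """Percentage of joints whose error is within threshold_mm.

    The 150 mm default is an assumed convention, not taken from this page.
    """
    errs = joint_errors(pred, gt)
    return 100.0 * sum(e <= threshold_mm for e in errs) / len(errs)


def auc(pred: Sequence[Joint], gt: Sequence[Joint],
        max_threshold_mm: float = 150.0, step_mm: float = 5.0) -> float:
    """Mean PCK over an evenly spaced grid of thresholds (assumed grid)."""
    steps = int(max_threshold_mm / step_mm)
    thresholds = [i * step_mm for i in range(steps + 1)]
    return sum(pck(pred, gt, t) for t in thresholds) / len(thresholds)


if __name__ == "__main__":
    # Toy two-joint pose: joint errors are 50 mm and 200 mm.
    gt_pose = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
    pred_pose = [(30.0, 40.0, 0.0), (0.0, 0.0, 200.0)]
    print("MPJPE (mm):", mpjpe(pred_pose, gt_pose))
    print("PCK@150mm (%):", pck(pred_pose, gt_pose))
    print("AUC (%):", auc(pred_pose, gt_pose))
```

In practice these scores are averaged over every frame of a test set rather than reported for one pose. MPJPE is an error, so lower is better, while PCK and AUC are percentages where higher is better.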

Related Papers

$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation (2025-07-17)
AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability (2025-07-17)
SpatialTrackerV2: 3D Point Tracking Made Easy (2025-07-16)
SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation (2025-07-16)
Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation (2025-07-16)