TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/SimpleEgo: Predicting Probabilistic Body Pose from Egocent...

SimpleEgo: Predicting Probabilistic Body Pose from Egocentric Cameras

Hanz Cuevas-Velasquez, Charlie Hewitt, Sadegh Aliakbarian, Tadas Baltrušaitis

2024-01-26Egocentric Pose EstimationPose Estimation
PaperPDF

Abstract

Our work addresses the problem of egocentric human pose estimation from downwards-facing cameras on head-mounted devices (HMD). This presents a challenging scenario, as parts of the body often fall outside of the image or are occluded. Previous solutions minimize this problem by using fish-eye camera lenses to capture a wider view, but these can present hardware design issues. They also predict 2D heat-maps per joint and lift them to 3D space to deal with self-occlusions, but this requires large network architectures which are impractical to deploy on resource-constrained HMDs. We predict pose from images captured with conventional rectilinear camera lenses. This resolves hardware design issues, but means body parts are often out of frame. As such, we directly regress probabilistic joint rotations represented as matrix Fisher distributions for a parameterized body model. This allows us to quantify pose uncertainties and explain out-of-frame or occluded joints. This also removes the need to compute 2D heat-maps and allows for simplified DNN architectures which require less compute. Given the lack of egocentric datasets using rectilinear camera lenses, we introduce the SynthEgo dataset, a synthetic dataset with 60K stereo images containing high diversity of pose, shape, clothing and skin tone. Our approach achieves state-of-the-art results for this challenging configuration, reducing mean per-joint position error by 23% overall and 58% for the lower body. Our architecture also has eight times fewer parameters and runs twice as fast as the current state-of-the-art. Experiments show that training on our synthetic dataset leads to good generalization to real world images without fine-tuning.

Results

TaskDatasetMetricValueModel
3D Human Pose EstimationUnrealEgoAverage MPJPE (mm)80SimpleEgo
3D Human Pose EstimationUnrealEgoPA-MPJPE67.6SimpleEgo
Pose EstimationUnrealEgoAverage MPJPE (mm)80SimpleEgo
Pose EstimationUnrealEgoPA-MPJPE67.6SimpleEgo
3DUnrealEgoAverage MPJPE (mm)80SimpleEgo
3DUnrealEgoPA-MPJPE67.6SimpleEgo
1 Image, 2*2 StitchiUnrealEgoAverage MPJPE (mm)80SimpleEgo
1 Image, 2*2 StitchiUnrealEgoPA-MPJPE67.6SimpleEgo

Related Papers

$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning2025-07-17Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark2025-07-17DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model2025-07-17From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation2025-07-17AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability2025-07-17SpatialTrackerV2: 3D Point Tracking Made Easy2025-07-16SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation2025-07-16Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation2025-07-16