Attention-Propagation Network for Egocentric Heatmap to 3D Pose Lifting

Taeho Kang, Youngki Lee

2024-02-28 | CVPR 2024 | Egocentric Pose Estimation, Pose Estimation, 3D Pose Estimation

Abstract

We present EgoTAP, a heatmap-to-3D pose lifting method for highly accurate stereo egocentric 3D pose estimation. Severe self-occlusion and out-of-view limbs in egocentric camera views make accurate pose estimation a challenging problem. To address this challenge, prior methods employ joint heatmaps (probabilistic 2D representations of the body pose), but heatmap-to-3D pose conversion remains an inaccurate process. We propose a novel heatmap-to-3D lifting method composed of the Grid ViT Encoder and the Propagation Network. The Grid ViT Encoder summarizes joint heatmaps into an effective feature embedding using self-attention. The Propagation Network then estimates the 3D pose, using skeletal information to better estimate the positions of obscured joints. Our method significantly outperforms the previous state of the art both qualitatively and quantitatively, as demonstrated by a 23.9% reduction in error on the MPJPE metric. Our source code is available on GitHub.
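
For readers unfamiliar with the metric, MPJPE (mean per-joint position error) is conventionally the average Euclidean distance between predicted and ground-truth 3D joint positions. The sketch below shows that computation together with a relative error reduction of the kind quoted in the abstract. It is a minimal illustration, not the authors' evaluation code: the function names `mpjpe` and `relative_reduction` and the toy arrays are assumptions made up for this example, and no baseline value is taken from the paper.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error.

    pred, gt: arrays of shape (..., J, 3) in the same units (e.g. mm).
    Returns the Euclidean distance per joint, averaged over all joints
    (and over any leading batch dimensions).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def relative_reduction(baseline_error, new_error):
    """Fraction by which new_error is lower than baseline_error."""
    return (baseline_error - new_error) / baseline_error


# Toy two-joint pose (hypothetical values, not data from the paper).
gt = np.zeros((2, 3))
pred = np.array([[3.0, 4.0, 0.0],
                 [0.0, 0.0, 0.0]])
toy_error = mpjpe(pred, gt)  # joint distances are 5 and 0, so the mean is 2.5
```

A relative reduction of 0.239 corresponds to the 23.9% quoted in the abstract; the baseline error it is measured against is not stated on this page.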

Results

Task	Dataset	Metric	Value	Model
3D Human Pose Estimation	UnrealEgo	Average MPJPE (mm)	41.1	EgoTAP
3D Human Pose Estimation	UnrealEgo	PA-MPJPE	35.4	EgoTAP
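
The table reports both MPJPE and PA-MPJPE. PA-MPJPE conventionally means MPJPE computed after Procrustes alignment, i.e. after the prediction has been fitted to the ground truth with the best similarity transform (rotation, uniform scale, translation). The sketch below illustrates that idea for a single pose. It is an assumption-laden illustration rather than the evaluation protocol used for UnrealEgo, which this page does not describe; the names `procrustes_align` and `pa_mpjpe` are hypothetical.

```python
import numpy as np


def procrustes_align(pred, gt):
    """Fit pred (J, 3) to gt (J, 3) with a similarity transform.

    Solves for rotation R, scale s and translation so that
    s * (pred - mean(pred)) @ R + mean(gt) is closest to gt in the
    least-squares sense. Assumes pred is not a single repeated point.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g

    # SVD of the 3x3 cross-covariance between the centered point sets.
    U, S, Vt = np.linalg.svd(p.T @ g)

    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(U @ Vt))
    D = np.array([1.0, 1.0, d])
    R = (U * D) @ Vt

    # Optimal uniform scale for the chosen rotation.
    s = (S * D).sum() / (p ** 2).sum()
    return s * p @ R + mu_g


def pa_mpjpe(pred, gt):
    """MPJPE after Procrustes alignment of pred to gt (single pose)."""
    aligned = procrustes_align(pred, gt)
    return float(np.linalg.norm(aligned - np.asarray(gt, dtype=float), axis=-1).mean())


# Toy check: a rotated, scaled and shifted copy of a pose aligns back to it.
rng = np.random.default_rng(0)
gt = rng.normal(size=(16, 3))
theta = 0.7
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0,            0.0,           1.0]])
pred = 2.0 * gt @ Rz + np.array([10.0, -5.0, 3.0])
residual = pa_mpjpe(pred, gt)  # close to zero up to floating-point error
```

Because alignment removes global rotation, scale and translation, PA-MPJPE measures only the error in the articulated pose itself, which is why it is typically lower than MPJPE for the same prediction.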
