WHENet: Real-time Fine-Grained Estimation for Wide Range Head Pose

Yijun Zhou, James Gregson

2020-05-20 · Pose Estimation · Head Pose Estimation

Abstract

We present an end-to-end head-pose estimation network designed to predict Euler angles through the full range of head yaws from a single RGB image. Existing methods perform well for frontal views, but few target head pose from all viewpoints, which has applications in autonomous driving and retail. Our network builds on multi-loss approaches, with changes to the loss functions and training strategies adapted to wide-range estimation. Additionally, we extract ground-truth labelings of anterior views from a current panoptic dataset for the first time. The resulting Wide Headpose Estimation Network (WHENet) is the first fine-grained modern method applicable to the full range of head yaws (hence "wide"), yet it also meets or beats state-of-the-art methods for frontal head pose estimation. Our network is compact and efficient enough for mobile devices and applications.
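The "multi-loss" the abstract refers to follows the binned-classification-plus-regression pattern common in head-pose work (e.g. HopeNet). Below is a minimal PyTorch sketch of the yaw component of such a loss; the 3-degree bin width, the alpha weight, and the wrap-around difference are illustrative assumptions, not the paper's confirmed formulation.

```python
import torch
import torch.nn.functional as F

BIN_WIDTH = 3.0
YAW_BINS = torch.arange(-180.0, 180.0, BIN_WIDTH) + BIN_WIDTH / 2  # 120 bin centers

def wrapped_diff(pred_deg, true_deg):
    """Smallest signed angular difference, so -179 deg and +179 deg are close."""
    return (pred_deg - true_deg + 180.0) % 360.0 - 180.0

def yaw_multi_loss(logits, true_yaw_deg, alpha=0.5):
    """Cross-entropy over angle bins plus a regression term on the softmax
    expectation, with wrap-around handling for full-range yaw."""
    bin_idx = ((true_yaw_deg + 180.0) / BIN_WIDTH).long().clamp(0, YAW_BINS.numel() - 1)
    cls_loss = F.cross_entropy(logits, bin_idx)
    expected_yaw = (F.softmax(logits, dim=1) * YAW_BINS).sum(dim=1)
    reg_loss = wrapped_diff(expected_yaw, true_yaw_deg).abs().mean()
    return cls_loss + alpha * reg_loss

# Example: 8 images, 120-way yaw logits from some backbone.
logits = torch.randn(8, YAW_BINS.numel())
true_yaw = torch.empty(8).uniform_(-180.0, 180.0)
print(yaw_multi_loss(logits, true_yaw))
```

The wrap-around difference is the part that matters for wide-range yaw: without it, a prediction of +179 degrees against a label of -179 degrees would be penalized as a 358-degree error rather than a 2-degree one.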

Results

Task             Dataset   Metric                         Value  Model
Pose Estimation  Panoptic  Geodesic Error (GE)            24.38  WHENet
Pose Estimation  AFLW2000  MAE                             4.83  WHENet-V
Pose Estimation  AFLW2000  MAE                             5.42  WHENet
Pose Estimation  BIWI      MAE (trained with other data)   3.48  WHENet-V
Pose Estimation  BIWI      MAE (trained with other data)   3.81  WHENet
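For reference, the two metrics in the table are conventionally computed as sketched below: MAE as the mean absolute Euler-angle error in degrees, and geodesic error as the rotation angle of the relative rotation between predicted and ground-truth rotation matrices. This is a sketch of the standard formulas, not the paper's evaluation code, whose angle conventions may differ.

```python
import numpy as np

def mae_degrees(pred_euler, true_euler):
    """Mean absolute error over (yaw, pitch, roll) in degrees, with wrap-around."""
    diff = (pred_euler - true_euler + 180.0) % 360.0 - 180.0
    return np.abs(diff).mean()

def geodesic_error_degrees(R_pred, R_true):
    """Rotation angle of the relative rotation between two 3x3 rotation matrices."""
    R_rel = R_pred.T @ R_true
    cos_theta = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    return np.degrees(np.arccos(cos_theta))
```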

Related Papers

$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation (2025-07-17)
AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability (2025-07-17)
SpatialTrackerV2: 3D Point Tracking Made Easy (2025-07-16)
SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation (2025-07-16)
Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation (2025-07-16)