Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


An Effective Deep Network for Head Pose Estimation without Keypoints

Chien Thai, Viet Tran, Minh Bui, Huong Ninh, Hai Tran

2022-10-25
Pose Estimation, Gaze Estimation, Knowledge Distillation, Head Pose Estimation

Abstract

Head pose estimation is a core problem in facial analysis, with computer vision applications such as gaze estimation, virtual reality, and driver assistance. Deploying it in applications such as large camera surveillance systems and AI cameras calls for a compact model that reduces computational cost while maintaining accuracy. In this work, we propose a lightweight model that effectively addresses the head pose estimation problem. Our approach has two main steps: 1) we train several teacher models on the synthetic dataset 300W-LPA to obtain head pose pseudo labels; 2) we design an architecture with a ResNet18 backbone and train it on the ensemble of these pseudo labels via knowledge distillation. We evaluate the model on AFLW-2000 and BIWI, two real-world head pose datasets. Experimental results show that the proposed model significantly improves accuracy over state-of-the-art head pose estimation methods. Furthermore, the model runs in real time at $\sim$300 FPS when performing inference on a Tesla V100.
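
The two steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the teacher outputs are combined or which loss the student uses, so the component-wise mean and the mean squared error below are assumptions, and the names `ensemble_pseudo_label` and `distillation_loss` are hypothetical. Poses are (yaw, pitch, roll) triples in degrees.

```python
from statistics import fmean
from typing import List, Tuple

# A head pose as (yaw, pitch, roll) in degrees.
Pose = Tuple[float, float, float]


def ensemble_pseudo_label(teacher_preds: List[Pose]) -> Pose:
    """Combine teacher predictions for one image into a single pseudo label.

    Assumption: a plain component-wise mean. The abstract only says
    "ensemble", so the actual combination rule may differ.
    """
    if not teacher_preds:
        raise ValueError("need at least one teacher prediction")
    yaws, pitches, rolls = zip(*teacher_preds)
    return (fmean(yaws), fmean(pitches), fmean(rolls))


def distillation_loss(student_pred: Pose, pseudo_label: Pose) -> float:
    """Mean squared error between student output and pseudo label.

    Assumption: MSE is used here only as a common regression loss.
    """
    return fmean([(s - t) ** 2 for s, t in zip(student_pred, pseudo_label)])


# Three hypothetical teachers predicting the pose of the same image.
teachers = [(10.0, -5.0, 2.0), (12.0, -3.0, 4.0), (14.0, -4.0, 0.0)]
label = ensemble_pseudo_label(teachers)
loss = distillation_loss((13.0, -4.0, 0.0), label)
```

In a real training loop the student (for example a ResNet18 regressor) would be updated to minimize this loss over the whole pseudo-labelled dataset. A plain mean also ignores angle wrap-around, which is harmless only when the angles stay in a limited range.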

Results

Task                  Dataset   Metric                        Value  Model
Pose Estimation       AFLW2000  MAE                           4.15   EHPNet
Pose Estimation       BIWI      MAE (trained with BIWI data)  3.43   EHPNet
3D                    AFLW2000  MAE                           4.15   EHPNet
3D                    BIWI      MAE (trained with BIWI data)  3.43   EHPNet
1 Image, 2*2 Stitchi  AFLW2000  MAE                           4.15   EHPNet
1 Image, 2*2 Stitchi  BIWI      MAE (trained with BIWI data)  3.43   EHPNet
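
The page does not define how MAE is computed. A common convention in head pose estimation is to take the mean absolute error of yaw, pitch, and roll separately and report their average; the sketch below assumes that convention, and the function name `head_pose_mae` is hypothetical.

```python
from typing import Dict, Sequence, Tuple

# A head pose as (yaw, pitch, roll) in degrees.
Pose = Tuple[float, float, float]


def head_pose_mae(preds: Sequence[Pose], targets: Sequence[Pose]) -> Dict[str, float]:
    """Per-angle mean absolute error and their average, in degrees.

    Assumption: angles are in a limited range, so no wrap-around
    handling (e.g. 359 vs 1 degree) is applied.
    """
    if len(preds) != len(targets) or not preds:
        raise ValueError("preds and targets must be non-empty and equal in length")
    n = len(preds)
    names = ("yaw", "pitch", "roll")
    result = {
        name: sum(abs(p[i] - t[i]) for p, t in zip(preds, targets)) / n
        for i, name in enumerate(names)
    }
    # Overall MAE: average of the three per-angle errors.
    result["mean"] = sum(result[name] for name in names) / len(names)
    return result


# Two hypothetical samples.
scores = head_pose_mae(
    preds=[(10.0, 0.0, 0.0), (0.0, 4.0, 0.0)],
    targets=[(8.0, 0.0, 0.0), (0.0, 0.0, 6.0)],
)
```

Under this convention, a single reported figure such as those in the table would correspond to the `"mean"` entry.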

Related Papers

Visual-Language Model Knowledge Distillation Method for Image Quality Assessment (2025-07-21)
$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation (2025-07-17)
AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability (2025-07-17)
Uncertainty-Aware Cross-Modal Knowledge Distillation with Prototype Learning for Multimodal Brain-Computer Interfaces (2025-07-17)
SpatialTrackerV2: 3D Point Tracking Made Easy (2025-07-16)