Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

An Efficient Multitask Neural Network for Face Alignment, Head Pose Estimation and Face Tracking

Jiahao Xia, Haimin Zhang, Shiping Wen, Shuo Yang, Min Xu

2021-03-13 | Face Alignment, Pose Estimation, Head Pose Estimation, Face Detection

Abstract

While Convolutional Neural Networks (CNNs) have significantly boosted the performance of face-related algorithms, maintaining accuracy and efficiency simultaneously in practical use remains challenging. State-of-the-art methods employ deeper networks for better performance, which makes them less practical for mobile applications because of their larger parameter counts and higher computational complexity. Therefore, we propose an efficient multitask neural network, the Alignment & Tracking & Pose Network (ATPN), for face alignment, face tracking and head pose estimation. Specifically, to achieve better face alignment performance with fewer layers, we introduce a shortcut connection between shallow-layer and deep-layer features. We find that the shallow-layer features correspond closely to facial boundaries, which provide structural information about the face that is crucial for face alignment. Moreover, we generate a cheap heatmap based on the face alignment result and fuse it with features to improve the performance of the other two tasks. Based on the heatmap, the network can utilize both the geometric information of landmarks and appearance information for head pose estimation. The heatmap also provides attention clues for face tracking. In addition, face tracking removes the need to run face detection on every frame, which significantly boosts real-time capability for video-based tasks. We experimentally validate ATPN on four benchmark datasets: WFLW, 300VW, WIDER Face and 300W-LP. The experimental results demonstrate that it achieves better performance with far fewer parameters and lower computational complexity than other lightweight models.
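The abstract does not say how the "cheap heatmap" is built or how it is fused with features. As a rough illustration only, the sketch below rasterizes predicted landmarks into a single-channel map using Gaussian blobs and appends it to a feature tensor as an extra channel. The Gaussian rendering, the channel concatenation, and the names `landmarks_to_heatmap` and `fuse_with_features` are all assumptions made for this example, not the authors' implementation.

```python
import numpy as np


def landmarks_to_heatmap(landmarks, size=65, sigma=1.5):
    """Render landmarks as a single-channel heatmap (assumed Gaussian blobs).

    landmarks: (L, 2) array of (x, y) coordinates normalized to [0, 1].
    Returns a (size, size) array with values in [0, 1].
    """
    landmarks = np.asarray(landmarks, dtype=float)
    ys, xs = np.mgrid[0:size, 0:size]
    heatmap = np.zeros((size, size), dtype=float)
    # Scale normalized coordinates to pixel positions and draw one blob each.
    for x, y in landmarks * (size - 1):
        blob = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
        heatmap = np.maximum(heatmap, blob)
    return heatmap


def fuse_with_features(features, heatmap):
    """Append the heatmap to (C, H, W) features as one extra channel.

    Channel concatenation is an assumed fusion scheme for illustration.
    """
    return np.concatenate([features, heatmap[None, :, :]], axis=0)


# Hypothetical inputs: two landmarks and a random 4-channel feature map.
demo_landmarks = np.array([[0.25, 0.5], [0.75, 0.5]])
demo_heatmap = landmarks_to_heatmap(demo_landmarks, size=65)
demo_features = np.random.default_rng(0).normal(size=(4, 65, 65))
demo_fused = fuse_with_features(demo_features, demo_heatmap)
```

A downstream head pose or tracking branch could then read `demo_fused`, seeing both appearance features and landmark geometry in one tensor.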

Results

Task                              Dataset  Metric                 Value  Model
Facial Recognition and Modelling  WFLW     AUC@10 (inter-ocular)  55.7   ATPN
Facial Recognition and Modelling  WFLW     FR@10 (inter-ocular)   6.27   ATPN
Facial Recognition and Modelling  WFLW     NME (inter-ocular)     5.13   ATPN
Face Reconstruction               WFLW     AUC@10 (inter-ocular)  55.7   ATPN
Face Reconstruction               WFLW     FR@10 (inter-ocular)   6.27   ATPN
Face Reconstruction               WFLW     NME (inter-ocular)     5.13   ATPN
3D                                WFLW     AUC@10 (inter-ocular)  55.7   ATPN
3D                                WFLW     FR@10 (inter-ocular)   6.27   ATPN
3D                                WFLW     NME (inter-ocular)     5.13   ATPN
3D Face Modelling                 WFLW     AUC@10 (inter-ocular)  55.7   ATPN
3D Face Modelling                 WFLW     FR@10 (inter-ocular)   6.27   ATPN
3D Face Modelling                 WFLW     NME (inter-ocular)     5.13   ATPN
3D Face Reconstruction            WFLW     AUC@10 (inter-ocular)  55.7   ATPN
3D Face Reconstruction            WFLW     FR@10 (inter-ocular)   6.27   ATPN
3D Face Reconstruction            WFLW     NME (inter-ocular)     5.13   ATPN
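
The three metrics in the table follow the usual face alignment definitions: NME is the mean landmark error divided by the inter-ocular distance, FR@10 is the share of images whose NME exceeds 0.10, and AUC@10 is the normalized area under the cumulative error curve up to 0.10. The sketch below shows one way to compute them, assuming these standard definitions. The function names, the eye-corner indices passed in, and the toy data are hypothetical, and the results here are fractions, whereas the table values appear to be reported as percentages.

```python
import numpy as np


def nme_per_image(pred, gt, left_idx, right_idx):
    """Inter-ocular normalized mean error for each image.

    pred, gt: (N, L, 2) landmark arrays; left_idx and right_idx select the
    two eye-corner landmarks used for normalization (dataset-specific).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    per_point = np.linalg.norm(pred - gt, axis=-1)                # (N, L)
    inter_ocular = np.linalg.norm(gt[:, left_idx] - gt[:, right_idx], axis=-1)
    return per_point.mean(axis=1) / inter_ocular                  # (N,)


def failure_rate(nme, threshold=0.10):
    """Fraction of images whose NME is above the threshold."""
    return float(np.mean(np.asarray(nme) > threshold))


def auc_at(nme, threshold=0.10, steps=1000):
    """Area under the cumulative error distribution up to the threshold,
    divided by the threshold so that a perfect model scores 1.0."""
    nme = np.asarray(nme)
    xs = np.linspace(0.0, threshold, steps + 1)
    ced = np.array([(nme <= x).mean() for x in xs])
    area = np.sum((ced[1:] + ced[:-1]) / 2.0 * np.diff(xs))      # trapezoid rule
    return float(area / threshold)


# Toy data: two images, three landmarks, eye corners at indices 0 and 1.
gt = np.array([[[0.0, 0.0], [10.0, 0.0], [5.0, 5.0]]] * 2)
pred = gt.copy()
pred[0, :, 0] += 0.5   # small error on the first image
pred[1, :, 0] += 2.0   # large error on the second image

nme = nme_per_image(pred, gt, left_idx=0, right_idx=1)
fr = failure_rate(nme)
auc = auc_at(nme)
```

Multiplying each result by 100 gives values on the same scale as the table, if the percentage reading above is correct.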

Related Papers

$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation (2025-07-17)
AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability (2025-07-17)
SpatialTrackerV2: 3D Point Tracking Made Easy (2025-07-16)
SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation (2025-07-16)
Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation (2025-07-16)