img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation

Vítor Albiero, Xingyu Chen, Xi Yin, Guan Pang, Tal Hassner

Published: 2020-12-14 | Venue: CVPR 2021
Tasks: Face Alignment, Facial Landmark Detection, 3D Face Alignment, Pose Estimation, Head Pose Estimation, Face Detection

Abstract

We propose real-time, six degrees of freedom (6DoF), 3D face pose estimation without face detection or landmark localization. We observe that estimating the 6DoF rigid transformation of a face is a simpler problem than facial landmark detection, which is often used for 3D face alignment. In addition, 6DoF offers more information than face bounding box labels. We leverage these observations to make multiple contributions: (a) We describe an easily trained, efficient, Faster R-CNN-based model which regresses 6DoF pose for all faces in the photo, without preliminary face detection. (b) We explain how pose is converted and kept consistent between the input photo and arbitrary crops created while training and evaluating our model. (c) Finally, we show how face poses can replace detection bounding box training labels. Tests on AFLW2000-3D and BIWI show that our method runs in real time and outperforms state of the art (SotA) face pose estimators. Remarkably, our method also surpasses SotA models of comparable complexity on the WIDER FACE detection benchmark, despite not being optimized on bounding box labels.
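Contribution (c) relies on the fact that a 6DoF pose determines where a face lands in the image. The following is a minimal sketch of that idea, not the authors' implementation: it projects a set of 3D reference points with the pose and takes the extent of the projections as a box. The function names, the pinhole intrinsics (`focal`, `cx`, `cy`), the rotation-vector parameterization, and the cube of reference points are all assumptions made for illustration.

```python
import math

def rodrigues(rvec):
    """Convert a rotation vector (unit axis * angle in radians) to a 3x3 matrix."""
    theta = math.sqrt(sum(a * a for a in rvec))
    if theta < 1e-12:
        # No rotation: return the identity
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = (a / theta for a in rvec)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    v = 1.0 - cos_t
    return [
        [cos_t + kx * kx * v, kx * ky * v - kz * sin_t, kx * kz * v + ky * sin_t],
        [ky * kx * v + kz * sin_t, cos_t + ky * ky * v, ky * kz * v - kx * sin_t],
        [kz * kx * v - ky * sin_t, kz * ky * v + kx * sin_t, cos_t + kz * kz * v],
    ]

def pose_to_bbox(rvec, tvec, points_3d, focal, cx, cy):
    """Project 3D points with a 6DoF pose; return (x_min, y_min, x_max, y_max)."""
    R = rodrigues(rvec)
    xs, ys = [], []
    for p in points_3d:
        # Rigid transform into camera coordinates: X_cam = R p + t
        x, y, z = (sum(R[i][j] * p[j] for j in range(3)) + tvec[i] for i in range(3))
        # Pinhole projection (assumes z > 0, i.e. the point is in front of the camera)
        xs.append(focal * x / z + cx)
        ys.append(focal * y / z + cy)
    return min(xs), min(ys), max(xs), max(ys)

# Hypothetical reference geometry: the eight corners of a unit cube around the origin
CUBE = [(sx, sy, sz) for sx in (-1.0, 1.0) for sy in (-1.0, 1.0) for sz in (-1.0, 1.0)]

# A frontal pose 10 units in front of the camera
print(pose_to_bbox([0.0, 0.0, 0.0], [0.0, 0.0, 10.0], CUBE, focal=100.0, cx=50.0, cy=50.0))
```

In this sketch the box grows as the translation moves the points closer to the camera, which illustrates how a pose carries both location and scale information that a bounding box label would otherwise supply.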

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Facial Recognition and Modelling | WIDER Face (Medium) | AP | 0.89 | img2pose |
| Facial Recognition and Modelling | WIDER Face (Easy) | AP | 0.9 | img2pose |
| Facial Recognition and Modelling | WIDER Face (Hard) | AP | 0.839 | img2pose |
| Pose Estimation | AFLW2000 | Geodesic Error (GE) | 6.41 | img2pose |
| Pose Estimation | AFLW2000 | MAE | 3.913 | img2pose |
| Pose Estimation | AFLW2000 | MAE_t | 0.099 | img2pose |
| Pose Estimation | AFLW2000 | MAE | 4.839 | RetinaFace R-50 (5 points) |
| Pose Estimation | AFLW2000 | MAE_t | 0.114 | RetinaFace R-50 (5 points) |
| Pose Estimation | BIWI | Geodesic Error (GE) | 7.1 | img2pose |
| Pose Estimation | BIWI | Geodesic Error - aligned (GE) | 6.23 | img2pose |
| Pose Estimation | BIWI | MAE (trained with other data) | 3.786 | img2pose |
| Pose Estimation | BIWI | MAE-aligned (trained with other data) | 3.4 | img2pose |
| Pose Estimation | BIWI | MAE (trained with other data) | 4.578 | RetinaFace R-50 (5 points) |
| Face Detection | WIDER Face (Medium) | AP | 0.89 | img2pose |
| Face Detection | WIDER Face (Easy) | AP | 0.9 | img2pose |
| Face Detection | WIDER Face (Hard) | AP | 0.839 | img2pose |
| Face Reconstruction | WIDER Face (Medium) | AP | 0.89 | img2pose |
| Face Reconstruction | WIDER Face (Easy) | AP | 0.9 | img2pose |
| Face Reconstruction | WIDER Face (Hard) | AP | 0.839 | img2pose |
| 3D | AFLW2000 | Geodesic Error (GE) | 6.41 | img2pose |
| 3D | AFLW2000 | MAE | 3.913 | img2pose |
| 3D | AFLW2000 | MAE_t | 0.099 | img2pose |
| 3D | AFLW2000 | MAE | 4.839 | RetinaFace R-50 (5 points) |
| 3D | AFLW2000 | MAE_t | 0.114 | RetinaFace R-50 (5 points) |
| 3D | BIWI | Geodesic Error (GE) | 7.1 | img2pose |
| 3D | BIWI | Geodesic Error - aligned (GE) | 6.23 | img2pose |
| 3D | BIWI | MAE (trained with other data) | 3.786 | img2pose |
| 3D | BIWI | MAE-aligned (trained with other data) | 3.4 | img2pose |
| 3D | BIWI | MAE (trained with other data) | 4.578 | RetinaFace R-50 (5 points) |
| 3D | WIDER Face (Medium) | AP | 0.89 | img2pose |
| 3D | WIDER Face (Easy) | AP | 0.9 | img2pose |
| 3D | WIDER Face (Hard) | AP | 0.839 | img2pose |
| 3D Face Modelling | WIDER Face (Medium) | AP | 0.89 | img2pose |
| 3D Face Modelling | WIDER Face (Easy) | AP | 0.9 | img2pose |
| 3D Face Modelling | WIDER Face (Hard) | AP | 0.839 | img2pose |
| 3D Face Reconstruction | WIDER Face (Medium) | AP | 0.89 | img2pose |
| 3D Face Reconstruction | WIDER Face (Easy) | AP | 0.9 | img2pose |
| 3D Face Reconstruction | WIDER Face (Hard) | AP | 0.839 | img2pose |
| 1 Image, 2*2 Stitchi | AFLW2000 | Geodesic Error (GE) | 6.41 | img2pose |
| 1 Image, 2*2 Stitchi | AFLW2000 | MAE | 3.913 | img2pose |
| 1 Image, 2*2 Stitchi | AFLW2000 | MAE_t | 0.099 | img2pose |
| 1 Image, 2*2 Stitchi | AFLW2000 | MAE | 4.839 | RetinaFace R-50 (5 points) |
| 1 Image, 2*2 Stitchi | AFLW2000 | MAE_t | 0.114 | RetinaFace R-50 (5 points) |
| 1 Image, 2*2 Stitchi | BIWI | Geodesic Error (GE) | 7.1 | img2pose |
| 1 Image, 2*2 Stitchi | BIWI | Geodesic Error - aligned (GE) | 6.23 | img2pose |
| 1 Image, 2*2 Stitchi | BIWI | MAE (trained with other data) | 3.786 | img2pose |
| 1 Image, 2*2 Stitchi | BIWI | MAE-aligned (trained with other data) | 3.4 | img2pose |
| 1 Image, 2*2 Stitchi | BIWI | MAE (trained with other data) | 4.578 | RetinaFace R-50 (5 points) |
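
This page does not define the Geodesic Error (GE) and MAE metrics. Under the common definitions, which are assumed here rather than taken from the paper, the geodesic error is the angle of the relative rotation between predicted and ground-truth rotation matrices, and MAE is the mean absolute difference over the pose components. A minimal sketch, with hypothetical function names and degrees assumed as the unit:

```python
import math

def geodesic_error_deg(R_pred, R_gt):
    """Angle (degrees) of the rotation taking R_pred to R_gt, both 3x3 nested lists."""
    # trace(R_pred^T R_gt) equals the sum of element-wise products
    trace = sum(R_pred[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))

def mean_absolute_error(pred, gt):
    """Mean absolute difference between two equal-length sequences of numbers."""
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same length")
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(pred)

# Identity versus a 90-degree rotation about the z axis
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
ROT_Z_90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
print(geodesic_error_deg(IDENTITY, ROT_Z_90))
```

Unlike a per-angle MAE on Euler angles, the geodesic angle does not depend on the choice of Euler convention, which may be one reason both are reported.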

Related Papers

$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation (2025-07-17)
AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability (2025-07-17)
SpatialTrackerV2: 3D Point Tracking Made Easy (2025-07-16)
SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation (2025-07-16)
Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation (2025-07-16)