Adrian Bulat, Enrique Sanchez, Georgios Tzimiropoulos
Deep learning models based on heatmap regression have revolutionized facial landmark localization, with existing models working robustly under large poses, non-uniform illumination and shadows, occlusions and self-occlusions, low resolution, and blur. However, despite their wide adoption, heatmap regression approaches suffer from discretization-induced errors in both the heatmap encoding and the heatmap decoding process. In this work we show that these errors have a surprisingly large negative impact on facial alignment accuracy. To alleviate this problem, we propose a new approach to heatmap encoding and decoding that leverages the underlying continuous distribution. To take full advantage of the proposed encoding-decoding mechanism, we also introduce a Siamese-based training scheme that enforces heatmap consistency across various geometric image transformations. Our approach offers noticeable gains across multiple datasets, setting a new state-of-the-art result in facial landmark localization. Code alongside the pretrained models will be made available at https://www.adrianbulat.com/face-alignment
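To make the discretization error described in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single landmark, a 64×64 heatmap and a Gaussian with `sigma=1.5`. The function names (`encode_heatmap`, `decode_argmax`, `decode_local_centroid`) are hypothetical, and the local weighted centroid is only a simple stand-in for a decoder that exploits the underlying continuous distribution.

```python
import numpy as np


def encode_heatmap(center, size=64, sigma=1.5, quantize=True):
    """Render a Gaussian heatmap for one landmark at center = (x, y)."""
    cx, cy = float(center[0]), float(center[1])
    if quantize:
        # Conventional encoding: snap the centre to the nearest pixel,
        # which discards the sub-pixel part of the annotation.
        cx, cy = round(cx), round(cy)
    ys, xs = np.mgrid[0:size, 0:size]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))


def decode_argmax(heatmap):
    """Conventional decoding: integer location of the maximum, as (x, y)."""
    y, x = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return np.array([x, y], dtype=float)


def decode_local_centroid(heatmap, radius=4):
    """Sub-pixel decoding: weighted centroid in a window around the maximum.

    Illustrative stand-in only; the window radius is an assumption.
    """
    y, x = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    y, x = int(y), int(x)
    y0, y1 = max(y - radius, 0), min(y + radius + 1, heatmap.shape[0])
    x0, x1 = max(x - radius, 0), min(x + radius + 1, heatmap.shape[1])
    window = heatmap[y0:y1, x0:x1]
    ys, xs = np.mgrid[y0:y1, x0:x1]
    weights = window / window.sum()
    return np.array([(weights * xs).sum(), (weights * ys).sum()])


if __name__ == "__main__":
    true_xy = np.array([20.3, 30.7])  # sub-pixel ground-truth landmark

    discrete = decode_argmax(encode_heatmap(true_xy, quantize=True))
    continuous = decode_local_centroid(encode_heatmap(true_xy, quantize=False))

    print("quantized encode + argmax error (px):",
          np.linalg.norm(discrete - true_xy))
    print("continuous encode + centroid error (px):",
          np.linalg.norm(continuous - true_xy))
```

In this toy setting the quantized pipeline cannot recover the fractional part of the landmark, so its error is bounded below by the rounding offset. At a typical heatmap resolution of a quarter of the input image, that offset is multiplied by four when mapped back to image coordinates. The continuous variant leaves only a small residual that comes from truncating the window.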
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Facial Recognition and Modelling | WFW (Extra Data) | AUC@10 (inter-ocular) | 63.1 | SH-FAN |
| Facial Recognition and Modelling | WFW (Extra Data) | FR@10 (inter-ocular) | 1.55 | SH-FAN |
| Facial Recognition and Modelling | WFW (Extra Data) | NME (inter-ocular) | 3.72 | SH-FAN |
| Facial Recognition and Modelling | COFW-68 (300WLP) | AUC@7 | 64.9 | SH-FAN |
| Facial Recognition and Modelling | COFW-68 (300WLP) | NME (box) | 2.47 | SH-FAN |
| Facial Recognition and Modelling | 300W Split 2 (300W-LP) | AUC@7 (bbox) | 71.1 | SH-FAN |
| Facial Recognition and Modelling | 300W Split 2 (300W-LP) | NME (bbox) | 2.04 | SH-FAN |
| Facial Recognition and Modelling | 300W Split 2 (300W-LP) | NME (inter-ocular) | 2.94 | SH-FAN |
| Facial Recognition and Modelling | AFLW-19 | AUC_box@0.07 (%, Full) | 70 | SHR-FAN |
| Facial Recognition and Modelling | AFLW-19 | NME_box (%, Full) | 2.14 | SHR-FAN |
| Facial Recognition and Modelling | AFLW-19 | NME_diag (%, Frontal) | 1.12 | SHR-FAN |
| Facial Recognition and Modelling | AFLW-19 | NME_diag (%, Full) | 1.31 | SHR-FAN |
| Facial Recognition and Modelling | 300W | NME_inter-ocular (%, Challenge) | 4.13 | SHR-FAN |
| Facial Recognition and Modelling | 300W | NME_inter-ocular (%, Common) | 2.61 | SHR-FAN |
| Facial Recognition and Modelling | 300W | NME_inter-ocular (%, Full) | 2.94 | SHR-FAN |
| Facial Recognition and Modelling | WFLW | AUC@10 (inter-ocular) | 63.81 | SH-FAN |
| Facial Recognition and Modelling | WFLW | FR@10 (inter-ocular) | 1.55 | SH-FAN |
| Facial Recognition and Modelling | WFLW | NME (inter-ocular) | 3.72 | SH-FAN |
| Face Reconstruction | 300W Split 2 (300W-LP) | AUC@7 (bbox) | 71.1 | SH-FAN |
| Face Reconstruction | 300W Split 2 (300W-LP) | NME (bbox) | 2.04 | SH-FAN |
| Face Reconstruction | 300W Split 2 (300W-LP) | NME (inter-ocular) | 2.94 | SH-FAN |
| Face Reconstruction | COFW-68 (300WLP) | AUC@7 | 64.9 | SH-FAN |
| Face Reconstruction | COFW-68 (300WLP) | NME (box) | 2.47 | SH-FAN |
| Face Reconstruction | 300W | NME_inter-ocular (%, Challenge) | 4.13 | SHR-FAN |
| Face Reconstruction | 300W | NME_inter-ocular (%, Common) | 2.61 | SHR-FAN |
| Face Reconstruction | 300W | NME_inter-ocular (%, Full) | 2.94 | SHR-FAN |
| Face Reconstruction | WFW (Extra Data) | AUC@10 (inter-ocular) | 63.1 | SH-FAN |
| Face Reconstruction | WFW (Extra Data) | FR@10 (inter-ocular) | 1.55 | SH-FAN |
| Face Reconstruction | WFW (Extra Data) | NME (inter-ocular) | 3.72 | SH-FAN |
| Face Reconstruction | AFLW-19 | AUC_box@0.07 (%, Full) | 70 | SHR-FAN |
| Face Reconstruction | AFLW-19 | NME_box (%, Full) | 2.14 | SHR-FAN |
| Face Reconstruction | AFLW-19 | NME_diag (%, Frontal) | 1.12 | SHR-FAN |
| Face Reconstruction | AFLW-19 | NME_diag (%, Full) | 1.31 | SHR-FAN |
| Face Reconstruction | WFLW | AUC@10 (inter-ocular) | 63.81 | SH-FAN |
| Face Reconstruction | WFLW | FR@10 (inter-ocular) | 1.55 | SH-FAN |
| Face Reconstruction | WFLW | NME (inter-ocular) | 3.72 | SH-FAN |
| 3D | 300W Split 2 (300W-LP) | AUC@7 (bbox) | 71.1 | SH-FAN |
| 3D | 300W Split 2 (300W-LP) | NME (bbox) | 2.04 | SH-FAN |
| 3D | 300W Split 2 (300W-LP) | NME (inter-ocular) | 2.94 | SH-FAN |
| 3D | COFW-68 (300WLP) | AUC@7 | 64.9 | SH-FAN |
| 3D | COFW-68 (300WLP) | NME (box) | 2.47 | SH-FAN |
| 3D | 300W | NME_inter-ocular (%, Challenge) | 4.13 | SHR-FAN |
| 3D | 300W | NME_inter-ocular (%, Common) | 2.61 | SHR-FAN |
| 3D | 300W | NME_inter-ocular (%, Full) | 2.94 | SHR-FAN |
| 3D | WFW (Extra Data) | AUC@10 (inter-ocular) | 63.1 | SH-FAN |
| 3D | WFW (Extra Data) | FR@10 (inter-ocular) | 1.55 | SH-FAN |
| 3D | WFW (Extra Data) | NME (inter-ocular) | 3.72 | SH-FAN |
| 3D | AFLW-19 | AUC_box@0.07 (%, Full) | 70 | SHR-FAN |
| 3D | AFLW-19 | NME_box (%, Full) | 2.14 | SHR-FAN |
| 3D | AFLW-19 | NME_diag (%, Frontal) | 1.12 | SHR-FAN |
| 3D | AFLW-19 | NME_diag (%, Full) | 1.31 | SHR-FAN |
| 3D | WFLW | AUC@10 (inter-ocular) | 63.81 | SH-FAN |
| 3D | WFLW | FR@10 (inter-ocular) | 1.55 | SH-FAN |
| 3D | WFLW | NME (inter-ocular) | 3.72 | SH-FAN |
| 3D Face Modelling | WFW (Extra Data) | AUC@10 (inter-ocular) | 63.1 | SH-FAN |
| 3D Face Modelling | WFW (Extra Data) | FR@10 (inter-ocular) | 1.55 | SH-FAN |
| 3D Face Modelling | WFW (Extra Data) | NME (inter-ocular) | 3.72 | SH-FAN |
| 3D Face Modelling | COFW-68 (300WLP) | AUC@7 | 64.9 | SH-FAN |
| 3D Face Modelling | COFW-68 (300WLP) | NME (box) | 2.47 | SH-FAN |
| 3D Face Modelling | 300W Split 2 (300W-LP) | AUC@7 (bbox) | 71.1 | SH-FAN |
| 3D Face Modelling | 300W Split 2 (300W-LP) | NME (bbox) | 2.04 | SH-FAN |
| 3D Face Modelling | 300W Split 2 (300W-LP) | NME (inter-ocular) | 2.94 | SH-FAN |
| 3D Face Modelling | AFLW-19 | AUC_box@0.07 (%, Full) | 70 | SHR-FAN |
| 3D Face Modelling | AFLW-19 | NME_box (%, Full) | 2.14 | SHR-FAN |
| 3D Face Modelling | AFLW-19 | NME_diag (%, Frontal) | 1.12 | SHR-FAN |
| 3D Face Modelling | AFLW-19 | NME_diag (%, Full) | 1.31 | SHR-FAN |
| 3D Face Modelling | 300W | NME_inter-ocular (%, Challenge) | 4.13 | SHR-FAN |
| 3D Face Modelling | 300W | NME_inter-ocular (%, Common) | 2.61 | SHR-FAN |
| 3D Face Modelling | 300W | NME_inter-ocular (%, Full) | 2.94 | SHR-FAN |
| 3D Face Modelling | WFLW | AUC@10 (inter-ocular) | 63.81 | SH-FAN |
| 3D Face Modelling | WFLW | FR@10 (inter-ocular) | 1.55 | SH-FAN |
| 3D Face Modelling | WFLW | NME (inter-ocular) | 3.72 | SH-FAN |
| 3D Face Reconstruction | WFW (Extra Data) | AUC@10 (inter-ocular) | 63.1 | SH-FAN |
| 3D Face Reconstruction | WFW (Extra Data) | FR@10 (inter-ocular) | 1.55 | SH-FAN |
| 3D Face Reconstruction | WFW (Extra Data) | NME (inter-ocular) | 3.72 | SH-FAN |
| 3D Face Reconstruction | COFW-68 (300WLP) | AUC@7 | 64.9 | SH-FAN |
| 3D Face Reconstruction | COFW-68 (300WLP) | NME (box) | 2.47 | SH-FAN |
| 3D Face Reconstruction | 300W Split 2 (300W-LP) | AUC@7 (bbox) | 71.1 | SH-FAN |
| 3D Face Reconstruction | 300W Split 2 (300W-LP) | NME (bbox) | 2.04 | SH-FAN |
| 3D Face Reconstruction | 300W Split 2 (300W-LP) | NME (inter-ocular) | 2.94 | SH-FAN |
| 3D Face Reconstruction | AFLW-19 | AUC_box@0.07 (%, Full) | 70 | SHR-FAN |
| 3D Face Reconstruction | AFLW-19 | NME_box (%, Full) | 2.14 | SHR-FAN |
| 3D Face Reconstruction | AFLW-19 | NME_diag (%, Frontal) | 1.12 | SHR-FAN |
| 3D Face Reconstruction | AFLW-19 | NME_diag (%, Full) | 1.31 | SHR-FAN |
| 3D Face Reconstruction | 300W | NME_inter-ocular (%, Challenge) | 4.13 | SHR-FAN |
| 3D Face Reconstruction | 300W | NME_inter-ocular (%, Common) | 2.61 | SHR-FAN |
| 3D Face Reconstruction | 300W | NME_inter-ocular (%, Full) | 2.94 | SHR-FAN |
| 3D Face Reconstruction | WFLW | AUC@10 (inter-ocular) | 63.81 | SH-FAN |
| 3D Face Reconstruction | WFLW | FR@10 (inter-ocular) | 1.55 | SH-FAN |
| 3D Face Reconstruction | WFLW | NME (inter-ocular) | 3.72 | SH-FAN |