Towards Accurate Facial Landmark Detection via Cascaded Transformers

Hui Li, Zidong Guo, Seon-Min Rhee, Seungju Han, Jae-Joon Han

2022-08-23 | CVPR 2022 | Face Alignment | Facial Landmark Detection

Abstract

Accurate facial landmarks are essential prerequisites for many tasks related to human faces. In this paper, we propose an accurate facial landmark detector based on cascaded transformers. We formulate facial landmark detection as a coordinate regression task so that the model can be trained end-to-end. With self-attention in transformers, our model can inherently exploit the structured relationships between landmarks, which benefits landmark detection under challenging conditions such as large pose and occlusion. During cascaded refinement, our model uses a deformable attention mechanism to extract the most relevant image features around the target landmark for coordinate prediction, which yields more accurate alignment. In addition, we propose a novel decoder that refines image features and landmark positions simultaneously. With only a small increase in parameters, detection performance improves further. Our model achieves new state-of-the-art performance on several standard facial landmark detection benchmarks and shows good generalization ability in cross-dataset evaluation.
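
The cascaded refinement described in the abstract can be pictured as a loop in which each stage predicts an offset that is added to the current landmark coordinates. The sketch below is only a toy illustration of that loop under stated assumptions: `cascaded_refine` and `toy_stage` are hypothetical names, and `toy_stage` is a stand-in that moves halfway toward a known target. It is not the paper's decoder, which predicts offsets from image features sampled with deformable attention.

```python
import numpy as np

def cascaded_refine(landmarks, stages):
    """Apply each stage in turn; a stage maps current landmarks to an offset."""
    landmarks = np.asarray(landmarks, dtype=float)
    history = [landmarks.copy()]
    for stage in stages:
        # Coordinate regression: the stage outputs a correction, not a heatmap.
        landmarks = landmarks + stage(landmarks)
        history.append(landmarks.copy())
    return landmarks, history

# Hypothetical ground-truth positions in normalized image coordinates.
target = np.array([[0.3, 0.4], [0.7, 0.4], [0.5, 0.8]])

def toy_stage(current):
    """Stand-in for a learned decoder layer: step halfway toward the target."""
    return 0.5 * (target - current)

# Start every landmark at the image center and run four refinement stages.
init = np.full((3, 2), 0.5)
final, history = cascaded_refine(init, [toy_stage] * 4)
```

In this toy setting the remaining error halves at every stage, which mirrors the coarse-to-fine intent of a cascade; a trained model gives no such guarantee.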

Results

Task	Dataset	Metric	Value	Model
Facial Recognition and Modelling	AFLW-19	NME_diag (%, Full)	1.37	DTLD+
Facial Recognition and Modelling	300W	NME_inter-ocular (%, Challenge)	4.48	DTLD+
Facial Recognition and Modelling	300W	NME_inter-ocular (%, Common)	2.6	DTLD+
Facial Recognition and Modelling	300W	NME_inter-ocular (%, Full)	2.96	DTLD+
Facial Recognition and Modelling	WFLW	FR@10 (inter-ocular)	2.68	DTLD+
Facial Recognition and Modelling	WFLW	NME (inter-ocular)	4.05	DTLD+
Facial Recognition and Modelling	300W Split 2	AUC@7 (box)	70.9	DTLD-s
Facial Recognition and Modelling	300W Split 2	NME (box)	2.05	DTLD-s
Face Reconstruction	300W	NME_inter-ocular (%, Challenge)	4.48	DTLD+
Face Reconstruction	300W	NME_inter-ocular (%, Common)	2.6	DTLD+
Face Reconstruction	300W	NME_inter-ocular (%, Full)	2.96	DTLD+
Face Reconstruction	300W Split 2	AUC@7 (box)	70.9	DTLD-s
Face Reconstruction	300W Split 2	NME (box)	2.05	DTLD-s
Face Reconstruction	AFLW-19	NME_diag (%, Full)	1.37	DTLD+
Face Reconstruction	WFLW	FR@10 (inter-ocular)	2.68	DTLD+
Face Reconstruction	WFLW	NME (inter-ocular)	4.05	DTLD+
3D	300W	NME_inter-ocular (%, Challenge)	4.48	DTLD+
3D	300W	NME_inter-ocular (%, Common)	2.6	DTLD+
3D	300W	NME_inter-ocular (%, Full)	2.96	DTLD+
3D	300W Split 2	AUC@7 (box)	70.9	DTLD-s
3D	300W Split 2	NME (box)	2.05	DTLD-s
3D	AFLW-19	NME_diag (%, Full)	1.37	DTLD+
3D	WFLW	FR@10 (inter-ocular)	2.68	DTLD+
3D	WFLW	NME (inter-ocular)	4.05	DTLD+
3D Face Modelling	AFLW-19	NME_diag (%, Full)	1.37	DTLD+
3D Face Modelling	300W	NME_inter-ocular (%, Challenge)	4.48	DTLD+
3D Face Modelling	300W	NME_inter-ocular (%, Common)	2.6	DTLD+
3D Face Modelling	300W	NME_inter-ocular (%, Full)	2.96	DTLD+
3D Face Modelling	WFLW	FR@10 (inter-ocular)	2.68	DTLD+
3D Face Modelling	WFLW	NME (inter-ocular)	4.05	DTLD+
3D Face Modelling	300W Split 2	AUC@7 (box)	70.9	DTLD-s
3D Face Modelling	300W Split 2	NME (box)	2.05	DTLD-s
3D Face Reconstruction	AFLW-19	NME_diag (%, Full)	1.37	DTLD+
3D Face Reconstruction	300W	NME_inter-ocular (%, Challenge)	4.48	DTLD+
3D Face Reconstruction	300W	NME_inter-ocular (%, Common)	2.6	DTLD+
3D Face Reconstruction	300W	NME_inter-ocular (%, Full)	2.96	DTLD+
3D Face Reconstruction	WFLW	FR@10 (inter-ocular)	2.68	DTLD+
3D Face Reconstruction	WFLW	NME (inter-ocular)	4.05	DTLD+
3D Face Reconstruction	300W Split 2	AUC@7 (box)	70.9	DTLD-s
3D Face Reconstruction	300W Split 2	NME (box)	2.05	DTLD-s
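
The NME and FR@10 figures in the table are normalized point-to-point errors. As a minimal sketch of how such numbers are commonly computed (not the authors' evaluation code), the functions below take the normalizing distance from two landmark indices supplied by the caller. The names `nme_inter_ocular` and `failure_rate`, and the choice of eye-corner indices, are assumptions made for illustration.

```python
import numpy as np

def nme_inter_ocular(pred, gt, left_idx, right_idx):
    """Mean landmark error divided by the distance between two eye landmarks."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    # Euclidean error for every landmark, shape (num_landmarks,).
    per_point = np.linalg.norm(pred - gt, axis=-1)
    # Normalizer measured on the ground truth, e.g. the outer eye corners.
    norm = np.linalg.norm(gt[left_idx] - gt[right_idx])
    return float(per_point.mean() / norm)

def failure_rate(nmes, threshold=0.10):
    """Fraction of images whose NME exceeds the threshold (FR@10 uses 0.10)."""
    nmes = np.asarray(nmes, dtype=float)
    return float((nmes > threshold).mean())

# Tiny made-up example: three landmarks, eye landmarks 10 pixels apart.
gt = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 8.0]])
pred = gt + np.array([0.3, 0.4])  # every point is off by 0.5 pixels
score = nme_inter_ocular(pred, gt, left_idx=0, right_idx=1)
```

Here `score` is about 0.05, which would be reported as 5% in the table's convention. The box- and diagonal-normalized variants differ only in replacing `norm` with a bounding-box measure.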
