Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Distract Your Attention: Multi-head Cross Attention Network for Facial Expression Recognition

Zhengyao Wen, Wenzhong Lin, Tao Wang, Ge Xu

Published: 2021-09-15
Tasks: Facial Expression Recognition, Facial Expression Recognition (FER)

Abstract

We present a novel facial expression recognition network, called Distract your Attention Network (DAN). Our method is based on two key observations. Firstly, multiple classes share inherently similar underlying facial appearance, and their differences could be subtle. Secondly, facial expressions exhibit themselves through multiple facial regions simultaneously, and the recognition requires a holistic approach by encoding high-order interactions among local features. To address these issues, we propose our DAN with three key components: Feature Clustering Network (FCN), Multi-head cross Attention Network (MAN), and Attention Fusion Network (AFN). The FCN extracts robust features by adopting a large-margin learning objective to maximize class separability. In addition, the MAN instantiates a number of attention heads to simultaneously attend to multiple facial areas and build attention maps on these regions. Further, the AFN distracts these attentions to multiple locations before fusing the attention maps to a comprehensive one. Extensive experiments on three public datasets (including AffectNet, RAF-DB, and SFEW 2.0) verified that the proposed method consistently achieves state-of-the-art facial expression recognition performance. Code will be made available at https://github.com/yaoing/DAN.
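The abstract describes attention heads that each attend to a different facial region, producing per-head spatial attention maps that are then fused into a single comprehensive map. As a rough illustration of that idea (not the paper's actual MAN/AFN implementation), the sketch below scores spatial feature locations with several hypothetical per-head projections, normalizes each score vector into an attention map, and averages the maps before pooling; the projection weights and the averaging fusion are placeholder assumptions, not details from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_spatial_attention(features, num_heads=4, seed=0):
    """Toy sketch of the multi-head attention idea from the abstract.

    features: (N, C) array of N local (spatial) features with C channels.
    Returns (fused_map, attended_feature). The random per-head projections
    and the mean-fusion step are illustrative stand-ins, not DAN's layers.
    """
    rng = np.random.default_rng(seed)
    n, c = features.shape
    maps = []
    for _ in range(num_heads):
        w = rng.standard_normal(c) / np.sqrt(c)  # hypothetical per-head projection
        maps.append(softmax(features @ w))       # attention map over the N locations
    fused = np.mean(maps, axis=0)                # fuse head maps into one (sums to 1)
    attended = (fused[:, None] * features).sum(axis=0)  # attention-weighted pooling
    return fused, attended
```

Because every head's map is a softmax over the same N locations, their average is still a valid attention distribution, so the fused map can be used directly for weighted pooling of the local features.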

Results

Task                                  Dataset    Metric                 Value   Model
Facial Expression Recognition (FER)   RAF-DB     Overall Accuracy       89.7    DAN
Facial Expression Recognition (FER)   AffectNet  Accuracy (7 emotion)   65.69   DAN
Facial Expression Recognition (FER)   AffectNet  Accuracy (8 emotion)   62.09   DAN

Related Papers

- Multimodal Prompt Alignment for Facial Expression Recognition (2025-06-26)
- Enhancing Ambiguous Dynamic Facial Expression Recognition with Soft Label-based Data Augmentation (2025-06-25)
- Using Vision Language Models to Detect Students' Academic Emotion through Facial Expressions (2025-06-12)
- EfficientFER: EfficientNetv2 Based Deep Learning Approach for Facial Expression Recognition (2025-06-02)
- TKFNet: Learning Texture Key Factor Driven Feature for Facial Expression Recognition (2025-05-15)
- Unsupervised Multiview Contrastive Language-Image Joint Learning with Pseudo-Labeled Prompts Via Vision-Language Model for 3D/4D Facial Expression Recognition (2025-05-14)
- Achieving 3D Attention via Triplet Squeeze and Excitation Block (2025-05-09)
- Some Optimizers are More Equal: Understanding the Role of Optimizers in Group Fairness (2025-04-21)