ASMNet: a Lightweight Deep Neural Network for Face Alignment and Pose Estimation

Ali Pourramezan Fard, Hojjat Abdollahi, Mohammad Mahoor

2021-02-27

Face Alignment, Transfer Learning, Pose Estimation, Head Pose Estimation

Abstract

Active Shape Model (ASM) is a statistical model of object shapes that represents a target structure. ASM can guide machine learning algorithms to fit a set of points representing an object (e.g., a face) onto an image. This paper presents a lightweight Convolutional Neural Network (CNN) architecture with an ASM-assisted loss function for face alignment and head pose estimation in the wild. We first use ASM to guide the network towards learning a smoother distribution of the facial landmark points. Inspired by transfer learning, we gradually harden the regression problem during training and guide the network towards learning the original distribution of the landmark points. Our loss function defines multiple tasks, which are responsible for detecting facial landmark points as well as estimating the face pose. Learning multiple correlated tasks simultaneously builds synergy and improves the performance of the individual tasks. We compare our proposed model, called ASMNet, with MobileNetV2 (which is about 2 times bigger than ASMNet) on both the face alignment and pose estimation tasks. Experimental results on challenging datasets show that, with the proposed ASM-assisted loss function, ASMNet's performance is comparable with that of MobileNetV2 in the face alignment task. In addition, for face pose estimation, ASMNet performs much better than MobileNetV2. ASMNet achieves acceptable performance for facial landmark detection and pose estimation while having significantly fewer parameters and floating-point operations than many CNN-based models.
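
To make the idea of a "smoother" landmark distribution concrete, the following is a minimal sketch of the standard PCA-based construction of an Active Shape Model. It is not taken from the paper: the function names `fit_asm` and `asm_smooth`, the flattened landmark layout, and the use of NumPy are all assumptions made for illustration. A shape is projected onto the leading principal components of the training shapes and reconstructed. Keeping more components brings the reconstruction closer to the original landmarks, which is one plausible way to read "gradually harden the regression problem".

```python
import numpy as np

def fit_asm(shapes, n_components):
    """Fit a PCA shape model (hypothetical helper, not the paper's code).

    shapes: (N, 2K) array, each row a flattened set of K (x, y) landmarks.
    Returns the mean shape and the top n_components principal directions.
    """
    mean = shapes.mean(axis=0)
    centered = shapes - mean
    # Rows of vt are orthonormal principal directions, ordered by variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def asm_smooth(shape, mean, components):
    """Project one flattened shape onto the model and reconstruct it."""
    b = components @ (shape - mean)      # shape parameters
    return mean + components.T @ b       # smoothed landmark vector
```

With only a few components the reconstructed landmarks follow the dominant modes of shape variation. With all components the reconstruction equals the input shape.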

Results

Task	Dataset	Metric	Value	Model
Facial Recognition and Modelling	300W	NME_inter-ocular (%, Challenge)	7.35	MobileNetV2
Facial Recognition and Modelling	300W	NME_inter-ocular (%, Common)	3.88	MobileNetV2
Facial Recognition and Modelling	300W	NME_inter-ocular (%, Full)	4.59	MobileNetV2
Facial Recognition and Modelling	300W	NME_inter-ocular (%, Challenge)	8.2	ASMNet
Facial Recognition and Modelling	300W	NME_inter-ocular (%, Common)	4.82	ASMNet
Facial Recognition and Modelling	300W	NME_inter-ocular (%, Full)	5.5	ASMNet
Facial Recognition and Modelling	WFLW	NME (inter-ocular)	9.41	MobileNetV2
Facial Recognition and Modelling	WFLW	NME (inter-ocular)	10.77	ASMNet
Pose Estimation	300W (Full)	MAE pitch (°)	1.8	ASMNet
Pose Estimation	300W (Full)	MAE roll (°)	1.24	ASMNet
Pose Estimation	300W (Full)	MAE yaw (°)	1.62	ASMNet
Pose Estimation	WFLW	MAE mean (°)	2.7	ASMNet
Pose Estimation	WFLW	MAE pitch (°)	2.93	ASMNet
Pose Estimation	WFLW	MAE roll (°)	2.21	ASMNet
Pose Estimation	WFLW	MAE yaw (°)	2.97	ASMNet
Pose Estimation	COFW	MAE pitch (°)	2.72	ASMNet
Pose Estimation	COFW	MAE yaw (°)	2.91	ASMNet
Face Reconstruction	300W	NME_inter-ocular (%, Challenge)	7.35	MobileNetV2
Face Reconstruction	300W	NME_inter-ocular (%, Common)	3.88	MobileNetV2
Face Reconstruction	300W	NME_inter-ocular (%, Full)	4.59	MobileNetV2
Face Reconstruction	300W	NME_inter-ocular (%, Challenge)	8.2	ASMNet
Face Reconstruction	300W	NME_inter-ocular (%, Common)	4.82	ASMNet
Face Reconstruction	300W	NME_inter-ocular (%, Full)	5.5	ASMNet
Face Reconstruction	WFLW	NME (inter-ocular)	9.41	MobileNetV2
Face Reconstruction	WFLW	NME (inter-ocular)	10.77	ASMNet
3D	300W (Full)	MAE pitch (°)	1.8	ASMNet
3D	300W (Full)	MAE roll (°)	1.24	ASMNet
3D	300W (Full)	MAE yaw (°)	1.62	ASMNet
3D	WFLW	MAE mean (°)	2.7	ASMNet
3D	WFLW	MAE pitch (°)	2.93	ASMNet
3D	WFLW	MAE roll (°)	2.21	ASMNet
3D	WFLW	MAE yaw (°)	2.97	ASMNet
3D	COFW	MAE pitch (°)	2.72	ASMNet
3D	COFW	MAE yaw (°)	2.91	ASMNet
3D	300W	NME_inter-ocular (%, Challenge)	7.35	MobileNetV2
3D	300W	NME_inter-ocular (%, Common)	3.88	MobileNetV2
3D	300W	NME_inter-ocular (%, Full)	4.59	MobileNetV2
3D	300W	NME_inter-ocular (%, Challenge)	8.2	ASMNet
3D	300W	NME_inter-ocular (%, Common)	4.82	ASMNet
3D	300W	NME_inter-ocular (%, Full)	5.5	ASMNet
3D	WFLW	NME (inter-ocular)	9.41	MobileNetV2
3D	WFLW	NME (inter-ocular)	10.77	ASMNet
3D Face Modelling	300W	NME_inter-ocular (%, Challenge)	7.35	MobileNetV2
3D Face Modelling	300W	NME_inter-ocular (%, Common)	3.88	MobileNetV2
3D Face Modelling	300W	NME_inter-ocular (%, Full)	4.59	MobileNetV2
3D Face Modelling	300W	NME_inter-ocular (%, Challenge)	8.2	ASMNet
3D Face Modelling	300W	NME_inter-ocular (%, Common)	4.82	ASMNet
3D Face Modelling	300W	NME_inter-ocular (%, Full)	5.5	ASMNet
3D Face Modelling	WFLW	NME (inter-ocular)	9.41	MobileNetV2
3D Face Modelling	WFLW	NME (inter-ocular)	10.77	ASMNet
3D Face Reconstruction	300W	NME_inter-ocular (%, Challenge)	7.35	MobileNetV2
3D Face Reconstruction	300W	NME_inter-ocular (%, Common)	3.88	MobileNetV2
3D Face Reconstruction	300W	NME_inter-ocular (%, Full)	4.59	MobileNetV2
3D Face Reconstruction	300W	NME_inter-ocular (%, Challenge)	8.2	ASMNet
3D Face Reconstruction	300W	NME_inter-ocular (%, Common)	4.82	ASMNet
3D Face Reconstruction	300W	NME_inter-ocular (%, Full)	5.5	ASMNet
3D Face Reconstruction	WFLW	NME (inter-ocular)	9.41	MobileNetV2
3D Face Reconstruction	WFLW	NME (inter-ocular)	10.77	ASMNet
1 Image, 2*2 Stitchi	300W (Full)	MAE pitch (°)	1.8	ASMNet
1 Image, 2*2 Stitchi	300W (Full)	MAE roll (°)	1.24	ASMNet
1 Image, 2*2 Stitchi	300W (Full)	MAE yaw (°)	1.62	ASMNet
1 Image, 2*2 Stitchi	WFLW	MAE mean (°)	2.7	ASMNet
1 Image, 2*2 Stitchi	WFLW	MAE pitch (°)	2.93	ASMNet
1 Image, 2*2 Stitchi	WFLW	MAE roll (°)	2.21	ASMNet
1 Image, 2*2 Stitchi	WFLW	MAE yaw (°)	2.97	ASMNet
1 Image, 2*2 Stitchi	COFW	MAE pitch (°)	2.72	ASMNet
1 Image, 2*2 Stitchi	COFW	MAE yaw (°)	2.91	ASMNet
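
For readers unfamiliar with the two metrics in the table, here is a rough sketch of how they are commonly computed. It is an illustration under stated assumptions, not the evaluation code behind the numbers above: the function names are hypothetical, the eye landmark indices are passed in by the caller because they depend on the dataset's annotation scheme, and pose angles are assumed to be given in degrees.

```python
import numpy as np

def nme_inter_ocular(pred, gt, left_eye_idx, right_eye_idx):
    """Normalized mean error in percent for one face.

    pred, gt: (K, 2) arrays of predicted and ground-truth landmarks.
    left_eye_idx, right_eye_idx: indices of the two eye reference points
    in gt (assumption: chosen by the caller for the dataset in use).
    """
    inter_ocular = np.linalg.norm(gt[left_eye_idx] - gt[right_eye_idx])
    per_point_error = np.linalg.norm(pred - gt, axis=1)
    return 100.0 * per_point_error.mean() / inter_ocular

def pose_mae(pred_deg, gt_deg):
    """Per-angle mean absolute error over N faces.

    pred_deg, gt_deg: (N, 3) arrays of angles in degrees, one column per
    angle (the column order is the caller's convention).
    """
    return np.abs(pred_deg - gt_deg).mean(axis=0)
```

Lower is better for both metrics. Normalizing by the inter-ocular distance makes the landmark error comparable across faces of different sizes in the image.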
