Deep Label Distribution Learning with Label Ambiguity

Bin-Bin Gao, Chao Xing, Chen-Wei Xie, Jianxin Wu, Xin Geng

2016-11-06

Age Estimation, Semantic Segmentation, Pose Estimation, General Classification, Classification, Multi-Label Classification, Head Pose Estimation


Abstract

Convolutional Neural Networks (ConvNets) have achieved excellent recognition performance in various visual recognition tasks. A large labeled training set is one of the most important factors for their success. However, it is difficult to collect sufficient training images with precise labels in some domains such as apparent age estimation, head pose estimation, multi-label classification and semantic segmentation. Fortunately, there is ambiguous information among labels, which makes these tasks different from traditional classification. Based on this observation, we convert the label of each image into a discrete label distribution, and learn the label distribution by minimizing a Kullback-Leibler divergence between the predicted and ground-truth label distributions using deep ConvNets. The proposed DLDL (Deep Label Distribution Learning) method effectively utilizes the label ambiguity in both feature learning and classifier learning, which helps prevent the network from over-fitting even when the training set is small. Experimental results show that the proposed approach produces significantly better results than state-of-the-art methods for age estimation and head pose estimation. At the same time, it also improves recognition performance for multi-label classification and semantic segmentation tasks.
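The core idea in the abstract (turn a single label into a discrete label distribution, then minimize the KL divergence between that target and the predicted distribution) can be illustrated with a minimal NumPy sketch for age estimation. Everything specific here is an assumption made for illustration and is not taken from the abstract: the Gaussian shape of the target distribution, the value `sigma=2.0`, the age range 0 to 100, and the helper names `gaussian_label_distribution`, `softmax`, and `kl_divergence`. The all-zero logits stand in for the output of a ConvNet.

```python
import numpy as np


def gaussian_label_distribution(label, bins, sigma=2.0):
    """Spread a scalar label over discrete bins (assumed Gaussian shape)."""
    bins = np.asarray(bins, dtype=np.float64)
    weights = np.exp(-0.5 * ((bins - label) / sigma) ** 2)
    return weights / weights.sum()  # normalize so the entries sum to 1


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    return exps / exps.sum()


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) for two discrete distributions."""
    target = np.asarray(target, dtype=np.float64)
    predicted = np.asarray(predicted, dtype=np.float64)
    return float(np.sum(target * (np.log(target + eps) - np.log(predicted + eps))))


# Hypothetical setup: ages 0..100 as the label space, true age 30.
ages = np.arange(0, 101)
target = gaussian_label_distribution(30, ages, sigma=2.0)

# Placeholder network output: all-zero logits give a uniform prediction.
logits = np.zeros(ages.shape, dtype=np.float64)
predicted = softmax(logits)

# Training would minimize this loss with respect to the network weights.
loss = kl_divergence(target, predicted)

# One way to read out a scalar prediction: the expectation over the bins.
expected_age = float(np.dot(predicted, ages))
```

In this sketch the loss is zero only when the predicted distribution matches the target exactly, so neighboring ages near the true label also contribute to the training signal, which is the label ambiguity the abstract refers to.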

Results

Task	Dataset	Metric	Value	Model
Semantic Segmentation	PASCAL VOC 2012	Mean IoU	67.1	DLDL-8s+CRF
Semantic Segmentation	PASCAL VOC 2011	Mean IoU	67.6	DLDL-8s+CRF
Pose Estimation	Pointing'04	MAE	4.64	Ours DLDL (KL)
Pose Estimation	AFLW	MAE	9.78	DLDL (KL)
Pose Estimation	BJUT-3D	MAE	0.09	Ours DLDL (KL)
Multi-Label Classification	PASCAL VOC 2012	mAP	92.4	Ours PF-DLDL
Multi-Label Classification	PASCAL VOC 2007	mAP	93.4	Ours PF-DLDL
Age Estimation	ChaLearn 2015	MAE	3.51	DLDL+VGG-Face
Age Estimation	ChaLearn 2015	e-error	0.31	DLDL+VGG-Face
Age Estimation	MORPH Album2	MAE	2.42	DLDL+VGG-Face (KL, Max)3
