Multi-Label Image Recognition with Graph Convolutional Networks

Zhao-Min Chen, Xiu-Shen Wei, Peng Wang, Yanwen Guo

2019-04-07 | CVPR 2019 | Tasks: Long-tail Learning, Word Embeddings, Multi-Label Classification

Abstract

The task of multi-label image recognition is to predict the set of object labels present in an image. Because objects normally co-occur in an image, it is desirable to model label dependencies to improve recognition performance. To capture and exploit these dependencies, we propose a multi-label classification model based on a Graph Convolutional Network (GCN). The model builds a directed graph over the object labels, where each node (label) is represented by the word embedding of that label, and the GCN is learned to map this label graph into a set of inter-dependent object classifiers. These classifiers are applied to the image descriptors extracted by another sub-net, enabling the whole network to be trained end-to-end. Furthermore, we propose a novel re-weighting scheme to create an effective label correlation matrix that guides information propagation among the nodes in the GCN. Experiments on two multi-label image recognition datasets show that our approach clearly outperforms existing state-of-the-art methods. In addition, visualization analyses reveal that the classifiers learned by our model maintain a meaningful semantic topology.
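
The pipeline described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the two-layer depth, the LeakyReLU activation, the row normalization, and the values `tau=0.4` and `p=0.2` are all chosen here for illustration. The correlation matrix is built from conditional label co-occurrence (one plausible way to obtain a directed graph), thresholded, and then re-weighted so that each node keeps most of its own weight. The GCN maps label word embeddings to one classifier per label, and those classifiers are applied to image features by a dot product.

```python
import numpy as np

def build_label_graph(labels, tau=0.4, p=0.2):
    """Build a re-weighted, directed label correlation matrix.

    labels: (num_images, num_labels) binary ground-truth matrix.
    tau, p: threshold and neighbor weight; illustrative values (assumptions).
    """
    labels = np.asarray(labels, dtype=np.float64)
    cooccur = labels.T @ labels               # cooccur[i, j]: images with both i and j
    counts = np.diag(cooccur).copy()          # counts[i]: images containing label i
    np.fill_diagonal(cooccur, 0.0)
    # Conditional probability P(j | i); asymmetric, so the graph is directed.
    cond = cooccur / np.maximum(counts, 1.0)[:, None]
    binary = (cond >= tau).astype(np.float64)  # drop weak, noisy edges
    # Re-weight: neighbors share total weight p, the node itself keeps 1 - p.
    row_sum = binary.sum(axis=1, keepdims=True)
    adj = p * binary / np.maximum(row_sum, 1.0)
    adj += (1.0 - p) * np.eye(labels.shape[1])
    return adj

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def gcn_classifiers(adj, embeddings, w1, w2):
    """Two graph-convolution layers: label embeddings -> per-label classifiers."""
    a_hat = adj / adj.sum(axis=1, keepdims=True)   # simple row normalization (assumption)
    hidden = leaky_relu(a_hat @ embeddings @ w1)
    return a_hat @ hidden @ w2                     # (num_labels, feature_dim)

def predict_scores(image_features, classifiers):
    """Apply the learned classifiers to image descriptors via dot products."""
    return image_features @ classifiers.T          # (num_images, num_labels)

# Toy usage with random weights; a real model would learn w1 and w2 end-to-end
# together with the image sub-net, and would use pretrained word embeddings.
rng = np.random.default_rng(0)
labels = np.array([[1, 1, 0],
                   [1, 1, 0],
                   [1, 0, 1],
                   [0, 0, 1]])
adj = build_label_graph(labels)
embeddings = rng.normal(size=(3, 8))     # stand-in for label word embeddings
w1 = rng.normal(size=(8, 16))
w2 = rng.normal(size=(16, 32))
classifiers = gcn_classifiers(adj, embeddings, w1, w2)
features = rng.normal(size=(5, 32))      # stand-in for CNN image descriptors
scores = predict_scores(features, classifiers)
print(adj.shape, classifiers.shape, scores.shape)
```

In the toy data, label 2 always appears with label 0 half of the time, while label 0 rarely appears with label 2, so the edge exists in one direction only. That asymmetry is what makes the label graph directed.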

Results

Task	Dataset	Metric	Value	Model
Multi-Label Classification	PASCAL VOC 2007	mAP	94	ML-GCN (pretrain from ImageNet)
Image Classification	COCO-MLT	Average mAP	44.24	ML-GCN(ResNet-50)
Image Classification	VOC-MLT	Average mAP	68.92	ML-GCN(ResNet-50)
Few-Shot Image Classification	COCO-MLT	Average mAP	44.24	ML-GCN(ResNet-50)
Few-Shot Image Classification	VOC-MLT	Average mAP	68.92	ML-GCN(ResNet-50)
Generalized Few-Shot Classification	COCO-MLT	Average mAP	44.24	ML-GCN(ResNet-50)
Generalized Few-Shot Classification	VOC-MLT	Average mAP	68.92	ML-GCN(ResNet-50)
Long-tail Learning	COCO-MLT	Average mAP	44.24	ML-GCN(ResNet-50)
Long-tail Learning	VOC-MLT	Average mAP	68.92	ML-GCN(ResNet-50)
Generalized Few-Shot Learning	COCO-MLT	Average mAP	44.24	ML-GCN(ResNet-50)
Generalized Few-Shot Learning	VOC-MLT	Average mAP	68.92	ML-GCN(ResNet-50)
