Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Multi-Label Image Recognition with Graph Convolutional Networks

Zhao-Min Chen, Xiu-Shen Wei, Peng Wang, Yanwen Guo

2019-04-07 | CVPR 2019 | Long-tail Learning, Word Embeddings, Multi-Label Classification

Abstract

The task of multi-label image recognition is to predict a set of object labels that are present in an image. As objects normally co-occur in an image, it is desirable to model the label dependencies to improve the recognition performance. To capture and explore such important dependencies, we propose a multi-label classification model based on a Graph Convolutional Network (GCN). The model builds a directed graph over the object labels, where each node (label) is represented by word embeddings of a label, and the GCN is learned to map this label graph into a set of inter-dependent object classifiers. These classifiers are applied to the image descriptors extracted by another sub-net, enabling the whole network to be end-to-end trainable. Furthermore, we propose a novel re-weighted scheme to create an effective label correlation matrix to guide information propagation among the nodes in the GCN. Experiments on two multi-label image recognition datasets show that our approach obviously outperforms other existing state-of-the-art methods. In addition, visualization analyses reveal that the classifiers learned by our model maintain meaningful semantic topology.
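The pipeline described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the abstract does not spell out the re-weighting scheme, so the version below (binarize conditional co-occurrence probabilities at a threshold `tau`, then give weight `1 - p` to each node and spread `p` over its neighbours) is an assumption, as are the two-layer depth, the LeakyReLU slope, the normalization, and every function and parameter name.

```python
import numpy as np

def label_correlation(Y, tau=0.4, p=0.2):
    """Directed label correlation matrix from a binary label matrix Y (N x C).

    Assumed scheme (not given in the abstract): P(L_j | L_i) is binarized
    at tau; each row then keeps 1 - p for the node itself and shares p
    equally among its remaining neighbours.
    """
    Y = np.asarray(Y, dtype=float)
    co = Y.T @ Y                                   # co[i, j]: images with both i and j
    counts = np.diag(co).copy()                    # counts[i]: images with label i
    cond = co / np.maximum(counts[:, None], 1.0)   # cond[i, j] = P(L_j | L_i)
    B = (cond >= tau).astype(float)
    np.fill_diagonal(B, 0.0)
    row_sum = B.sum(axis=1, keepdims=True)
    A = p * B / np.maximum(row_sum, 1.0)           # neighbours share weight p
    np.fill_diagonal(A, 1.0 - p)                   # self weight
    return A

def normalize(A):
    """Degree-normalize A as D^{-1/2} A D^{-1/2}, with D built from row sums."""
    d = A.sum(axis=1) ** -0.5
    return d[:, None] * A * d[None, :]

def gcn_classifiers(E, A_hat, W1, W2, slope=0.2):
    """Map label embeddings E (C x d) to C classifiers of dimension D."""
    H = A_hat @ E @ W1
    H = np.where(H > 0, H, slope * H)              # LeakyReLU
    return A_hat @ H @ W2                          # shape (C, D)

def predict_scores(x, classifiers):
    """Score one image descriptor x (D,) against every label classifier."""
    return classifiers @ x                         # shape (C,)

# Toy usage: 4 images, 3 labels, random stand-ins for embeddings and weights.
Y = np.array([[1, 1, 0], [1, 0, 0], [0, 1, 1], [1, 1, 0]])
rng = np.random.default_rng(0)
E = rng.normal(size=(3, 5))      # hypothetical 5-d word embeddings
W1 = rng.normal(size=(5, 4))
W2 = rng.normal(size=(4, 6))
x = rng.normal(size=6)           # hypothetical 6-d image descriptor

A = label_correlation(Y)
W = gcn_classifiers(E, normalize(A), W1, W2)
scores = predict_scores(x, W)
```

In this toy example label 2 always co-occurs with label 1, but not the other way round, so the resulting matrix is asymmetric, which is why the label graph is directed. In a trained model the weights `W1` and `W2` and the image sub-net would be learned jointly from a multi-label loss; here they are random placeholders.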

Results

Task | Dataset | Metric | Value | Model
Multi-Label Classification | PASCAL VOC 2007 | mAP | 94 | ML-GCN (pretrain from ImageNet)
Image Classification | COCO-MLT | Average mAP | 44.24 | ML-GCN(ResNet-50)
Image Classification | VOC-MLT | Average mAP | 68.92 | ML-GCN(ResNet-50)
Few-Shot Image Classification | COCO-MLT | Average mAP | 44.24 | ML-GCN(ResNet-50)
Few-Shot Image Classification | VOC-MLT | Average mAP | 68.92 | ML-GCN(ResNet-50)
Generalized Few-Shot Classification | COCO-MLT | Average mAP | 44.24 | ML-GCN(ResNet-50)
Generalized Few-Shot Classification | VOC-MLT | Average mAP | 68.92 | ML-GCN(ResNet-50)
Long-tail Learning | COCO-MLT | Average mAP | 44.24 | ML-GCN(ResNet-50)
Long-tail Learning | VOC-MLT | Average mAP | 68.92 | ML-GCN(ResNet-50)
Generalized Few-Shot Learning | COCO-MLT | Average mAP | 44.24 | ML-GCN(ResNet-50)
Generalized Few-Shot Learning | VOC-MLT | Average mAP | 68.92 | ML-GCN(ResNet-50)
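
For readers unfamiliar with the mAP metric reported above, the following is a minimal sketch of non-interpolated mean average precision for multi-label scores. It is an illustration only: the exact evaluation protocol behind each benchmark entry is not stated on this page and may differ (for example, interpolated variants exist), and the function names are hypothetical. The result is a fraction in [0, 1]; the table appears to report values on a 0 to 100 scale.

```python
import numpy as np

def average_precision(scores, targets):
    """Non-interpolated AP for one label: mean precision at each positive's rank."""
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    hits = np.asarray(targets)[order] > 0
    if not hits.any():
        return float("nan")                    # label has no positives
    ranks = np.arange(1, len(hits) + 1)
    precision_at_k = np.cumsum(hits) / ranks
    return float(precision_at_k[hits].mean())

def mean_average_precision(S, T):
    """Mean of per-label AP over the columns of S (scores) and T (0/1 targets)."""
    S, T = np.asarray(S), np.asarray(T)
    aps = [average_precision(S[:, c], T[:, c]) for c in range(S.shape[1])]
    return float(np.nanmean(aps))              # skip labels without positives

# Toy example: 4 images, 2 labels.
S = np.array([[0.9, 0.1], [0.8, 0.9], [0.7, 0.2], [0.6, 0.8]])
T = np.array([[1, 0], [0, 1], [1, 0], [0, 1]])
ap0 = average_precision(S[:, 0], T[:, 0])
m = mean_average_precision(S, T)
```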

Related Papers

Speak2Sign3D: A Multi-modal Pipeline for English Speech to American Sign Language Animation (2025-07-09)
Computational Detection of Intertextual Parallels in Biblical Hebrew: A Benchmark Study Using Transformer-Based Language Models (2025-06-30)
Including Semantic Information via Word Embeddings for Skeleton-based Action Recognition (2025-06-23)
Low-resource keyword spotting using contrastively trained transformer acoustic word embeddings (2025-06-21)
Privacy-Preserving Chest X-ray Classification in Latent Space with Homomorphically Encrypted Neural Inference (2025-06-18)
Explainable Detection of Implicit Influential Patterns in Conversations via Data Augmentation (2025-06-17)
Characterizing Linguistic Shifts in Croatian News via Diachronic Word Embeddings (2025-06-16)
AgriPotential: A Novel Multi-Spectral and Multi-Temporal Remote Sensing Dataset for Agricultural Potentials (2025-06-13)