Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Transformer-based Dual Relation Graph for Multi-label Image Recognition

Jiawei Zhao, Ke Yan, Yifan Zhao, Xiaowei Guo, Feiyue Huang, Jia Li

2021-10-10 | ICCV 2021 10 | Multi-Label Classification

Abstract

Recognizing multiple objects in a single image remains challenging because of varying object scales, inconsistent appearances, and confusable inter-class relationships. Recent work mainly relies on statistical label co-occurrence and linguistic word embeddings to sharpen unclear semantics. In contrast, we propose a novel Transformer-based Dual Relation learning framework that constructs complementary relationships by exploring two aspects of correlation: a structural relation graph and a semantic relation graph. The structural relation graph captures long-range correlations from object context through a cross-scale transformer-based architecture. The semantic relation graph dynamically models the semantic meanings of image objects under explicit semantic-aware constraints. We also incorporate the learnt structural relationships into the semantic graph, constructing a joint relation graph for robust representations. With collaborative learning of these two relation graphs, our approach achieves new state-of-the-art results on two popular multi-label recognition benchmarks, MS-COCO and VOC 2007.

Results

Task                        Dataset          Metric  Value  Model
Multi-Label Classification  MS-COCO          mAP     86     TDRG-R101 (576×576)
Multi-Label Classification  MS-COCO          mAP     84.6   TDRG-R101 (448×448)
Multi-Label Classification  PASCAL VOC 2007  mAP     95     TDRG-R101 (448×448)
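
The metric reported for all three results is mAP (mean average precision). As a rough illustration of how such a number is typically obtained for multi-label classification, the sketch below computes a non-interpolated average precision per class and averages over classes. This is an assumption for illustration, not the paper's evaluation code: the function names and the toy data are hypothetical, and the official MS-COCO and VOC 2007 protocols may differ (VOC 2007, for instance, is traditionally scored with an 11-point interpolated AP).

```python
from typing import List, Optional, Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> Optional[float]:
    """Non-interpolated AP for one class.

    Images are ranked by descending score; AP is the mean of the precision
    measured at the rank of each positive image. Returns None when the class
    has no positive image, since AP is undefined in that case.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    positives_seen = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            positives_seen += 1
            precision_sum += positives_seen / rank
    if positives_seen == 0:
        return None
    return precision_sum / positives_seen


def mean_average_precision(score_rows: List[Sequence[float]],
                           label_rows: List[Sequence[int]]) -> float:
    """mAP in percent: the mean of per-class AP over classes with positives.

    score_rows[i][c] is the predicted score of class c for image i;
    label_rows[i][c] is 1 if class c is present in image i, else 0.
    """
    num_classes = len(score_rows[0])
    per_class = []
    for c in range(num_classes):
        ap = average_precision([row[c] for row in score_rows],
                               [row[c] for row in label_rows])
        if ap is not None:
            per_class.append(ap)
    if not per_class:
        raise ValueError("no class has a positive example")
    return 100.0 * sum(per_class) / len(per_class)


# Toy data (hypothetical): 4 images, 2 classes.
demo_scores = [[0.9, 0.2], [0.8, 0.7], [0.3, 0.6], [0.1, 0.4]]
demo_labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
demo_map = mean_average_precision(demo_scores, demo_labels)
print(round(demo_map, 2))  # 91.67
```

In the toy data, class 0 has positives at ranks 1 and 3 (AP = (1 + 2/3) / 2), while class 1 has both positives ranked first (AP = 1), so the mean over the two classes is 11/12, or about 91.67 on the 0 to 100 scale used in the table.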

Related Papers

Privacy-Preserving Chest X-ray Classification in Latent Space with Homomorphically Encrypted Neural Inference (2025-06-18)
Explainable Detection of Implicit Influential Patterns in Conversations via Data Augmentation (2025-06-17)
AgriPotential: A Novel Multi-Spectral and Multi-Temporal Remote Sensing Dataset for Agricultural Potentials (2025-06-13)
MUDAS: Mote-scale Unsupervised Domain Adaptation in Multi-label Sound Classification (2025-06-12)
ToxSyn-PT: A Large-Scale Synthetic Dataset for Hate Speech Detection in Portuguese (2025-06-11)
Single GPU Task Adaptation of Pathology Foundation Models for Whole Slide Image Analysis (2025-06-05)
PatchDEMUX: A Certifiably Robust Framework for Multi-label Classifiers Against Adversarial Patches (2025-05-30)
Efficient Text Encoders for Labor Market Analysis (2025-05-30)