Detecting Visual Relationships with Deep Relational Networks

Bo Dai, Yuqi Zhang, Dahua Lin

2017-04-11 | CVPR 2017 7 | General Classification

Abstract

Relationships among objects play a crucial role in image understanding. Despite the great success of deep learning techniques in recognizing individual objects, reasoning about the relationships among objects remains a challenging task. Previous methods often treat this as a classification problem, considering each type of relationship (e.g. "ride") or each distinct visual phrase (e.g. "person-ride-horse") as a category. Such approaches face significant difficulties caused by the high diversity of visual appearance for each kind of relationship or the large number of distinct visual phrases. We propose an integrated framework to tackle this problem. At the heart of this framework is the Deep Relational Network, a novel formulation designed specifically for exploiting the statistical dependencies between objects and their relationships. On two large datasets, the proposed method achieves substantial improvement over the state of the art.
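To make the "large number of distinct visual phrases" concrete, the following minimal sketch counts how many categories a phrase-level classifier would need if every subject-predicate-object combination were its own class. The function name and the vocabulary sizes are illustrative assumptions, not figures taken from this page.

```python
def num_phrase_categories(num_objects: int, num_predicates: int) -> int:
    """Count the distinct (subject, predicate, object) phrases.

    Assumes every object class may appear as subject or as object, and that
    every predicate may link any such pair (an upper bound, since many
    combinations never occur in practice).
    """
    return num_objects * num_predicates * num_objects


# Hypothetical vocabulary sizes, chosen only for illustration.
objects, predicates = 100, 70

# Treating each predicate as a category needs only `predicates` classes,
# but each class then covers very diverse visual appearances.
print("predicate-level categories:", predicates)

# Treating each full phrase as a category multiplies the class count.
print("phrase-level categories:", num_phrase_categories(objects, predicates))
```

The two counts illustrate the trade-off the abstract describes: few categories with high visual diversity, or a very large number of categories with few examples each.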

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Scene Parsing | VRD Relationship Detection | R@100 | 20.88 | Dai et al. [Dai, Zhang, and Lin 2017] |
| Scene Parsing | VRD Relationship Detection | R@50 | 17.73 | Dai et al. [Dai, Zhang, and Lin 2017] |
| Scene Parsing | VRD Predicate Detection | R@100 | 81.9 | Dai et al. [Dai, Zhang, and Lin 2017] |
| Scene Parsing | VRD Predicate Detection | R@50 | 80.78 | Dai et al. [Dai, Zhang, and Lin 2017] |
| Scene Parsing | VRD Phrase Detection | R@100 | 23.45 | Dai et al. [Dai, Zhang, and Lin 2017] |
| Scene Parsing | VRD Phrase Detection | R@50 | 19.93 | Dai et al. [Dai, Zhang, and Lin 2017] |
| Visual Relationship Detection | VRD Relationship Detection | R@100 | 20.88 | Dai et al. [Dai, Zhang, and Lin 2017] |
| Visual Relationship Detection | VRD Relationship Detection | R@50 | 17.73 | Dai et al. [Dai, Zhang, and Lin 2017] |
| Visual Relationship Detection | VRD Predicate Detection | R@100 | 81.9 | Dai et al. [Dai, Zhang, and Lin 2017] |
| Visual Relationship Detection | VRD Predicate Detection | R@50 | 80.78 | Dai et al. [Dai, Zhang, and Lin 2017] |
| Visual Relationship Detection | VRD Phrase Detection | R@100 | 23.45 | Dai et al. [Dai, Zhang, and Lin 2017] |
| Visual Relationship Detection | VRD Phrase Detection | R@50 | 19.93 | Dai et al. [Dai, Zhang, and Lin 2017] |
| Scene Understanding | VRD Relationship Detection | R@100 | 20.88 | Dai et al. [Dai, Zhang, and Lin 2017] |
| Scene Understanding | VRD Relationship Detection | R@50 | 17.73 | Dai et al. [Dai, Zhang, and Lin 2017] |
| Scene Understanding | VRD Predicate Detection | R@100 | 81.9 | Dai et al. [Dai, Zhang, and Lin 2017] |
| Scene Understanding | VRD Predicate Detection | R@50 | 80.78 | Dai et al. [Dai, Zhang, and Lin 2017] |
| Scene Understanding | VRD Phrase Detection | R@100 | 23.45 | Dai et al. [Dai, Zhang, and Lin 2017] |
| Scene Understanding | VRD Phrase Detection | R@50 | 19.93 | Dai et al. [Dai, Zhang, and Lin 2017] |
| 2D Semantic Segmentation | VRD Relationship Detection | R@100 | 20.88 | Dai et al. [Dai, Zhang, and Lin 2017] |
| 2D Semantic Segmentation | VRD Relationship Detection | R@50 | 17.73 | Dai et al. [Dai, Zhang, and Lin 2017] |
| 2D Semantic Segmentation | VRD Predicate Detection | R@100 | 81.9 | Dai et al. [Dai, Zhang, and Lin 2017] |
| 2D Semantic Segmentation | VRD Predicate Detection | R@50 | 80.78 | Dai et al. [Dai, Zhang, and Lin 2017] |
| 2D Semantic Segmentation | VRD Phrase Detection | R@100 | 23.45 | Dai et al. [Dai, Zhang, and Lin 2017] |
| 2D Semantic Segmentation | VRD Phrase Detection | R@50 | 19.93 | Dai et al. [Dai, Zhang, and Lin 2017] |
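
The R@50 and R@100 metrics above are recall at K: the share of ground-truth relationships recovered among the K highest-scoring predictions. The sketch below is a simplified, single-image illustration only. It assumes triplets are matched by their labels alone and ignores bounding-box localization, which the actual benchmark protocol also checks; the function name and the example data are hypothetical.

```python
from typing import List, Tuple

Triplet = Tuple[str, str, str]  # (subject, predicate, object)


def recall_at_k(
    predictions: List[Tuple[Triplet, float]],
    ground_truth: List[Triplet],
    k: int,
) -> float:
    """Fraction of ground-truth triplets found in the top-k scored predictions.

    Simplifying assumption: a prediction matches when its three labels are
    equal to a ground-truth triplet; box overlap is not checked.
    """
    unique_truth = set(ground_truth)
    if not unique_truth:
        return 0.0
    # Rank predictions by confidence, highest first, and keep the best k.
    ranked = sorted(predictions, key=lambda item: item[1], reverse=True)
    top_k = {triplet for triplet, _score in ranked[:k]}
    hits = sum(1 for triplet in unique_truth if triplet in top_k)
    return hits / len(unique_truth)


# Hypothetical predictions for one image, each with a confidence score.
preds = [
    (("person", "ride", "horse"), 0.9),
    (("person", "wear", "hat"), 0.8),
    (("horse", "on", "grass"), 0.1),
]
truth = [("person", "ride", "horse"), ("horse", "on", "grass")]

print(recall_at_k(preds, truth, k=2))
print(recall_at_k(preds, truth, k=3))
```

Raising K can only keep or increase recall, which is consistent with every R@100 value in the table being at least as high as the corresponding R@50 value.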
