Detecting Visual Relationships with Deep Relational Networks

Bo Dai, Yuqi Zhang, Dahua Lin

2017-04-11CVPR 2017 7General Classification

Abstract

Relationships among objects play a crucial role in image understanding. Despite the great success of deep learning techniques in recognizing individual objects, reasoning about the relationships among objects remains a challenging task. Previous methods often treat this as a classification problem, considering each type of relationship (e.g. "ride") or each distinct visual phrase (e.g. "person-ride-horse") as a category. Such approaches are faced with significant difficulties caused by the high diversity of visual appearance for each kind of relationships or the large number of distinct visual phrases. We propose an integrated framework to tackle this problem. At the heart of this framework is the Deep Relational Network, a novel formulation designed specifically for exploiting the statistical dependencies between objects and their relationships. On two large datasets, the proposed method achieves substantial improvement over state-of-the-art.

Results

Task	Dataset	Metric	Value	Model
Scene Parsing	VRD Relationship Detection	R@100	20.88	Dai et. al [[Dai, Zhang, and Lin2017]]
Scene Parsing	VRD Relationship Detection	R@50	17.73	Dai et. al [[Dai, Zhang, and Lin2017]]
Scene Parsing	VRD Predicate Detection	R@100	81.9	Dai et. al [[Dai, Zhang, and Lin2017]]
Scene Parsing	VRD Predicate Detection	R@50	80.78	Dai et. al [[Dai, Zhang, and Lin2017]]
Scene Parsing	VRD Phrase Detection	R@100	23.45	Dai et. al [[Dai, Zhang, and Lin2017]]
Scene Parsing	VRD Phrase Detection	R@50	19.93	Dai et. al [[Dai, Zhang, and Lin2017]]
Visual Relationship Detection	VRD Relationship Detection	R@100	20.88	Dai et. al [[Dai, Zhang, and Lin2017]]
Visual Relationship Detection	VRD Relationship Detection	R@50	17.73	Dai et. al [[Dai, Zhang, and Lin2017]]
Visual Relationship Detection	VRD Predicate Detection	R@100	81.9	Dai et. al [[Dai, Zhang, and Lin2017]]
Visual Relationship Detection	VRD Predicate Detection	R@50	80.78	Dai et. al [[Dai, Zhang, and Lin2017]]
Visual Relationship Detection	VRD Phrase Detection	R@100	23.45	Dai et. al [[Dai, Zhang, and Lin2017]]
Visual Relationship Detection	VRD Phrase Detection	R@50	19.93	Dai et. al [[Dai, Zhang, and Lin2017]]
Scene Understanding	VRD Relationship Detection	R@100	20.88	Dai et. al [[Dai, Zhang, and Lin2017]]
Scene Understanding	VRD Relationship Detection	R@50	17.73	Dai et. al [[Dai, Zhang, and Lin2017]]
Scene Understanding	VRD Predicate Detection	R@100	81.9	Dai et. al [[Dai, Zhang, and Lin2017]]
Scene Understanding	VRD Predicate Detection	R@50	80.78	Dai et. al [[Dai, Zhang, and Lin2017]]
Scene Understanding	VRD Phrase Detection	R@100	23.45	Dai et. al [[Dai, Zhang, and Lin2017]]
Scene Understanding	VRD Phrase Detection	R@50	19.93	Dai et. al [[Dai, Zhang, and Lin2017]]
2D Semantic Segmentation	VRD Relationship Detection	R@100	20.88	Dai et. al [[Dai, Zhang, and Lin2017]]
2D Semantic Segmentation	VRD Relationship Detection	R@50	17.73	Dai et. al [[Dai, Zhang, and Lin2017]]
2D Semantic Segmentation	VRD Predicate Detection	R@100	81.9	Dai et. al [[Dai, Zhang, and Lin2017]]
2D Semantic Segmentation	VRD Predicate Detection	R@50	80.78	Dai et. al [[Dai, Zhang, and Lin2017]]
2D Semantic Segmentation	VRD Phrase Detection	R@100	23.45	Dai et. al [[Dai, Zhang, and Lin2017]]
2D Semantic Segmentation	VRD Phrase Detection	R@50	19.93	Dai et. al [[Dai, Zhang, and Lin2017]]

Detecting Visual Relationships with Deep Relational Networks

Abstract

Results

Related Papers

Detecting Visual Relationships with Deep Relational Networks

Abstract

Results

Related Papers