Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Reasoning Visual Dialogs with Structural and Partial Observations

Zilong Zheng, Wenguan Wang, Siyuan Qi, Song-Chun Zhu

2019-04-11 · CVPR 2019 · Visual Dialog
Paper · PDF · Code (official)

Abstract

We propose a novel model to address the task of Visual Dialog, which exhibits complex dialog structures. To obtain a reasonable answer based on the current question and the dialog history, the underlying semantic dependencies between dialog entities are essential. In this paper, we explicitly formalize this task as inference in a graphical model with partially observed nodes and unknown graph structures (relations in dialog). The given dialog entities are viewed as the observed nodes. The answer to a given question is represented by a node with a missing value. We first introduce an Expectation Maximization algorithm to infer both the underlying dialog structures and the missing node values (desired answers). Based on this, we proceed to propose a differentiable graph neural network (GNN) solution that approximates this process. Experimental results on the VisDial and VisDial-Q datasets show that our model outperforms comparative methods. We also observe that our method can infer the underlying dialog structure for better dialog reasoning.
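The core idea in the abstract can be illustrated with a minimal sketch (not the authors' implementation): dialog entities are observed nodes, the answer is one unobserved node, and inference alternates between estimating soft edge weights to the unobserved node (E-step, since the graph structure is unknown) and updating its value by weighted message passing (M-step). All function names, dimensions, and the dot-product affinity are hypothetical simplifications.

```python
# Illustrative EM-style inference over a graph with one unobserved node.
# This is a simplified sketch of the paper's setup, not its actual model.
import numpy as np

def em_infer_missing_node(observed, n_iters=10):
    """observed: (n, d) array of embeddings for observed dialog entities.
    Returns (estimate for the missing answer node, soft edge weights)."""
    n, d = observed.shape
    missing = observed.mean(axis=0)  # init: average of observed nodes
    weights = np.full(n, 1.0 / n)
    for _ in range(n_iters):
        # E-step: soft edge weights from the missing node to each observed
        # node, via a softmax over dot-product affinities (the graph
        # structure is unknown, so it is estimated rather than given).
        scores = observed @ missing
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        # M-step: update the missing node's value as the weighted
        # aggregate of messages from the observed nodes.
        missing = weights @ observed
    return missing, weights

rng = np.random.default_rng(0)
obs = rng.normal(size=(5, 8))        # 5 observed entities, 8-dim embeddings
answer, edge_weights = em_infer_missing_node(obs)
```

The paper's GNN then makes this alternating procedure differentiable end to end; here the two steps are just run to convergence for clarity.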

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Dialogue | VisDial v0.9 val | MRR | 0.6285 | GNN |
| Dialogue | VisDial v0.9 val | Mean Rank | 4.57 | GNN |
| Dialogue | VisDial v0.9 val | R@1 | 48.95 | GNN |
| Dialogue | VisDial v0.9 val | R@10 | 88.36 | GNN |
| Dialogue | VisDial v0.9 val | R@5 | 79.65 | GNN |
| Dialogue | Visual Dialog v1.0 test-std | MRR (x 100) | 61.37 | GNN |
| Dialogue | Visual Dialog v1.0 test-std | Mean | 4.57 | GNN |
| Dialogue | Visual Dialog v1.0 test-std | NDCG (x 100) | 52.82 | GNN |
| Dialogue | Visual Dialog v1.0 test-std | R@1 | 47.33 | GNN |
| Dialogue | Visual Dialog v1.0 test-std | R@10 | 87.83 | GNN |
| Dialogue | Visual Dialog v1.0 test-std | R@5 | 77.98 | GNN |
| Visual Dialog | VisDial v0.9 val | MRR | 0.6285 | GNN |
| Visual Dialog | VisDial v0.9 val | Mean Rank | 4.57 | GNN |
| Visual Dialog | VisDial v0.9 val | R@1 | 48.95 | GNN |
| Visual Dialog | VisDial v0.9 val | R@10 | 88.36 | GNN |
| Visual Dialog | VisDial v0.9 val | R@5 | 79.65 | GNN |
| Visual Dialog | Visual Dialog v1.0 test-std | MRR (x 100) | 61.37 | GNN |
| Visual Dialog | Visual Dialog v1.0 test-std | Mean | 4.57 | GNN |
| Visual Dialog | Visual Dialog v1.0 test-std | NDCG (x 100) | 52.82 | GNN |
| Visual Dialog | Visual Dialog v1.0 test-std | R@1 | 47.33 | GNN |
| Visual Dialog | Visual Dialog v1.0 test-std | R@10 | 87.83 | GNN |
| Visual Dialog | Visual Dialog v1.0 test-std | R@5 | 77.98 | GNN |
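The MRR, R@k, and Mean Rank figures above are standard retrieval metrics: each question comes with a ranked list of candidate answers, and the metrics summarize the 1-based rank of the ground-truth answer across questions. A small sketch of how such numbers are computed (the `ranks` list here is made-up example data, not the paper's):

```python
# Compute VisDial-style ranking metrics from ground-truth answer ranks.
# `ranks` holds, per question, the 1-based rank of the correct answer
# among the candidate answers.
import numpy as np

def ranking_metrics(ranks):
    ranks = np.asarray(ranks, dtype=float)
    return {
        "MRR": float(np.mean(1.0 / ranks)),          # mean reciprocal rank
        "R@1": float(np.mean(ranks <= 1) * 100),     # % with answer ranked 1st
        "R@5": float(np.mean(ranks <= 5) * 100),     # % with answer in top 5
        "R@10": float(np.mean(ranks <= 10) * 100),   # % with answer in top 10
        "Mean Rank": float(np.mean(ranks)),          # average rank (lower = better)
    }

m = ranking_metrics([1, 3, 2, 12, 1])  # hypothetical ranks for 5 questions
```

Note that the v0.9 rows report MRR on a 0-1 scale while the v1.0 rows report it scaled by 100; NDCG (v1.0 only) additionally credits all candidate answers annotated as relevant, which this sketch omits.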

Related Papers

- V$^2$Dial: Unification of Video and Visual Dialog via Multimodal Experts (2025-03-03)
- V^2Dial: Unification of Video and Visual Dialog via Multimodal Experts (2025-01-01)
- Enhancing Visual Dialog State Tracking through Iterative Object-Entity Alignment in Multi-Round Conversations (2024-08-13)
- ICCV23 Visual-Dialog Emotion Explanation Challenge: SEU_309 Team Technical Report (2024-07-13)
- Hawk: Learning to Understand Open-World Video Anomalies (2024-05-27)
- Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models (2024-03-27)
- FlexCap: Describe Anything in Images in Controllable Detail (2024-03-18)
- $\mathbb{VD}$-$\mathbb{GR}$: Boosting $\mathbb{V}$isual $\mathbb{D}$ialog with Cascaded Spatial-Temporal Multi-Modal $\mathbb{GR}$aphs (2023-10-25)