Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Hierarchically Structured Reinforcement Learning for Topically Coherent Visual Story Generation

Qiuyuan Huang, Zhe Gan, Asli Celikyilmaz, Dapeng Wu, Jianfeng Wang, Xiaodong He

2018-05-21 · Reinforcement Learning · Story Generation · Visual Storytelling
Paper | PDF

Abstract

We propose a hierarchically structured reinforcement learning approach to address the challenges of planning for generating coherent multi-sentence stories for the visual storytelling task. Within our framework, the task of generating a story given a sequence of images is divided across a two-level hierarchical decoder. The high-level decoder constructs a plan by generating a semantic concept (i.e., topic) for each image in sequence. The low-level decoder generates a sentence for each image using a semantic compositional network, which effectively grounds the sentence generation conditioned on the topic. The two decoders are jointly trained end-to-end using reinforcement learning. We evaluate our model on the visual storytelling (VIST) dataset. Empirical results from both automatic and human evaluations demonstrate that the proposed hierarchically structured reinforced training achieves significantly better performance compared to a strong flat deep reinforcement learning baseline.
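The two-level decoding loop described in the abstract can be sketched in plain Python. This is an illustrative stand-in, not the authors' implementation: the topic list and both decoder functions below are hypothetical placeholders for the learned high-level (topic planner) and low-level (semantic compositional network) decoders.

```python
def high_level_decoder(image, plan_so_far):
    """Stand-in topic planner: picks a semantic concept (topic) for one image,
    conditioned on the topics already planned. A hypothetical topic inventory
    replaces the learned model."""
    topics = ["arrival", "activity", "meal", "group photo", "departure"]
    return topics[hash((image, len(plan_so_far))) % len(topics)]

def low_level_decoder(image, topic):
    """Stand-in sentence generator: realizes one sentence for the image,
    grounded in the planned topic."""
    return f"A sentence about '{image}' grounded in the topic '{topic}'."

def generate_story(images):
    """Hierarchical decoding: first plan a topic per image (high level),
    then generate a sentence per image conditioned on its topic (low level)."""
    plan, story = [], []
    for image in images:
        topic = high_level_decoder(image, plan)        # high-level plan step
        plan.append(topic)
        story.append(low_level_decoder(image, topic))  # low-level realization
    return plan, story

plan, story = generate_story(["img1.jpg", "img2.jpg", "img3.jpg"])
for topic, sentence in zip(plan, story):
    print(f"[{topic}] {sentence}")
```

In the paper, both levels are trained jointly end-to-end with reinforcement learning; here the loop only shows the plan-then-realize control flow.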

Results

Task                    | Dataset | Metric  | Value | Model
------------------------|---------|---------|-------|-----------------------
Text Generation         | VIST    | BLEU-4  | 12.32 | HSRL w/ Joint Training
Text Generation         | VIST    | CIDEr   | 10.71 | HSRL w/ Joint Training
Text Generation         | VIST    | METEOR  | 35.23 | HSRL w/ Joint Training
Text Generation         | VIST    | ROUGE-L | 30.84 | HSRL w/ Joint Training
Text Generation         | VIST    | SPICE   | 12.97 | HSRL w/ Joint Training
Data-to-Text Generation | VIST    | BLEU-4  | 12.32 | HSRL w/ Joint Training
Data-to-Text Generation | VIST    | CIDEr   | 10.71 | HSRL w/ Joint Training
Data-to-Text Generation | VIST    | METEOR  | 35.23 | HSRL w/ Joint Training
Data-to-Text Generation | VIST    | ROUGE-L | 30.84 | HSRL w/ Joint Training
Data-to-Text Generation | VIST    | SPICE   | 12.97 | HSRL w/ Joint Training
Visual Storytelling     | VIST    | BLEU-4  | 12.32 | HSRL w/ Joint Training
Visual Storytelling     | VIST    | CIDEr   | 10.71 | HSRL w/ Joint Training
Visual Storytelling     | VIST    | METEOR  | 35.23 | HSRL w/ Joint Training
Visual Storytelling     | VIST    | ROUGE-L | 30.84 | HSRL w/ Joint Training
Visual Storytelling     | VIST    | SPICE   | 12.97 | HSRL w/ Joint Training
Story Generation        | VIST    | BLEU-4  | 12.32 | HSRL w/ Joint Training
Story Generation        | VIST    | CIDEr   | 10.71 | HSRL w/ Joint Training
Story Generation        | VIST    | METEOR  | 35.23 | HSRL w/ Joint Training
Story Generation        | VIST    | ROUGE-L | 30.84 | HSRL w/ Joint Training
Story Generation        | VIST    | SPICE   | 12.97 | HSRL w/ Joint Training

Related Papers

CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning (2025-07-18)
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning (2025-07-17)
Spectral Bellman Method: Unifying Representation and Exploration in RL (2025-07-17)
Aligning Humans and Robots via Reinforcement Learning from Implicit Human Feedback (2025-07-17)
VAR-MATH: Probing True Mathematical Reasoning in Large Language Models via Symbolic Multi-Instance Benchmarks (2025-07-17)
QuestA: Expanding Reasoning Capacity in LLMs via Question Augmentation (2025-07-17)
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities (2025-07-17)
Autonomous Resource Management in Microservice Systems via Reinforcement Learning (2025-07-17)