
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Commonsense Knowledge Aware Concept Selection For Diverse and Informative Visual Storytelling

Hong Chen, Yifei Huang, Hiroya Takamura, Hideki Nakayama

Published: 2021-02-05
Tasks: Informativeness, Visual Storytelling

Abstract

Visual storytelling is a task of generating relevant and interesting stories for given image sequences. In this work we aim at increasing the diversity of the generated stories while preserving the informative content from the images. We propose to foster the diversity and informativeness of a generated story by using a concept selection module that suggests a set of concept candidates. Then, we utilize a large scale pre-trained model to convert concepts and images into full stories. To enrich the candidate concepts, a commonsense knowledge graph is created for each image sequence from which the concept candidates are proposed. To obtain appropriate concepts from the graph, we propose two novel modules that consider the correlation among candidate concepts and the image-concept correlation. Extensive automatic and human evaluation results demonstrate that our model can produce reasonable concepts. This enables our model to outperform the previous models by a large margin on the diversity and informativeness of the story, while retaining the relevance of the story to the image sequence.
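
The abstract says concepts are chosen by weighing both the correlation among candidate concepts and the image-concept correlation. The sketch below illustrates that general idea with a simple greedy rule. It is not the paper's selection modules: the function name `select_concepts`, the additive scoring, and all scores in the example are hypothetical and exist only for illustration.

```python
from typing import Dict, List, Tuple


def select_concepts(
    image_scores: Dict[str, float],
    pair_scores: Dict[Tuple[str, str], float],
    k: int,
) -> List[str]:
    """Greedily pick up to k concepts (illustrative assumption, not the paper's method).

    image_scores: hypothetical image-concept relevance per candidate concept.
    pair_scores:  hypothetical symmetric concept-concept correlation; missing pairs count as 0.
    """

    def pair(a: str, b: str) -> float:
        # Look the pair up in either order so callers need to store it only once.
        return pair_scores.get((a, b), pair_scores.get((b, a), 0.0))

    selected: List[str] = []
    remaining = set(image_scores)
    while remaining and len(selected) < k:

        def gain(c: str) -> float:
            # Image relevance plus mean correlation with the concepts chosen so far.
            if not selected:
                return image_scores[c]
            coherence = sum(pair(c, s) for s in selected) / len(selected)
            return image_scores[c] + coherence

        # Sorting first makes tie-breaking deterministic (alphabetical).
        best = max(sorted(remaining), key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected


# Made-up scores for a single beach photo.
image_scores = {"beach": 0.9, "sand": 0.6, "office": 0.7, "wave": 0.5}
pair_scores = {
    ("beach", "sand"): 0.8,
    ("beach", "wave"): 0.7,
    ("sand", "wave"): 0.6,
}
chosen = select_concepts(image_scores, pair_scores, k=3)
print(chosen)  # ['beach', 'sand', 'wave']
```

In this toy example "office" has a higher image score than "sand" or "wave" but is skipped because it does not correlate with the concepts already selected, which is the kind of trade-off the abstract describes.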

Results

Task                     Dataset  Metric   Value  Model
Text Generation          VIST     BLEU-3   23.1   MCSM+RNN
Text Generation          VIST     BLEU-4   13     MCSM+RNN
Text Generation          VIST     CIDEr    11     MCSM+RNN
Text Generation          VIST     METEOR   36.1   MCSM+RNN
Text Generation          VIST     ROUGE-L  30.7   MCSM+RNN
Data-to-Text Generation  VIST     BLEU-3   23.1   MCSM+RNN
Data-to-Text Generation  VIST     BLEU-4   13     MCSM+RNN
Data-to-Text Generation  VIST     CIDEr    11     MCSM+RNN
Data-to-Text Generation  VIST     METEOR   36.1   MCSM+RNN
Data-to-Text Generation  VIST     ROUGE-L  30.7   MCSM+RNN
Visual Storytelling      VIST     BLEU-3   23.1   MCSM+RNN
Visual Storytelling      VIST     BLEU-4   13     MCSM+RNN
Visual Storytelling      VIST     CIDEr    11     MCSM+RNN
Visual Storytelling      VIST     METEOR   36.1   MCSM+RNN
Visual Storytelling      VIST     ROUGE-L  30.7   MCSM+RNN
Story Generation         VIST     BLEU-3   23.1   MCSM+RNN
Story Generation         VIST     BLEU-4   13     MCSM+RNN
Story Generation         VIST     CIDEr    11     MCSM+RNN
Story Generation         VIST     METEOR   36.1   MCSM+RNN
Story Generation         VIST     ROUGE-L  30.7   MCSM+RNN
