BERT-hLSTMs: BERT and Hierarchical LSTMs for Visual Storytelling

Jing Su, Qingyun Dai, Frank Guerin, Mian Zhou

2020-12-03 · Visual Storytelling

Abstract

Visual storytelling is a creative and challenging task, aiming to automatically generate a story-like description for a sequence of images. The descriptions generated by previous visual storytelling approaches lack coherence because they use word-level sequence generation methods and do not adequately consider sentence-level dependencies. To tackle this problem, we propose a novel hierarchical visual storytelling framework which separately models sentence-level and word-level semantics. We use the transformer-based BERT to obtain embeddings for sentences and words. We then employ a hierarchical LSTM network: the bottom LSTM receives as input the sentence vector representation from BERT, to learn the dependencies between the sentences corresponding to images, and the top LSTM is responsible for generating the corresponding word vector representations, taking input from the bottom LSTM. Experimental results demonstrate that our model outperforms most closely related baselines under automatic evaluation metrics BLEU and CIDEr, and also show the effectiveness of our method with human evaluation.
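
The data flow described in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the BERT embeddings are replaced by random stand-in vectors, the weights are untrained, image features are omitted, and the dimensions (`EMB`, `SENT_H`, `WORD_H`) and helper names (`LSTMCell`, `hierarchical_forward`) are hypothetical. The abstract only says that the word-level LSTM takes input from the sentence-level LSTM; concatenating the sentence state onto each word input is one assumed way to do that.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMCell:
    """Minimal LSTM cell with random, untrained weights (illustration only)."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        n = input_size + hidden_size
        # One weight matrix and bias per gate: input, forget, output, candidate.
        self.W = {g: [[rng.uniform(-0.1, 0.1) for _ in range(n)]
                      for _ in range(hidden_size)] for g in "ifoc"}
        self.b = {g: [0.0] * hidden_size for g in "ifoc"}

    def _gate(self, g, xh):
        return [sum(w * v for w, v in zip(row, xh)) + b
                for row, b in zip(self.W[g], self.b[g])]

    def step(self, x, h, c):
        xh = x + h  # concatenate input with the previous hidden state
        i = [sigmoid(v) for v in self._gate("i", xh)]
        f = [sigmoid(v) for v in self._gate("f", xh)]
        o = [sigmoid(v) for v in self._gate("o", xh)]
        g = [math.tanh(v) for v in self._gate("c", xh)]
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new


def hierarchical_forward(sentence_vecs, word_vecs_per_sentence, sent_lstm, word_lstm):
    """Return, for each sentence, the list of word-level hidden states."""
    hs = [0.0] * sent_lstm.hidden_size
    cs = [0.0] * sent_lstm.hidden_size
    outputs = []
    for s_vec, word_vecs in zip(sentence_vecs, word_vecs_per_sentence):
        # Sentence-level ("bottom") LSTM: carries context across the story.
        hs, cs = sent_lstm.step(s_vec, hs, cs)
        # Word-level ("top") LSTM: restarted per sentence, conditioned on hs.
        hw = [0.0] * word_lstm.hidden_size
        cw = [0.0] * word_lstm.hidden_size
        states = []
        for w_vec in word_vecs:
            hw, cw = word_lstm.step(w_vec + hs, hw, cw)  # assumed conditioning
            states.append(hw)
        outputs.append(states)
    return outputs


rng = random.Random(0)
EMB, SENT_H, WORD_H = 8, 6, 5  # hypothetical sizes; real BERT vectors are larger
sent_lstm = LSTMCell(EMB, SENT_H, rng)
word_lstm = LSTMCell(EMB + SENT_H, WORD_H, rng)

# Stand-ins for BERT sentence and word embeddings: 5 sentences, 4 words each.
sentence_vecs = [[rng.gauss(0, 1) for _ in range(EMB)] for _ in range(5)]
word_vecs = [[[rng.gauss(0, 1) for _ in range(EMB)] for _ in range(4)]
             for _ in range(5)]

out = hierarchical_forward(sentence_vecs, word_vecs, sent_lstm, word_lstm)
print(len(out), len(out[0]), len(out[0][0]))
```

In a trained model, each word-level hidden state would be projected onto a vocabulary to predict the next word. The sketch stops at the hidden states to keep the sentence-level and word-level separation visible.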

Results

Task                      Dataset  Metric  Value  Model
Text Generation           VIST     CIDEr   8.37   BERT-hLSTMs
Text Generation           VIST     CIDEr   7.98   hLSTMs
Data-to-Text Generation   VIST     CIDEr   8.37   BERT-hLSTMs
Data-to-Text Generation   VIST     CIDEr   7.98   hLSTMs
Visual Storytelling       VIST     CIDEr   8.37   BERT-hLSTMs
Visual Storytelling       VIST     CIDEr   7.98   hLSTMs
Story Generation          VIST     CIDEr   8.37   BERT-hLSTMs
Story Generation          VIST     CIDEr   7.98   hLSTMs
