
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


NexusSum: Hierarchical LLM Agents for Long-Form Narrative Summarization

Hyuntak Kim, Byung-Hak Kim

2025-05-30 · Long-Form Narrative Summarization · Descriptive · Form

Abstract

Summarizing long-form narratives--such as books, movies, and TV scripts--requires capturing intricate plotlines, character interactions, and thematic coherence, a task that remains challenging for existing LLMs. We introduce NexusSum, a multi-agent LLM framework for narrative summarization that processes long-form text through a structured, sequential pipeline--without requiring fine-tuning. Our approach introduces two key innovations: (1) Dialogue-to-Description Transformation: A narrative-specific preprocessing method that standardizes character dialogue and descriptive text into a unified format, improving coherence. (2) Hierarchical Multi-LLM Summarization: A structured summarization pipeline that optimizes chunk processing and controls output length for accurate, high-quality summaries. Our method establishes a new state-of-the-art in narrative summarization, achieving up to a 30.0% improvement in BERTScore (F1) across books, movies, and TV scripts. These results demonstrate the effectiveness of multi-agent LLMs in handling long-form content, offering a scalable approach for structured summarization in diverse storytelling domains.
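The abstract describes a sequential pipeline: a preprocessing stage followed by chunk-wise summarization with output-length control. The following is a minimal sketch of that general idea, not the authors' implementation. The function names, the chunk and target sizes, the round limit, and the `truncate_summarizer` stand-in (used here in place of a real LLM call) are all hypothetical choices made for illustration.

```python
from typing import Callable, List


def chunk_text(text: str, max_words: int) -> List[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def hierarchical_summarize(
    text: str,
    summarize: Callable[[str], str],
    preprocess: Callable[[str], str] = lambda t: t,
    chunk_words: int = 300,
    target_words: int = 100,
    max_rounds: int = 5,
) -> str:
    """Preprocess once, then summarize chunk by chunk and re-compress
    the joined summaries until the result fits within target_words."""
    # Hypothetical placeholder for a dialogue-to-description style stage.
    current = preprocess(text)
    for _ in range(max_rounds):  # bounded, in case the summarizer stops shrinking
        if len(current.split()) <= target_words:
            break
        chunks = chunk_text(current, chunk_words)
        current = " ".join(summarize(chunk) for chunk in chunks)
    return current


def truncate_summarizer(chunk: str) -> str:
    """Stand-in for an LLM call: keeps the first quarter of the chunk's words."""
    words = chunk.split()
    return " ".join(words[: max(1, len(words) // 4)])


# Toy input: 2000 synthetic "words".
text = " ".join(f"w{i}" for i in range(2000))
summary = hierarchical_summarize(text, truncate_summarizer)
print(len(summary.split()))
```

In a real system, `summarize` and `preprocess` would wrap model calls; passing them in as callables keeps the length-control loop independent of any particular LLM API.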

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Text Summarization | BookSum | BERTScore (F1) | 70.7 | NexusSum (Mistral Large) |
| Text Summarization | BookSum | ROUGE (geometric mean of 1/2/L) | 18.27 | NexusSum (Mistral Large) |
| Text Summarization | BookSum | ROUGE-1 | 42.51 | NexusSum (Mistral Large) |
| Text Summarization | BookSum | ROUGE-2 | 10.27 | NexusSum (Mistral Large) |
| Text Summarization | BookSum | ROUGE-L | 23.91 | NexusSum (Mistral Large) |
| Text Summarization | BookSum | BERTScore (F1) | 46.42 | Zero-Shot (Mistral Large) |
| Text Summarization | BookSum | ROUGE-1 | 19.63 | Zero-Shot (Mistral Large) |
| Text Summarization | BookSum | ROUGE-2 | 2.99 | Zero-Shot (Mistral Large) |
| Text Summarization | BookSum | ROUGE-L | 12 | Zero-Shot (Mistral Large) |
| Text Summarization | BookSum | ROUGE (geometric mean of 1/2/L) | 16.46 | NexusSum (Claude 3 Haiku) |
| Text Summarization | SummScreen | BERTScore (F1) | 61.59 | NexusSum (Mistral Large) |
| Text Summarization | SummScreen | ROUGE-1 | 30.44 | NexusSum (Mistral Large) |
| Text Summarization | SummScreen | ROUGE-2 | 6.4 | NexusSum (Mistral Large) |
| Text Summarization | SummScreen | ROUGE-L | 17.95 | NexusSum (Mistral Large) |
| Text Summarization | SummScreen | BERTScore (F1) | 57.23 | Zero-Shot (Mistral Large) |
| Text Summarization | SummScreen | ROUGE-1 | 29.18 | Zero-Shot (Mistral Large) |
| Text Summarization | SummScreen | ROUGE-2 | 7.43 | Zero-Shot (Mistral Large) |
| Text Summarization | SummScreen | ROUGE-L | 19.06 | Zero-Shot (Mistral Large) |
| Text Summarization | MENSA | BERTScore (F1) | 65.73 | NexusSum (Mistral Large) |
| Text Summarization | MENSA | ROUGE-1 | 44.91 | NexusSum (Mistral Large) |
| Text Summarization | MENSA | ROUGE-2 | 11.43 | NexusSum (Mistral Large) |
| Text Summarization | MENSA | ROUGE-L | 19.23 | NexusSum (Mistral Large) |
| Text Summarization | MENSA | BERTScore (F1) | 54.8 | Zero-Shot (Mistral Large) |
| Text Summarization | MENSA | ROUGE-1 | 37.43 | Zero-Shot (Mistral Large) |
| Text Summarization | MENSA | ROUGE-2 | 10.52 | Zero-Shot (Mistral Large) |
| Text Summarization | MENSA | ROUGE-L | 21.52 | Zero-Shot (Mistral Large) |
| Text Summarization | MovieSum | BERTScore (F1) | 63.53 | NexusSum (Mistral Large) |
| Text Summarization | MovieSum | ROUGE-1 | 44.91 | NexusSum (Mistral Large) |
| Text Summarization | MovieSum | ROUGE-2 | 11.43 | NexusSum (Mistral Large) |
| Text Summarization | MovieSum | ROUGE-L | 19.23 | NexusSum (Mistral Large) |
| Text Summarization | MovieSum | BERTScore (F1) | 55.5 | Zero-Shot (Mistral Large) |
| Text Summarization | MovieSum | ROUGE-1 | 39.22 | Zero-Shot (Mistral Large) |
| Text Summarization | MovieSum | ROUGE-2 | 10.53 | Zero-Shot (Mistral Large) |
| Text Summarization | MovieSum | ROUGE-L | 22.55 | Zero-Shot (Mistral Large) |
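
To make the comparison in the results above easier to read, the short sketch below computes the BERTScore (F1) gain of NexusSum (Mistral Large) over Zero-Shot (Mistral Large) for each dataset, both in absolute points and as a relative percentage. The scores are copied from the listed results; the dictionary layout and the `gain` helper are illustrative names, not part of the paper. These gains are measured against the zero-shot rows shown here, so they need not correspond to the "up to 30.0%" figure in the abstract, which may be measured against a different baseline.

```python
# BERTScore (F1) values copied from the results listed above.
bertscore_f1 = {
    "BookSum": {"nexussum": 70.7, "zero_shot": 46.42},
    "SummScreen": {"nexussum": 61.59, "zero_shot": 57.23},
    "MENSA": {"nexussum": 65.73, "zero_shot": 54.8},
    "MovieSum": {"nexussum": 63.53, "zero_shot": 55.5},
}


def gain(nexussum: float, zero_shot: float) -> tuple:
    """Return (absolute gain in points, relative gain in percent)."""
    absolute = nexussum - zero_shot
    relative = 100.0 * absolute / zero_shot
    return absolute, relative


gains = {name: gain(**scores) for name, scores in bertscore_f1.items()}

for name, (absolute, relative) in gains.items():
    print(f"{name}: +{absolute:.2f} points ({relative:.1f}% relative)")
```

Note that the ROUGE rows do not all move in the same direction: on SummScreen, MENSA, and MovieSum the zero-shot ROUGE-L values listed are higher than the NexusSum ones, so the same calculation applied to ROUGE-L would yield negative gains there.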
