Hyuntak Kim, Byung-Hak Kim
Summarizing long-form narratives--such as books, movies, and TV scripts--requires capturing intricate plotlines, character interactions, and thematic coherence, a task that remains challenging for existing LLMs. We introduce NexusSum, a multi-agent LLM framework for narrative summarization that processes long-form text through a structured, sequential pipeline--without requiring fine-tuning. Our approach introduces two key innovations: (1) Dialogue-to-Description Transformation: A narrative-specific preprocessing method that standardizes character dialogue and descriptive text into a unified format, improving coherence. (2) Hierarchical Multi-LLM Summarization: A structured summarization pipeline that optimizes chunk processing and controls output length for accurate, high-quality summaries. Our method establishes a new state-of-the-art in narrative summarization, achieving up to a 30.0% improvement in BERTScore (F1) across books, movies, and TV scripts. These results demonstrate the effectiveness of multi-agent LLMs in handling long-form content, offering a scalable approach for structured summarization in diverse storytelling domains.
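
The sequential pipeline described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `LLM` callable interface, the prompt strings, the word-based chunking, and the default values for `chunk_size`, `target_words`, and `max_rounds` are all hypothetical placeholders introduced here.

```python
from typing import Callable, List

# Hypothetical interface: any function that maps a prompt to a completion.
LLM = Callable[[str], str]


def chunk_words(text: str, chunk_size: int) -> List[str]:
    """Split text into consecutive chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]


def apply_stage(text: str, llm: LLM, instruction: str, chunk_size: int) -> str:
    """Run one agent over the text chunk by chunk and rejoin the outputs."""
    outputs = [llm(f"{instruction}\n\n{chunk}")
               for chunk in chunk_words(text, chunk_size)]
    return "\n".join(outputs)


def summarize_narrative(text: str, llm: LLM, chunk_size: int = 300,
                        target_words: int = 400, max_rounds: int = 5) -> str:
    """Sequential pipeline: preprocess, summarize, then compress to length."""
    # Stage 1 (assumed prompt): turn dialogue into descriptive prose so that
    # dialogue and narration share one format.
    prose = apply_stage(
        text, llm, "Rewrite the dialogue as third-person narrative prose.",
        chunk_size)
    # Stage 2 (assumed prompt): summarize each chunk of the unified prose.
    summary = apply_stage(prose, llm, "Summarize this passage.", chunk_size)
    # Stage 3 (assumed prompt): compress repeatedly until the length target
    # is met or the round limit is reached.
    for _ in range(max_rounds):
        if len(summary.split()) <= target_words:
            break
        summary = apply_stage(
            summary, llm, "Shorten this summary, keeping key plot points.",
            chunk_size)
    return summary
```

Passing the model as a plain callable keeps the sketch independent of any particular provider, which matches the abstract's point that no fine-tuning is required; a real system would also need error handling and token-based rather than word-based chunking.
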
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Text Summarization | BookSum | BERTScore (F1) | 70.70 | NexusSum (Mistral Large) |
| Text Summarization | BookSum | ROUGE (geometric mean of 1/2/L) | 18.27 | NexusSum (Mistral Large) |
| Text Summarization | BookSum | ROUGE-1 | 42.51 | NexusSum (Mistral Large) |
| Text Summarization | BookSum | ROUGE-2 | 10.27 | NexusSum (Mistral Large) |
| Text Summarization | BookSum | ROUGE-L | 23.91 | NexusSum (Mistral Large) |
| Text Summarization | BookSum | BERTScore (F1) | 46.42 | Zero-Shot (Mistral Large) |
| Text Summarization | BookSum | ROUGE-1 | 19.63 | Zero-Shot (Mistral Large) |
| Text Summarization | BookSum | ROUGE-2 | 2.99 | Zero-Shot (Mistral Large) |
| Text Summarization | BookSum | ROUGE-L | 12.00 | Zero-Shot (Mistral Large) |
| Text Summarization | BookSum | ROUGE (geometric mean of 1/2/L) | 16.46 | NexusSum (Claude 3 Haiku) |
| Text Summarization | SummScreen | BERTScore (F1) | 61.59 | NexusSum (Mistral Large) |
| Text Summarization | SummScreen | ROUGE-1 | 30.44 | NexusSum (Mistral Large) |
| Text Summarization | SummScreen | ROUGE-2 | 6.40 | NexusSum (Mistral Large) |
| Text Summarization | SummScreen | ROUGE-L | 17.95 | NexusSum (Mistral Large) |
| Text Summarization | SummScreen | BERTScore (F1) | 57.23 | Zero-Shot (Mistral Large) |
| Text Summarization | SummScreen | ROUGE-1 | 29.18 | Zero-Shot (Mistral Large) |
| Text Summarization | SummScreen | ROUGE-2 | 7.43 | Zero-Shot (Mistral Large) |
| Text Summarization | SummScreen | ROUGE-L | 19.06 | Zero-Shot (Mistral Large) |
| Text Summarization | MENSA | BERTScore (F1) | 65.73 | NexusSum (Mistral Large) |
| Text Summarization | MENSA | ROUGE-1 | 44.91 | NexusSum (Mistral Large) |
| Text Summarization | MENSA | ROUGE-2 | 11.43 | NexusSum (Mistral Large) |
| Text Summarization | MENSA | ROUGE-L | 19.23 | NexusSum (Mistral Large) |
| Text Summarization | MENSA | BERTScore (F1) | 54.80 | Zero-Shot (Mistral Large) |
| Text Summarization | MENSA | ROUGE-1 | 37.43 | Zero-Shot (Mistral Large) |
| Text Summarization | MENSA | ROUGE-2 | 10.52 | Zero-Shot (Mistral Large) |
| Text Summarization | MENSA | ROUGE-L | 21.52 | Zero-Shot (Mistral Large) |
| Text Summarization | MovieSum | BERTScore (F1) | 63.53 | NexusSum (Mistral Large) |
| Text Summarization | MovieSum | ROUGE-1 | 44.91 | NexusSum (Mistral Large) |
| Text Summarization | MovieSum | ROUGE-2 | 11.43 | NexusSum (Mistral Large) |
| Text Summarization | MovieSum | ROUGE-L | 19.23 | NexusSum (Mistral Large) |
| Text Summarization | MovieSum | BERTScore (F1) | 55.50 | Zero-Shot (Mistral Large) |
| Text Summarization | MovieSum | ROUGE-1 | 39.22 | Zero-Shot (Mistral Large) |
| Text Summarization | MovieSum | ROUGE-2 | 10.53 | Zero-Shot (Mistral Large) |
| Text Summarization | MovieSum | ROUGE-L | 22.55 | Zero-Shot (Mistral Large) |
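
To make the table easier to read, the short script below computes the absolute and relative BERTScore (F1) gain of NexusSum (Mistral Large) over the Zero-Shot (Mistral Large) rows for each dataset. The numbers are copied from the table; the function and variable names are illustrative. Note that this comparison is against the zero-shot rows only, and the abstract does not state which baseline its "up to 30.0%" figure is measured against, so the two should not be expected to match.

```python
from typing import Dict, Tuple

# BERTScore (F1) values copied from the table: (NexusSum, Zero-Shot),
# both with Mistral Large.
BERTSCORE_F1: Dict[str, Tuple[float, float]] = {
    "BookSum": (70.7, 46.42),
    "SummScreen": (61.59, 57.23),
    "MENSA": (65.73, 54.8),
    "MovieSum": (63.53, 55.5),
}


def bertscore_gains(scores: Dict[str, Tuple[float, float]]
                    ) -> Dict[str, Tuple[float, float]]:
    """Return (absolute gain in points, relative gain in percent) per dataset."""
    gains = {}
    for dataset, (nexus, zero_shot) in scores.items():
        absolute = nexus - zero_shot
        relative = 100.0 * absolute / zero_shot
        gains[dataset] = (absolute, relative)
    return gains


if __name__ == "__main__":
    for dataset, (absolute, relative) in bertscore_gains(BERTSCORE_F1).items():
        print(f"{dataset}: +{absolute:.2f} points ({relative:.1f}% relative)")
```

On these rows the gain is largest on BookSum (about 24.3 points) and smallest on SummScreen (about 4.4 points).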