Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Select and Summarize: Scene Saliency for Movie Script Summarization

Rohit Saxena, Frank Keller

Date: 2024-04-04
Tasks: Long-Form Narrative Summarization, Abstractive Text Summarization

Abstract

Abstractive summarization for long-form narrative texts such as movie scripts is challenging due to the computational and memory constraints of current language models. A movie script typically comprises a large number of scenes; however, only a fraction of these scenes are salient, i.e., important for understanding the overall narrative. The salience of a scene can be operationalized by considering it as salient if it is mentioned in the summary. Automatically identifying salient scenes is difficult due to the lack of suitable datasets. In this work, we introduce a scene saliency dataset that consists of human-annotated salient scenes for 100 movies. We propose a two-stage abstractive summarization approach which first identifies the salient scenes in the script and then generates a summary using only those scenes. Using QA-based evaluation, we show that our model outperforms previous state-of-the-art summarization methods and reflects the information content of a movie more accurately than a model that takes the whole movie script as input.
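
The two-stage idea in the abstract (select salient scenes first, then summarize only those) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `select_salient_scenes`, `select_and_summarize`, the 0.5 threshold, and the toy scorer and summarizer are hypothetical placeholders, not the authors' models or code. In practice the scorer would be a trained scene saliency classifier and the summarizer a long-input abstractive model.

```python
from typing import Callable, List, Sequence


def select_salient_scenes(
    scenes: Sequence[str], scores: Sequence[float], threshold: float = 0.5
) -> List[str]:
    """Stage 1: keep scenes whose salience score reaches the threshold, in script order."""
    if len(scenes) != len(scores):
        raise ValueError("scenes and scores must have the same length")
    return [scene for scene, score in zip(scenes, scores) if score >= threshold]


def select_and_summarize(
    scenes: Sequence[str],
    score_scene: Callable[[str], float],
    summarize: Callable[[str], str],
    threshold: float = 0.5,
) -> str:
    """Stage 2: summarize only the scenes selected in stage 1."""
    scores = [score_scene(scene) for scene in scenes]
    salient = select_salient_scenes(scenes, scores, threshold)
    return summarize("\n\n".join(salient))


# Toy stand-ins (assumptions): a keyword scorer and a truncating "summarizer".
def toy_score(scene: str) -> float:
    return 1.0 if "killer" in scene.lower() else 0.0


def toy_summarize(text: str) -> str:
    return text.strip()[:80]


if __name__ == "__main__":
    script = [
        "INT. KITCHEN. Anna makes coffee.",
        "EXT. BRIDGE. Anna confronts the killer.",
        "INT. CAR. The radio plays.",
    ]
    print(select_and_summarize(script, toy_score, toy_summarize))
```

Because the summarizer only ever sees the selected scenes, its input length depends on how many scenes pass the threshold rather than on the full script length, which is the motivation the abstract gives for the two-stage design.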

Results

Task | Dataset | Metric | Value | Model
Text Summarization | MENSA | BERTScore (F1) | 57.46 | SELECT & SUMM (LED)
Text Summarization | MENSA | BERTScore (F1) | 56.34 | Two-Stage Heuristic (LED Large)
Text Summarization | MENSA | BERTScore (F1) | 40.87 | SUMM-N Multi Stage

Related Papers

Advancing Decoding Strategies: Enhancements in Locally Typical Sampling for LLMs (2025-06-03)
NexusSum: Hierarchical LLM Agents for Long-Form Narrative Summarization (2025-05-30)
ARC: Argument Representation and Coverage Analysis for Zero-Shot Long Document Summarization with Instruction Following LLMs (2025-05-29)
Power-Law Decay Loss for Large Language Model Finetuning: Focusing on Information Sparsity to Enhance Generation Quality (2025-05-22)
Enhancing Abstractive Summarization of Scientific Papers Using Structure Information (2025-05-20)
Low-Resource Language Processing: An OCR-Driven Summarization and Translation Pipeline (2025-05-16)
ProdRev: A DNN framework for empowering customers using generative pre-trained transformers (2025-05-14)
A Split-then-Join Approach to Abstractive Summarization for Very Long Documents in a Low Resource Setting (2025-05-11)