Extractive Summarization of Long Documents by Combining Global and Local Context

Wen Xiao, Giuseppe Carenini

2019-09-17 | IJCNLP 2019 11 | Text Summarization, Extractive Summarization

Abstract

In this paper, we propose a novel neural single-document extractive summarization model for long documents, incorporating both the global context of the whole document and the local context within the current topic. We evaluate the model on two datasets of scientific papers, Pubmed and arXiv, where it outperforms previous work, both extractive and abstractive, on ROUGE-1, ROUGE-2 and METEOR scores. We also show that, consistent with our goal, the benefits of our method become stronger as we apply it to longer documents. Rather surprisingly, an ablation study indicates that the benefits of our model seem to come exclusively from modeling the local context, even for the longest documents.
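To make the idea of combining global and local context concrete, the following is a minimal sketch and not the authors' architecture. It assumes each sentence is already encoded as a fixed-size vector, approximates the global context by the mean of all sentence vectors, and approximates the local (topic) context by the mean of the sentence vectors that share a section id. The names `mean_vector`, `score_sentences`, `select_top_k`, and the `weights` parameter are hypothetical and introduced only for illustration.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def score_sentences(sent_vecs, section_ids, weights, bias=0.0):
    """Score each sentence from [sentence ; global context ; local context].

    Assumption: section ids stand in for topic segments, and a single
    linear layer followed by a sigmoid stands in for the classifier.
    """
    global_ctx = mean_vector(sent_vecs)  # one vector for the whole document
    scores = []
    for vec, sec in zip(sent_vecs, section_ids):
        # Local context: mean of the sentences in the same section.
        same_section = [v for v, s in zip(sent_vecs, section_ids) if s == sec]
        local_ctx = mean_vector(same_section)
        features = list(vec) + global_ctx + local_ctx
        logit = sum(w * x for w, x in zip(weights, features)) + bias
        scores.append(1.0 / (1.0 + math.exp(-logit)))
    return scores


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring sentences in document order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])
```

In this sketch, an ablation like the one described in the abstract would amount to dropping either `global_ctx` or `local_ctx` from `features` and comparing the resulting summaries.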

Results

Task	Dataset	Metric	Value	Model
Text Summarization	Arxiv HEP-TH citation graph	ROUGE-1	43.58	ExtSum-LG
Text Summarization	Arxiv HEP-TH citation graph	ROUGE-2	17.37	ExtSum-LG
Text Summarization	Pubmed	ROUGE-1	44.81	ExtSum-LG
Text Summarization	Pubmed	ROUGE-2	19.74	ExtSum-LG
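
For readers unfamiliar with the metrics in the table, the sketch below shows a simplified ROUGE-N computation as an F1 score over n-gram overlap between a candidate summary and a reference. It assumes lowercased whitespace tokenization with no stemming or stopword handling, so it will not reproduce the numbers above. It also assumes that the reported values are F1 scores scaled to 0-100, which the table does not state. The function names `ngrams` and `rouge_n` are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Simplified ROUGE-N F1 in [0, 1] using whitespace tokens."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0  # also covers empty candidate or reference
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

For example, `rouge_n("the cat sat on the mat", "the cat lay on the mat", n=2)` shares three of five bigrams on each side, so precision, recall, and F1 are all about 0.6.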
