BRIO: Bringing Order to Abstractive Summarization

Yixin Liu, Pengfei Liu, Dragomir Radev, Graham Neubig

2022-03-31 · ACL 2022 · Abstractive Text Summarization · Text Summarization
Paper · PDF · Code (official)

Abstract

Abstractive summarization models are commonly trained using maximum likelihood estimation, which assumes a deterministic (one-point) target distribution in which an ideal model will assign all the probability mass to the reference summary. This assumption may lead to performance degradation during inference, where the model needs to compare several system-generated (candidate) summaries that have deviated from the reference summary. To address this problem, we propose a novel training paradigm which assumes a non-deterministic distribution so that different candidate summaries are assigned probability mass according to their quality. Our method achieves a new state-of-the-art result on the CNN/DailyMail (47.78 ROUGE-1) and XSum (49.07 ROUGE-1) datasets. Further analysis also shows that our model can estimate probabilities of candidate summaries that are more correlated with their level of quality.
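The proposed paradigm amounts to adding a contrastive (ranking) term to the training objective: each candidate summary is scored by its length-normalized log-probability under the model, and the model is penalized whenever a lower-quality candidate outscores a higher-quality one. Below is a minimal PyTorch sketch of such a margin ranking loss, assuming `scores` holds length-normalized candidate log-probabilities already sorted best-to-worst by a quality metric such as ROUGE; function and variable names are illustrative, not taken from the paper's released code.

```python
import torch

def contrastive_ranking_loss(scores: torch.Tensor, margin: float = 0.001) -> torch.Tensor:
    """Sketch of a BRIO-style margin ranking loss.

    scores: (batch, num_candidates) length-normalized log-probabilities,
            sorted so that scores[:, 0] belongs to the highest-quality candidate.
    For every pair i < j, penalize max(0, score_j - score_i + (j - i) * margin):
    a lower-ranked candidate must trail a higher-ranked one by a margin
    that grows with the rank gap.
    """
    _, n = scores.shape
    loss = scores.new_zeros(())
    for i in range(n - 1):
        for j in range(i + 1, n):
            rank_gap_margin = margin * (j - i)
            loss = loss + torch.clamp(
                scores[:, j] - scores[:, i] + rank_gap_margin, min=0
            ).mean()
    return loss

# Toy usage: 2 source documents, 4 candidate summaries each, pre-sorted by quality.
scores = torch.randn(2, 4)
print(contrastive_ranking_loss(scores).item())
```

In the paper this ranking term is combined with the standard cross-entropy (MLE) loss on the reference summary, so the model is trained both to generate the reference and to rank candidates by quality.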

Results

Task                           | Dataset          | Metric  | Value | Model
Text Summarization             | X-Sum            | ROUGE-1 | 49.07 | BRIO
Text Summarization             | X-Sum            | ROUGE-2 | 25.59 | BRIO
Text Summarization             | X-Sum            | ROUGE-L | 40.40 | BRIO
Text Summarization             | CNN / Daily Mail | ROUGE-1 | 47.78 | BRIO
Text Summarization             | CNN / Daily Mail | ROUGE-2 | 23.55 | BRIO
Text Summarization             | CNN / Daily Mail | ROUGE-L | 44.57 | BRIO
Abstractive Text Summarization | CNN / Daily Mail | ROUGE-1 | 47.78 | BRIO
Abstractive Text Summarization | CNN / Daily Mail | ROUGE-2 | 23.55 | BRIO
Abstractive Text Summarization | CNN / Daily Mail | ROUGE-L | 44.57 | BRIO
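The values above are ROUGE F1 scores. As a quick way to sanity-check such numbers on your own outputs, here is a minimal sketch using Google's `rouge-score` package; the leaderboard's exact evaluation toolkit and preprocessing may differ, so scores will not match to the decimal.

```python
# pip install rouge-score
from rouge_score import rouge_scorer

# ROUGE-1/2 count unigram/bigram overlap; ROUGE-L uses the longest common subsequence.
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

reference = "police arrested five climate protesters outside parliament on tuesday"
candidate = "five climate protesters were arrested outside parliament"

for name, score in scorer.score(reference, candidate).items():
    print(f"{name}: P={score.precision:.3f} R={score.recall:.3f} F1={score.fmeasure:.3f}")
```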

Related Papers

LRCTI: A Large Language Model-Based Framework for Multi-Step Evidence Retrieval and Reasoning in Cyber Threat Intelligence Credibility Verification (2025-07-15)
On-the-Fly Adaptive Distillation of Transformer to Dual-State Linear Attention (2025-06-11)
Improving large language models with concept-aware fine-tuning (2025-06-09)
Advancing Decoding Strategies: Enhancements in Locally Typical Sampling for LLMs (2025-06-03)
ARC: Argument Representation and Coverage Analysis for Zero-Shot Long Document Summarization with Instruction Following LLMs (2025-05-29)
MaCP: Minimal yet Mighty Adaptation via Hierarchical Cosine Projection (2025-05-29)
APE: A Data-Centric Benchmark for Efficient LLM Adaptation in Text Summarization (2025-05-26)
FiLLM -- A Filipino-optimized Large Language Model based on Southeast Asia Large Language Model (SEALLM) (2025-05-25)