Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


LongT5: Efficient Text-To-Text Transformer for Long Sequences

Mandy Guo, Joshua Ainslie, David Uthus, Santiago Ontanon, Jianmo Ni, Yun-Hsuan Sung, Yinfei Yang

2021-12-15 · Findings (NAACL) 2022
Tasks: Question Answering, Multi-Document Summarization, Abstractive Text Summarization, Text Summarization, Long-range Modeling

Abstract

Recent work has shown that either (1) increasing the input length or (2) increasing model size can improve the performance of Transformer-based neural models. In this paper, we present a new model, called LongT5, with which we explore the effects of scaling both the input length and model size at the same time. Specifically, we integrated attention ideas from long-input transformers (ETC), and adopted pre-training strategies from summarization pre-training (PEGASUS) into the scalable T5 architecture. The result is a new attention mechanism we call Transient Global (TGlobal), which mimics ETC's local/global attention mechanism, but without requiring additional side-inputs. We are able to achieve state-of-the-art results on several summarization tasks and outperform the original T5 models on question answering tasks.
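The TGlobal mechanism described above can be illustrated with a small sketch: each token attends to a local window of neighbors plus a set of "transient" global tokens computed on the fly as block means of the input, so no side inputs are needed. This is a simplified single-head NumPy illustration, not the paper's implementation; the function name and the `local_radius`/`block_size` parameters are chosen here for clarity (the paper uses a local radius of 127 and block size of 16).

```python
import numpy as np

def tglobal_attention(x, local_radius=2, block_size=4):
    """Simplified sketch of Transient Global (TGlobal) attention.

    x: (seq_len, d) token representations (projections omitted).
    Each token attends to (1) tokens within `local_radius` positions and
    (2) transient global tokens formed by averaging fixed-size blocks of
    the input itself -- unlike ETC, no extra side inputs are required.
    """
    n, d = x.shape
    # Transient global tokens: mean of each fixed-size block of the input.
    n_blocks = n // block_size  # (the paper pads; any tail tokens are dropped here)
    globals_ = x[: n_blocks * block_size].reshape(n_blocks, block_size, d).mean(axis=1)

    # Keys/values = all input tokens followed by the global tokens.
    kv = np.concatenate([x, globals_], axis=0)   # (n + n_blocks, d)
    scores = x @ kv.T / np.sqrt(d)               # (n, n + n_blocks)

    # Mask: a local band over the first n columns; global columns always visible.
    mask = np.full((n, n + n_blocks), -np.inf)
    idx = np.arange(n)
    for offset in range(-local_radius, local_radius + 1):
        j = idx + offset
        ok = (j >= 0) & (j < n)
        mask[idx[ok], j[ok]] = 0.0
    mask[:, n:] = 0.0

    # Softmax over the allowed positions, then weighted sum of values.
    w = np.exp(scores + mask)
    w /= w.sum(axis=1, keepdims=True)
    return w @ kv
```

Because the global tokens are recomputed from the input at every layer, the attention cost stays linear in sequence length times (window size + number of blocks), which is what allows LongT5 to scale input length.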

Results

Task | Dataset | Metric | Value | Model
Text Generation | Multi-News | ROUGE-1 | 48.17 | LongT5
Text Generation | Multi-News | ROUGE-2 | 19.43 | LongT5
Text Generation | Multi-News | ROUGE-SU4 | 24.94 | LongT5
Language Modelling | SCROLLS | Avg. | 42.53 | LongT5 XL
Language Modelling | SCROLLS | CNLI | 88.2 | LongT5 XL
Language Modelling | SCROLLS | Nrtv | 29.3 | LongT5 XL
Language Modelling | SCROLLS | Qspr | 53.1 | LongT5 XL
Language Modelling | SCROLLS | Avg. | 41.03 | LongT5 Large
Language Modelling | SCROLLS | CNLI | 87.3 | LongT5 Large
Language Modelling | SCROLLS | Nrtv | 27.2 | LongT5 Large
Language Modelling | SCROLLS | Qspr | 52.3 | LongT5 Large
Language Modelling | SCROLLS | Avg. | 38.6 | LongT5 Base
Language Modelling | SCROLLS | CNLI | 85.6 | LongT5 Base
Language Modelling | SCROLLS | Nrtv | 23 | LongT5 Base
Language Modelling | SCROLLS | Qspr | 46.6 | LongT5 Base
Text Summarization | BigPatent | ROUGE-1 | 76.87 | LongT5
Text Summarization | BigPatent | ROUGE-2 | 66.06 | LongT5
Text Summarization | BigPatent | ROUGE-L | 70.76 | LongT5
Text Summarization | Arxiv HEP-TH citation graph | ROUGE-1 | 48.35 | LongT5
Text Summarization | Arxiv HEP-TH citation graph | ROUGE-2 | 21.92 | LongT5
Text Summarization | Arxiv HEP-TH citation graph | ROUGE-L | 44.27 | LongT5
Text Summarization | Pubmed | ROUGE-1 | 50.23 | LongT5
Text Summarization | Pubmed | ROUGE-2 | 24.76 | LongT5
Text Summarization | Pubmed | ROUGE-L | 46.67 | LongT5
Text Summarization | CNN / Daily Mail | ROUGE-1 | 43.94 | LongT5
Text Summarization | CNN / Daily Mail | ROUGE-2 | 21.4 | LongT5
Text Summarization | CNN / Daily Mail | ROUGE-L | 41.28 | LongT5
Text Summarization | Multi-News | ROUGE-1 | 48.17 | LongT5
Text Summarization | Multi-News | ROUGE-2 | 19.43 | LongT5
Text Summarization | Multi-News | ROUGE-SU4 | 24.94 | LongT5
Abstractive Text Summarization | CNN / Daily Mail | ROUGE-1 | 43.94 | LongT5
Abstractive Text Summarization | CNN / Daily Mail | ROUGE-2 | 21.4 | LongT5
Abstractive Text Summarization | CNN / Daily Mail | ROUGE-L | 41.28 | LongT5

Related Papers

From Roots to Rewards: Dynamic Tree Reasoning with RL (2025-07-17)
Enter the Mind Palace: Reasoning and Planning for Long-term Active Embodied Question Answering (2025-07-17)
Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It (2025-07-17)
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
Describe Anything Model for Visual Question Answering on Text-rich Images (2025-07-16)
Is This Just Fantasy? Language Model Representations Reflect Human Judgments of Event Plausibility (2025-07-16)
LRCTI: A Large Language Model-Based Framework for Multi-Step Evidence Retrieval and Reasoning in Cyber Threat Intelligence Credibility Verification (2025-07-15)
U-RWKV: Lightweight medical image segmentation with direction-adaptive RWKV (2025-07-15)