Improving Conditioning in Context-Aware Sequence to Sequence Models

Xinyi Wang, Jason Weston, Michael Auli, Yacine Jernite

2019-11-21
Tasks: Question Answering, Data Augmentation, Translation, Open-Domain Question Answering

Abstract

Neural sequence to sequence models are well established for applications which can be cast as mapping a single input sequence into a single output sequence. In this work, we focus on cases where generation is conditioned on both a short query and a long context, such as abstractive question answering or document-level translation. We modify the standard sequence-to-sequence approach to make better use of both the query and the context by expanding the conditioning mechanism to intertwine query and context attention. We also introduce a simple and efficient data augmentation method for the proposed model. Experiments on three different tasks show that both changes lead to consistent improvements.
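The abstract does not spell out how query and context attention are intertwined, so the following is only a minimal sketch of one plausible reading: a decoder state first attends over the short query, and the resulting query-aware state then attends over the long context. The function names (`attend`, `query_then_context`), the residual additions, and the toy vectors are assumptions made for illustration; there are no learned projections, and this is not the authors' architecture.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(state, memory):
    """Scaled dot-product attention of one vector over a list of vectors."""
    scale = math.sqrt(len(state))
    scores = [sum(a * b for a, b in zip(state, m)) / scale for m in memory]
    weights = softmax(scores)
    # Weighted average of the memory vectors, one dimension at a time.
    return [sum(w * m[i] for w, m in zip(weights, memory))
            for i in range(len(state))]


def query_then_context(state, query_enc, context_enc):
    """Hypothetical conditioning step: read the query, then the context."""
    # Step 1: residual attention over the short query.
    query_read = attend(state, query_enc)
    state = [s + r for s, r in zip(state, query_read)]
    # Step 2: residual attention over the long context, driven by the
    # query-aware state from step 1.
    context_read = attend(state, context_enc)
    return [s + r for s, r in zip(state, context_read)]


# Toy encodings (assumed values, 3 dimensions each).
decoder_state = [0.1, -0.2, 0.3]
query_enc = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
context_enc = [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
conditioned = query_then_context(decoder_state, query_enc, context_enc)
```

The point of the ordering in this sketch is that the context lookup depends on what was read from the query, rather than treating query and context as one concatenated input.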

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Question Answering | ELI5 | Rouge-1 | 23.32 | Multi-Interleave |
| Question Answering | ELI5 | Rouge-2 | 4.79 | Multi-Interleave |
| Question Answering | ELI5 | Rouge-L | 14.63 | Multi-Interleave |
| Open-Domain Question Answering | ELI5 | Rouge-1 | 23.32 | Multi-Interleave |
| Open-Domain Question Answering | ELI5 | Rouge-2 | 4.79 | Multi-Interleave |
| Open-Domain Question Answering | ELI5 | Rouge-L | 14.63 | Multi-Interleave |
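
For readers unfamiliar with the metrics, the sketch below shows how ROUGE-N (n-gram overlap) and ROUGE-L (longest common subsequence) scores can be computed. It assumes F1 scoring, lowercased whitespace tokenization, and no stemming; the page does not state which ROUGE variant or preprocessing produced the values listed, so these functions (`rouge_n`, `rouge_l`) are illustrative and would not necessarily reproduce them. The listed values appear to be on a 0 to 100 scale, whereas the sketch returns scores between 0 and 1.

```python
from collections import Counter


def rouge_n(candidate, reference, n=1):
    """F1 over clipped n-gram overlap between candidate and reference."""
    def ngrams(text):
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n])
                       for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    # Counter intersection keeps the minimum count per n-gram (clipping).
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def rouge_l(candidate, reference):
    """F1 based on the longest common subsequence of tokens."""
    a, b = candidate.lower().split(), reference.lower().split()
    prev = [0] * (len(b) + 1)  # LCS lengths for the previous row
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    lcs = prev[-1]
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(a), lcs / len(b)
    return 2 * precision * recall / (precision + recall)
```

For example, `rouge_n("the cat", "the dog")` shares one of two unigrams on each side, giving precision, recall, and F1 of 0.5.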