Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

CoQA: A Conversational Question Answering Challenge

Siva Reddy, Danqi Chen, Christopher D. Manning

Date: 2018-08-21
Venue: TACL 2019 3
Tasks: Reading Comprehension, Question Answering, Conversational Question Answering, Generative Question Answering

Abstract

Humans gather information by engaging in conversations involving a series of interconnected questions and answers. For machines to assist in information gathering, it is therefore essential to enable them to answer conversational questions. We introduce CoQA, a novel dataset for building Conversational Question Answering systems. Our dataset contains 127k questions with answers, obtained from 8k conversations about text passages from seven diverse domains. The questions are conversational, and the answers are free-form text with their corresponding evidence highlighted in the passage. We analyze CoQA in depth and show that conversational questions have challenging phenomena not present in existing reading comprehension datasets, e.g., coreference and pragmatic reasoning. We evaluate strong conversational and reading comprehension models on CoQA. The best system obtains an F1 score of 65.4%, which is 23.4 points behind human performance (88.8%), indicating there is ample room for improvement. We launch CoQA as a challenge to the community at http://stanfordnlp.github.io/coqa/
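The abstract reports results as F1 scores over free-form answers. As a rough illustration of how such a score can be computed, the sketch below implements a token-overlap F1 between a predicted and a reference answer. It is not the official CoQA evaluation script: the function name `token_f1` is hypothetical, and the normalization (lowercasing and whitespace splitting only) is a simplifying assumption.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between two answer strings (illustrative sketch only)."""
    # Assumption: lowercase + whitespace tokenization; real evaluators
    # typically also strip punctuation and articles.
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()

    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)

    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Gap between human performance and the best system, as stated in the abstract.
gap = round(88.8 - 65.4, 1)
```

For example, `token_f1("the white cat", "white cat")` has precision 2/3 and recall 1, giving an F1 of 0.8. A dataset-level score would average such per-question values, typically taking the best match over several reference answers.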

Results

Task                Dataset  Metric         Value  Model
Question Answering  CoQA     In-domain      67     DrQA + seq2seq with copy attention (single model)
Question Answering  CoQA     Out-of-domain  60.4   DrQA + seq2seq with copy attention (single model)
Question Answering  CoQA     Overall        65.1   DrQA + seq2seq with copy attention (single model)
Question Answering  CoQA     In-domain      54.5   Vanilla DrQA (single model)
Question Answering  CoQA     Out-of-domain  47.9   Vanilla DrQA (single model)
Question Answering  CoQA     Overall        52.6   Vanilla DrQA (single model)
Question Answering  CoQA     F1-Score       45.4   PGNet
