Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases

Tao Yu, Rui Zhang, He Yang Er, Suyi Li, Eric Xue, Bo Pang, Xi Victoria Lin, Yi Chern Tan, Tianze Shi, Zihan Li, Youxuan Jiang, Michihiro Yasunaga, Sungrok Shim, Tao Chen, Alexander Fabbri, Zifan Li, Luyao Chen, Yuwen Zhang, Shreya Dixit, Vincent Zhang, Caiming Xiong, Richard Socher, Walter S. Lasecki, Dragomir Radev

2019-09-11 | IJCNLP 2019 11 | Text-To-SQL | Dialogue State Tracking | Response Generation

Abstract

We present CoSQL, a corpus for building cross-domain, general-purpose database (DB) querying dialogue systems. It consists of 30k+ turns plus 10k+ annotated SQL queries, obtained from a Wizard-of-Oz (WOZ) collection of 3k dialogues querying 200 complex DBs spanning 138 domains. Each dialogue simulates a real-world DB query scenario with a crowd worker as a user exploring the DB and a SQL expert retrieving answers with SQL, clarifying ambiguous questions, or otherwise informing of unanswerable questions. When user questions are answerable by SQL, the expert describes the SQL and execution results to the user, hence maintaining a natural interaction flow. CoSQL introduces new challenges compared to existing task-oriented dialogue datasets: (1) the dialogue states are grounded in SQL, a domain-independent executable representation, instead of domain-specific slot-value pairs, and (2) because testing is done on unseen databases, success requires generalizing to new domains. CoSQL includes three tasks: SQL-grounded dialogue state tracking, response generation from query results, and user dialogue act prediction. We evaluate a set of strong baselines for each task and show that CoSQL presents significant challenges for future research. The dataset, baselines, and leaderboard will be released at https://yale-lily.github.io/cosql.

Results

Task      Dataset  Metric                      Value  Model
Dialogue  CoSQL    interaction match accuracy  2.2    SyntaxSQL-con
Dialogue  CoSQL    question match accuracy     14.1   SyntaxSQL-con
Dialogue  CoSQL    interaction match accuracy  2.6    CD-Seq2seq
Dialogue  CoSQL    question match accuracy     13.9   CD-Seq2seq
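
This page does not define the two metrics in the table. The sketch below assumes the usual convention for conversational text-to-SQL evaluation: question match is the fraction of individual turns whose predicted SQL is judged correct, and interaction match is the fraction of dialogues in which every turn is correct. The function names and the toy per-turn flags are hypothetical, and the SQL comparison that would produce those flags is not implemented here.

```python
from typing import List


def question_match(interactions: List[List[bool]]) -> float:
    """Fraction of individual turns judged correct, pooled over all dialogues."""
    turns = [ok for dialogue in interactions for ok in dialogue]
    return sum(turns) / len(turns) if turns else 0.0


def interaction_match(interactions: List[List[bool]]) -> float:
    """Fraction of dialogues in which every turn was judged correct."""
    if not interactions:
        return 0.0
    return sum(all(dialogue) for dialogue in interactions) / len(interactions)


# Hypothetical toy data: one inner list per dialogue, one flag per turn
# (True means the predicted SQL for that turn was judged a match).
toy = [
    [True, True],          # fully correct dialogue
    [True, False, False],  # one correct turn out of three
    [False],               # single incorrect turn
]

qm = question_match(toy)     # 3 correct turns out of 6
im = interaction_match(toy)  # 1 fully correct dialogue out of 3
print(f"question match: {qm:.3f}, interaction match: {im:.3f}")
```

Under this assumed definition a single wrong turn fails the whole dialogue, which would explain why the interaction match values in the table are far lower than the question match values.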

Related Papers

CogniSQL-R1-Zero: Lightweight Reinforced Reasoning for Efficient SQL Generation (2025-07-08)
XiYan-SQL: A Novel Multi-Generator Framework For Text-to-SQL (2025-07-07)
Disambiguation-Centric Finetuning Makes Enterprise Tool-Calling LLMs More Realistic and Less Risky (2025-07-04)
Knowledge Augmented Finetuning Matters in both RAG and Agent Based Dialog Systems (2025-06-28)
SWE-SQL: Illuminating LLM Pathways to Solve User SQL Issues in Real-World Applications (2025-06-23)
SAFEx: Analyzing Vulnerabilities of MoE-Based LLMs via Stable Safety-critical Expert Identification (2025-06-20)
From What to Respond to When to Respond: Timely Response Generation for Open-domain Dialogue Agents (2025-06-17)
Improving Factuality for Dialogue Response Generation via Graph-Based Knowledge Augmentation (2025-06-14)