TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/CHESS: Contextual Harnessing for Efficient SQL Synthesis

CHESS: Contextual Harnessing for Efficient SQL Synthesis

Shayan Talaei, Mohammadreza Pourreza, Yu-Chen Chang, Azalia Mirhoseini, Amin Saberi

2024-05-27Text-To-SQLLarge Language ModelPrivacy PreservingRetrieval
PaperPDFCode(official)Code

Abstract

Translating natural language questions into SQL queries, known as text-to-SQL, is a long-standing research problem. Effective text-to-SQL synthesis can become very challenging due to (i) the extensive size of database catalogs (descriptions of tables and their columns) and database values, (ii) reasoning over large database schemas, (iii) ensuring the functional validity of the generated queries, and (iv) navigating the ambiguities of natural language questions. We introduce CHESS, a Large Language Model (LLM) based multi-agent framework for efficient and scalable SQL synthesis, comprising four specialized agents, each targeting one of the aforementioned challenges: the Information Retriever (IR) extracts relevant data, the Schema Selector (SS) prunes large schemas, the Candidate Generator (CG) generates high-quality candidates and refines queries iteratively, and the Unit Tester (UT) validates queries through LLM-based natural language unit tests. Our framework offers configurable features that adapt to various deployment constraints, including 1) Supporting industrial-scale databases: leveraging the Schema Selector agent, CHESS efficiently narrows down very large database schemas into manageable sub-schemas, boosting system accuracy by approximately $2\%$ and reducing the number of LLM tokens by $\times 5$. 2) State-of-the-Art privacy-preserving performance: Among the methods using open-source models, CHESS achieves state-of-the-art performance, resulting in a high-performing, privacy-preserving system suitable for industrial deployment. 3) Scalablity with additional compute budget: In settings with high computational budgets, CHESS achieves $71.10\%$ accuracy on the BIRD test set, within $2\%$ of the leading proprietary method, while requiring approximately $83\%$ fewer LLM calls.

Results

TaskDatasetMetricValueModel
Semantic ParsingBIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation)Execution Accuracy % (Dev)65CHESS
Semantic ParsingBIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation)Execution Accuracy % (Test)66.69CHESS
Text-To-SQLBIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation)Execution Accuracy % (Dev)65CHESS
Text-To-SQLBIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation)Execution Accuracy % (Test)66.69CHESS

Related Papers

DENSE: Longitudinal Progress Note Generation with Temporal Modeling of Heterogeneous Clinical Notes Across Hospital Visits2025-07-18GeoReg: Weight-Constrained Few-Shot Regression for Socio-Economic Estimation using LLM2025-07-17The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations2025-07-17Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities2025-07-17Rethinking the Embodied Gap in Vision-and-Language Navigation: A Holistic Study of Physical and Visual Disparities2025-07-17A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique2025-07-17Federated Learning for Commercial Image Sources2025-07-17Transformer-Based Person Identification via Wi-Fi CSI Amplitude and Phase Perturbations2025-07-17