Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

DIN-SQL: Decomposed In-Context Learning of Text-to-SQL with Self-Correction

Mohammadreza Pourreza, Davood Rafiei

2023-04-21 | NeurIPS 2023 11 | Text-To-SQL

Abstract

There is currently a significant gap between the performance of fine-tuned models and prompting approaches using Large Language Models (LLMs) on the challenging task of text-to-SQL, as evaluated on datasets such as Spider. To improve the performance of LLMs in the reasoning process, we study how decomposing the task into smaller sub-tasks can be effective. In particular, we show that breaking down the generation problem into sub-problems and feeding the solutions of those sub-problems into LLMs can be an effective approach for significantly improving their performance. Our experiments with three LLMs show that this approach consistently improves their simple few-shot performance by roughly 10%, pushing the accuracy of LLMs towards SOTA or surpassing it. On the holdout test set of Spider, the SOTA, in terms of execution accuracy, was 79.9 and the new SOTA at the time of this writing using our approach is 85.3. Our approach with in-context learning beats many heavily fine-tuned models by at least 5%. Additionally, when evaluated on the BIRD benchmark, our approach achieved an execution accuracy of 55.9%, setting a new SOTA on its holdout test set.
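The decomposition idea in the abstract can be sketched as a chain of prompts in which each sub-problem's solution is fed into the next prompt. This is a minimal illustration, not the authors' implementation: the abstract does not name the sub-problems, so the three stages below (schema linking, draft generation, self-correction), the prompt wording, and the `llm` callable are all assumptions made for the example.

```python
from typing import Callable

# Hypothetical type: any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]


def decomposed_text_to_sql(question: str, schema: str, llm: LLM) -> str:
    """Chain sub-task prompts, feeding each solution into the next prompt."""
    # Sub-task 1 (assumed stage): pick the tables and columns relevant to the question.
    links = llm(
        f"Schema:\n{schema}\nQuestion: {question}\n"
        "List the relevant tables and columns."
    )
    # Sub-task 2 (assumed stage): draft SQL, conditioned on the sub-task 1 solution.
    draft = llm(
        f"Schema:\n{schema}\nQuestion: {question}\n"
        f"Relevant items: {links}\nWrite the SQL query."
    )
    # Sub-task 3 (assumed stage): self-correction pass over the draft.
    fixed = llm(
        f"Schema:\n{schema}\nQuestion: {question}\n"
        f"Draft SQL: {draft}\nFix any errors and return only SQL."
    )
    return fixed.strip()
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM client, so the same chain can be run against different models, as the abstract's three-model comparison suggests.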

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Semantic Parsing | Spider | Exact Match Accuracy (Test) | 60 | DIN-SQL + GPT-4 |
| Semantic Parsing | Spider | Execution Accuracy (Test) | 85.3 | DIN-SQL + GPT-4 |
| Semantic Parsing | BIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation) | Execution Accuracy % (Dev) | 50.72 | DIN-SQL + GPT-4 |
| Semantic Parsing | BIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation) | Execution Accuracy % (Test) | 55.9 | DIN-SQL + GPT-4 |
| Text-To-SQL | Spider | Exact Match Accuracy (Test) | 60 | DIN-SQL + GPT-4 |
| Text-To-SQL | Spider | Execution Accuracy (Test) | 85.3 | DIN-SQL + GPT-4 |
| Text-To-SQL | BIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation) | Execution Accuracy % (Dev) | 50.72 | DIN-SQL + GPT-4 |
| Text-To-SQL | BIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation) | Execution Accuracy % (Test) | 55.9 | DIN-SQL + GPT-4 |
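For readers unfamiliar with the execution accuracy metric reported above, the sketch below shows the general idea: a predicted query counts as correct when it returns the same rows as the gold query on the same database. This is a simplified illustration using SQLite and an order-insensitive row comparison; the official Spider and BIRD evaluation scripts differ in their details, and the function names here are hypothetical.

```python
import sqlite3
from collections import Counter


def execution_match(conn: sqlite3.Connection, predicted_sql: str, gold_sql: str) -> bool:
    """Return True if both queries yield the same multiset of rows."""
    try:
        pred_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A prediction that fails to execute counts as a miss.
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    # Simplifying assumption: row order is ignored, duplicates are respected.
    return Counter(pred_rows) == Counter(gold_rows)


def execution_accuracy(conn: sqlite3.Connection, pairs: list[tuple[str, str]]) -> float:
    """Percentage of (predicted, gold) pairs whose results match."""
    hits = sum(execution_match(conn, pred, gold) for pred, gold in pairs)
    return 100.0 * hits / len(pairs)
```

Exact match accuracy, by contrast, compares the structure of the predicted and gold queries rather than their results, which is one reason the two Spider figures in the table can differ so widely.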

Related Papers

CogniSQL-R1-Zero: Lightweight Reinforced Reasoning for Efficient SQL Generation (2025-07-08)
XiYan-SQL: A Novel Multi-Generator Framework For Text-to-SQL (2025-07-07)
SWE-SQL: Illuminating LLM Pathways to Solve User SQL Issues in Real-World Applications (2025-06-23)
Schema-R1: A reasoning training approach for schema linking in Text-to-SQL Task (2025-06-13)
Bridging the Gap Between Open-Source and Proprietary LLMs in Table QA (2025-06-11)
LLM-Driven Data Generation and a Novel Soft Metric for Evaluating Text-to-SQL in Aviation MRO (2025-06-11)
HI-SQL: Optimizing Text-to-SQL Systems through Dynamic Hint Integration (2025-06-11)
SEED: Enhancing Text-to-SQL Performance and Practical Usability Through Automatic Evidence Generation (2025-06-09)