


Knowledge-to-SQL: Enhancing SQL Generation with Data Expert LLM

Zijin Hong, Zheng Yuan, Hao Chen, Qinggang Zhang, Feiran Huang, Xiao Huang

2024-02-18 | Text-To-SQL

Abstract

Generating accurate SQL queries for user questions (text-to-SQL) has been a long-standing challenge, since it requires a deep understanding of both the user's question and the corresponding database schema in order to retrieve the desired content accurately. Existing methods rely on the comprehensive capability of large language models (LLMs) to generate the SQL. However, some necessary knowledge is neither explicitly included in the database schema and user question nor already learned by LLMs. Thus, the SQL generated for knowledge-insufficient questions may be inaccurate, negatively influencing the performance and robustness of text-to-SQL models. To address this challenge, we propose the Knowledge-to-SQL framework, which employs a tailored Data Expert LLM (DELLM) to provide helpful knowledge for all text-to-SQL models. Specifically, we introduce the detailed implementation of DELLM regarding table reading and the basic fine-tuning process. We further propose a Preference Learning via Database Feedback (PLDBF) strategy, refining the DELLM to generate more helpful knowledge for LLMs. Extensive experiments verify that DELLM can enhance the state-of-the-art approaches for text-to-SQL tasks. The corresponding code of DELLM is released for further research.
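The abstract names Preference Learning via Database Feedback but does not spell out how the feedback is scored or which preference objective is used. As a rough illustration only, the sketch below shows one way execution results could be turned into chosen/rejected knowledge pairs. Everything in it is an assumption for illustration: the function names, the `text_to_sql` stub, the toy `patient` table, and the rule "knowledge is helpful if the downstream SQL matches the gold result" are hypothetical and are not taken from the paper or its released code.

```python
import sqlite3
from itertools import product


def executes_to(conn, sql):
    """Run a query and return its rows in sorted order, or None on error."""
    try:
        return sorted(conn.execute(sql).fetchall(), key=repr)
    except sqlite3.Error:
        return None


def build_preference_pairs(conn, question, candidates, gold_sql, text_to_sql):
    """Label candidate knowledge by database feedback (hypothetical rule).

    A candidate counts as helpful when the SQL produced with it returns
    the same rows as the gold SQL; every helpful candidate is then paired
    with every unhelpful one.
    """
    gold_rows = executes_to(conn, gold_sql)
    helpful, unhelpful = [], []
    for knowledge in candidates:
        rows = executes_to(conn, text_to_sql(question, knowledge))
        if rows is not None and rows == gold_rows:
            helpful.append(knowledge)
        else:
            unhelpful.append(knowledge)
    return [
        {"prompt": question, "chosen": good, "rejected": bad}
        for good, bad in product(helpful, unhelpful)
    ]


# Toy database and a stand-in for a real text-to-SQL model (both assumptions).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE patient (id INTEGER, age INTEGER);
    INSERT INTO patient VALUES (1, 30), (2, 65), (3, 72);
    """
)


def toy_text_to_sql(question, knowledge):
    # Only uses the age filter when the supplied knowledge defines it.
    if "age > 60" in knowledge:
        return "SELECT COUNT(*) FROM patient WHERE age > 60"
    return "SELECT COUNT(*) FROM patient"


candidates = ["elderly refers to age > 60", "elderly refers to patients"]
pairs = build_preference_pairs(
    conn,
    "How many elderly patients are there?",
    candidates,
    "SELECT COUNT(*) FROM patient WHERE age > 60",
    toy_text_to_sql,
)
```

Pairs in this `prompt`/`chosen`/`rejected` shape are a common input format for preference-optimization training; whether DELLM uses this exact shape is not stated in the abstract.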

Results

Task | Dataset | Metric | Value | Model
Semantic Parsing | Spider | Execution Accuracy (Dev) | 71.68 | DELLM + GPT-4
Semantic Parsing | BIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation) | Execution Accuracy % (Dev) | 48.92 | DELLM + MAC-SQL
Text-To-SQL | Spider | Execution Accuracy (Dev) | 71.68 | DELLM + GPT-4
Text-To-SQL | BIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation) | Execution Accuracy % (Dev) | 48.92 | DELLM + MAC-SQL
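
Execution accuracy, the metric reported above, scores a predicted query by running it and comparing its result with the result of the reference query, rather than comparing SQL strings. The sketch below is a simplified illustration on a toy SQLite database; the table, the queries, and the order-insensitive comparison are assumptions made here, and the official Spider and BIRD evaluation scripts may differ in details such as row ordering and value handling.

```python
import sqlite3


def run_query(conn, sql):
    """Return the result rows in sorted order, or None if the query fails."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    return sorted(rows, key=repr)


def execution_accuracy(conn, pairs):
    """Percentage of (predicted, gold) pairs whose executed results match."""
    hits = 0
    for predicted_sql, gold_sql in pairs:
        gold_rows = run_query(conn, gold_sql)
        predicted_rows = run_query(conn, predicted_sql)
        if gold_rows is not None and predicted_rows == gold_rows:
            hits += 1
    return 100.0 * hits / len(pairs)


# Toy database (an assumption for illustration, not a benchmark schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE singer (id INTEGER, name TEXT, age INTEGER);
    INSERT INTO singer VALUES (1, 'Ann', 30), (2, 'Bob', 45), (3, 'Cy', 52);
    """
)

pairs = [
    # Different SQL text, same result: counted as correct.
    ("SELECT name FROM singer WHERE age > 40",
     "SELECT name FROM singer WHERE age >= 41"),
    # Missing filter, different count: counted as wrong.
    ("SELECT COUNT(*) FROM singer",
     "SELECT COUNT(*) FROM singer WHERE age > 40"),
    # Misspelled column, query fails: counted as wrong.
    ("SELECT nme FROM singer",
     "SELECT name FROM singer"),
    # Two ways of getting the maximum age: counted as correct.
    ("SELECT MAX(age) FROM singer",
     "SELECT age FROM singer ORDER BY age DESC LIMIT 1"),
]

score = execution_accuracy(conn, pairs)
print(f"Execution accuracy: {score:.2f}")
```

Because only results are compared, a prediction can be scored as correct while being written very differently from the reference query, and a query that happens to return the right rows on one database instance also counts as correct.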

Related Papers

CogniSQL-R1-Zero: Lightweight Reinforced Reasoning for Efficient SQL Generation (2025-07-08)
XiYan-SQL: A Novel Multi-Generator Framework For Text-to-SQL (2025-07-07)
SWE-SQL: Illuminating LLM Pathways to Solve User SQL Issues in Real-World Applications (2025-06-23)
Schema-R1: A reasoning training approach for schema linking in Text-to-SQL Task (2025-06-13)
Bridging the Gap Between Open-Source and Proprietary LLMs in Table QA (2025-06-11)
LLM-Driven Data Generation and a Novel Soft Metric for Evaluating Text-to-SQL in Aviation MRO (2025-06-11)
HI-SQL: Optimizing Text-to-SQL Systems through Dynamic Hint Integration (2025-06-11)
SEED: Enhancing Text-to-SQL Performance and Practical Usability Through Automatic Evidence Generation (2025-06-09)