Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

The Death of Schema Linking? Text-to-SQL in the Age of Well-Reasoned Language Models

Karime Maamari, Fadhil Abubaker, Daniel Jaroslawicz, Amine Mhedhbi

2024-08-14 | Text-To-SQL, Natural Language Queries

Abstract

Schema linking is a crucial step in Text-to-SQL pipelines. Its goal is to retrieve the relevant tables and columns of a target database for a user's query while disregarding irrelevant ones. However, imperfect schema linking can often exclude columns required for accurate query generation. In this work, we revisit schema linking when using the latest generation of large language models (LLMs). We find empirically that newer models are adept at utilizing relevant schema elements during generation even in the presence of large numbers of irrelevant ones. As such, our Text-to-SQL pipeline entirely forgoes schema linking in cases where the schema fits within the model's context window, in order to minimize issues caused by filtering out required schema elements. Furthermore, instead of filtering contextual information, we highlight techniques such as augmentation, selection, and correction, and adopt them to improve the accuracy of our Text-to-SQL pipeline. Our approach ranks first on the BIRD benchmark, achieving an accuracy of 71.83%.
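
The decision described in the abstract (pass the full schema when it fits in the context window, and filter only otherwise) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `Table`, `render_schema`, `estimate_tokens`, `keyword_link`, and `build_prompt_schema` are hypothetical, the four-characters-per-token estimate is a crude assumption, and the keyword matcher is only a stand-in for a real schema linker.

```python
from dataclasses import dataclass


@dataclass
class Table:
    name: str
    columns: list[str]


def render_schema(tables: list[Table]) -> str:
    """Serialize tables as one 'name(col, col, ...)' line each."""
    return "\n".join(f"{t.name}({', '.join(t.columns)})" for t in tables)


def estimate_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token. A real pipeline
    # would count tokens with the target model's own tokenizer.
    return (len(text) + 3) // 4


def keyword_link(tables: list[Table], question: str) -> list[Table]:
    """Hypothetical stand-in for a schema linker: keep a table if its
    name or any of its column names occurs in the question text."""
    q = question.lower()
    return [
        t for t in tables
        if t.name.lower() in q or any(c.lower() in q for c in t.columns)
    ]


def build_prompt_schema(
    tables: list[Table], question: str, context_budget: int
) -> tuple[str, bool]:
    """Return (schema_text, linking_used).

    If the full schema fits within the token budget, schema linking is
    skipped and every table is passed to the model. Filtering is used
    only as a fallback when the schema is too large.
    """
    full = render_schema(tables)
    if estimate_tokens(full) <= context_budget:
        return full, False
    return render_schema(keyword_link(tables, question)), True


if __name__ == "__main__":
    schema = [
        Table("singer", ["id", "name"]),
        Table("concert", ["id", "venue"]),
    ]
    text, linked = build_prompt_schema(
        schema, "How many concerts are there?", context_budget=100
    )
    print("linking used:", linked)
    print(text)
```

With a generous budget the full schema is returned untouched, so no required column can be filtered out by mistake. The fallback branch shows where that risk reappears once the schema no longer fits.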

Results

Task | Dataset | Metric | Value | Model
Semantic Parsing | BIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation) | Execution Accuracy % (Dev) | 67.21 | Distillery + GPT-4o
Semantic Parsing | BIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation) | Execution Accuracy % (Test) | 71.83 | Distillery + GPT-4o
Text-To-SQL | BIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation) | Execution Accuracy % (Dev) | 67.21 | Distillery + GPT-4o
Text-To-SQL | BIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation) | Execution Accuracy % (Test) | 71.83 | Distillery + GPT-4o

Related Papers

CogniSQL-R1-Zero: Lightweight Reinforced Reasoning for Efficient SQL Generation (2025-07-08)
XiYan-SQL: A Novel Multi-Generator Framework For Text-to-SQL (2025-07-07)
SPAZER: Spatial-Semantic Progressive Reasoning Agent for Zero-shot 3D Visual Grounding (2025-06-27)
A Modular Multitask Reasoning Framework Integrating Spatio-temporal Models and LLMs (2025-06-25)
Towards Probabilistic Question Answering Over Tabular Data (2025-06-25)
SWE-SQL: Illuminating LLM Pathways to Solve User SQL Issues in Real-World Applications (2025-06-23)
Schema-R1: A reasoning training approach for schema linking in Text-to-SQL Task (2025-06-13)
Invocable APIs derived from NL2SQL datasets for LLM Tool-Calling Evaluation (2025-06-12)