Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Semantic Parsing on BIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation)

Metric: Execution Accuracy % (Test) (higher is better)
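
As a rough illustration of the metric, execution accuracy can be sketched as the percentage of examples where the predicted SQL returns the same rows as the reference SQL when both run against the database. The sketch below is not the benchmark's official evaluator. It assumes an SQLite database, compares results as unordered sets of rows, and treats a predicted query that fails to execute as incorrect. The helper names (`execution_match`, `execution_accuracy`, `demo`) and the toy `singer` table are hypothetical.

```python
import sqlite3


def execution_match(conn, predicted_sql, gold_sql):
    """Return True when both queries yield the same set of rows.

    Assumption: rows are compared as unordered sets, so row order and
    duplicate rows are ignored.
    """
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # Assumption: a predicted query that cannot run counts as wrong.
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    return set(predicted_rows) == set(gold_rows)


def execution_accuracy(conn, pairs):
    """Percentage of (predicted, gold) query pairs whose results match."""
    if not pairs:
        return 0.0
    hits = sum(execution_match(conn, pred, gold) for pred, gold in pairs)
    return 100.0 * hits / len(pairs)


def demo():
    """Score four toy query pairs against a hypothetical in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE singer (name TEXT, age INTEGER)")
    conn.executemany(
        "INSERT INTO singer VALUES (?, ?)",
        [("Ann", 30), ("Bo", 41), ("Cy", 25)],
    )
    pairs = [
        # Different SQL text, same result rows: counts as correct.
        ("SELECT name FROM singer WHERE age > 28",
         "SELECT name FROM singer WHERE age >= 30"),
        # Runs, but returns extra rows: counts as wrong.
        ("SELECT name FROM singer",
         "SELECT name FROM singer WHERE age < 30"),
        # Misspelled column, fails to execute: counts as wrong.
        ("SELECT nme FROM singer",
         "SELECT name FROM singer"),
        # Equivalent aggregates: counts as correct.
        ("SELECT COUNT(*) FROM singer",
         "SELECT COUNT(name) FROM singer"),
    ]
    score = execution_accuracy(conn, pairs)
    conn.close()
    return score


if __name__ == "__main__":
    print(f"Execution accuracy: {demo():.2f}%")
```

Because the comparison is on query results rather than SQL text, two differently written queries can both count as correct, which is why the metric is reported as a percentage of matching executions.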

Results

| # | Model | Execution Accuracy % (Test) | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | XiYan-SQL | 75.63 | No | A Preview of XiYan-SQL: A Multi-Generator Ensemb... | 2024-11-13 | Code |
| 2 | DSAIR + GPT-4o | 74.12 | No | - | - | - |
| 3 | CHASE-SQL + Gemini | 74.06 | No | CHASE-SQL: Multi-Path Reasoning and Preference O... | 2024-10-02 | - |
| 4 | ExSL + granite-34b-code | 73.17 | No | - | - | - |
| 5 | OpenSearch-SQL+ v2 + GPT-4o | 72.28 | No | - | - | - |
| 6 | Distillery + GPT-4o | 71.83 | No | The Death of Schema Linking? Text-to-SQL in the ... | 2024-08-14 | - |
| 7 | Insights AI | 70.26 | No | - | - | - |
| 8 | PURPLE + RED + GPT-4o | 70.21 | No | - | - | - |
| 9 | MCTS-SQL | 69.4 | No | - | - | - |
| 10 | RECAP + Gemini | 69.03 | No | - | - | - |
| 11 | ByteBrain | 68.87 | No | - | - | - |
| 12 | ExSL + granite-20b-code | 67.86 | No | - | - | - |
| 13 | CHESS | 66.69 | No | CHESS: Contextual Harnessing for Efficient SQL S... | 2024-05-27 | Code |
| 14 | Arcwise + GPT-4o | 66.21 | No | - | - | - |
| 15 | MCS-SQL + GPT-4 | 65.45 | No | - | - | - |
| 16 | SCL-SQL | 65.23 | No | - | - | - |
| 17 | OpenSearch-SQL v1 + GPT-4 | 64.95 | No | - | - | - |
| 18 | PB-SQL v1 | 64.84 | No | - | - | - |
| 19 | PURPLE + GPT-4o | 64.51 | No | - | - | - |
| 20 | MSL-SQL + DeepSeek-V2.5 | 64 | No | - | - | - |
| 21 | SENSE-13B | 63.39 | No | - | - | - |
| 22 | SENSE | 63.39 | No | - | - | - |
| 23 | GRA-SQL | 63.22 | No | - | - | - |
| 24 | SuperSQL | 62.66 | No | - | - | - |
| 25 | Dubo-SQL, v1 | 60.71 | No | - | - | - |
| 26 | SFT CodeS-15B | 60.37 | No | - | - | - |
| 27 | MAC-SQL + GPT-4 | 59.59 | No | MAC-SQL: A Multi-Agent Collaborative Framework f... | 2023-12-18 | Code |
| 28 | SFT CodeS-7B | 59.25 | No | - | - | - |
| 29 | DAIL-SQL + GPT-4 | 57.41 | No | Text-to-SQL Empowered by Large Language Models: ... | 2023-08-29 | Code |
| 30 | DIN-SQL + GPT-4 | 55.9 | No | DIN-SQL: Decomposed In-Context Learning of Text-... | 2023-04-21 | Code |
| 31 | GPT-4 (Baseline) | 54.89 | No | Can LLMs Effectively Leverage Graph Structural I... | 2023-09-28 | Code |
| 32 | Claude-2 (Baseline) | 49.02 | No | Can LLMs Effectively Leverage Graph Structural I... | 2023-09-28 | Code |
| 33 | Open SQL-7B | 47.74 | No | - | - | - |
| 34 | CoT + ChatGPT | 40.08 | No | Can LLM Already Serve as A Database Interface? A... | 2023-05-04 | Code |
| 35 | ChatGPT (Baseline) | 39.3 | No | Can LLM Already Serve as A Database Interface? A... | 2023-05-04 | Code |
| 36 | Codex (Baseline) | 36.47 | No | Can LLM Already Serve as A Database Interface? A... | 2023-05-04 | Code |
| 37 | Palm-2 (Baseline) | 33.04 | No | Can LLM Already Serve as A Database Interface? A... | 2023-05-04 | Code |