Content Enhanced BERT-based Text-to-SQL Generation

Tong Guo, Huilin Gao

2019-10-16 Semantic Parsing, Text-To-SQL, Code Generation

Abstract

We present a simple method that leverages table content in a BERT-based model for the text-to-SQL problem. Based on the observation that some table cells and some table headers match words in the question string, we encode two additional feature vectors for the deep model. Our method also benefits inference at test time, since the tables are almost the same at training and test time. We test our model on the WikiSQL dataset and outperform the BERT-based baseline by 3.7% in logical form accuracy and 3.7% in execution accuracy, achieving state-of-the-art results.
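
The two feature vectors described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the matching rule, so everything below is an assumption for illustration: the function names (`question_marks`, `header_marks`), the whitespace tokenizer, and the exact-match rule on lowercased tokens are hypothetical and are not taken from the authors' code.

```python
from typing import List, Sequence


def _norm(text: str) -> List[str]:
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def _find(haystack: Sequence[str], needle: Sequence[str]) -> int:
    """Return the start index of the first occurrence of needle, or -1."""
    n = len(needle)
    if n == 0:
        return -1
    for i in range(len(haystack) - n + 1):
        if list(haystack[i:i + n]) == list(needle):
            return i
    return -1


def question_marks(question: str, cells: List[str]) -> List[int]:
    """One entry per question token: 1 if a table cell covers it, else 0.

    Assumption: only the first occurrence of each cell value is marked.
    """
    tokens = _norm(question)
    marks = [0] * len(tokens)
    for cell in cells:
        cell_tokens = _norm(str(cell))
        start = _find(tokens, cell_tokens)
        if start >= 0:
            for j in range(start, start + len(cell_tokens)):
                marks[j] = 1
    return marks


def header_marks(question: str, headers: List[str]) -> List[int]:
    """One entry per table header: 1 if the header appears in the question."""
    tokens = _norm(question)
    return [1 if _find(tokens, _norm(h)) >= 0 else 0 for h in headers]


if __name__ == "__main__":
    # Hypothetical table and question, made up for this example.
    question = "what position does the player from duke play"
    headers = ["Player", "Position", "School"]
    cells = ["Grant Hill", "Forward", "Duke"]
    print(question_marks(question, cells))
    print(header_marks(question, headers))
```

In a model along these lines, each vector would presumably be embedded and combined with the BERT representations of the question tokens and headers; how the authors do this is not stated in the abstract.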

Results

Task	Dataset	Metric	Value	Model
Code Generation	WikiSQL	Exact Match Accuracy	83.7	NL2SQL-RULE
Code Generation	WikiSQL	Execution Accuracy	89.2	NL2SQL-RULE
Semantic Parsing	WikiSQL	Accuracy	89	NL2SQL-BERT
