KaggleDBQA

KaggleDBQA: Realistic Text-to-SQL dataset

TextsIntroduced 2021-06-22

KaggleDBQA is a challenging cross-domain and complex evaluation dataset of real Web databases, with domain-specific data types, original formatting, and unrestricted questions.

It expands upon contemporary cross-domain text-to-SQL datasets in three key aspects: (1) Its databases are pulled from real-world data sources and not normalized. (2) Its questions are authored in environments that mimic natural question answering. (3) It also provides database documentation that contains rich in-domain knowledge.