TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/Unlocking Compositional Generalization in Pre-trained Mode...

Unlocking Compositional Generalization in Pre-trained Models Using Intermediate Representations

Jonathan Herzig, Peter Shaw, Ming-Wei Chang, Kelvin Guu, Panupong Pasupat, Yuan Zhang

2021-04-15Semantic ParsingText-To-SQL
PaperPDFCodeCode(official)

Abstract

Sequence-to-sequence (seq2seq) models are prevalent in semantic parsing, but have been found to struggle at out-of-distribution compositional generalization. While specialized model architectures and pre-training of seq2seq models have been proposed to address this issue, the former often comes at the cost of generality and the latter only shows limited success. In this paper, we study the impact of intermediate representations on compositional generalization in pre-trained seq2seq models, without changing the model architecture at all, and identify key aspects for designing effective representations. Instead of training to directly map natural language to an executable form, we map to a reversible or lossy intermediate representation that has stronger structural correspondence with natural language. The combination of our proposed intermediate representations and pre-trained models is surprisingly effective, where the best combinations obtain a new state-of-the-art on CFQ (+14.8 accuracy points) and on the template-splits of three text-to-SQL datasets (+15.0 to +19.4 accuracy points). This work highlights that intermediate representations provide an important and potentially overlooked degree of freedom for improving the compositional generalization abilities of pre-trained seq2seq models.

Results

TaskDatasetMetricValueModel
Semantic ParsingCFQExact Match83.8T5-3B w/ Intermediate Representations

Related Papers

CogniSQL-R1-Zero: Lightweight Reinforced Reasoning for Efficient SQL Generation2025-07-08XiYan-SQL: A Novel Multi-Generator Framework For Text-to-SQL2025-07-07Where, What, Why: Towards Explainable Driver Attention Prediction2025-06-29SWE-SQL: Illuminating LLM Pathways to Solve User SQL Issues in Real-World Applications2025-06-23Schema-R1: A reasoning training approach for schema linking in Text-to-SQL Task2025-06-13Bridging the Gap Between Open-Source and Proprietary LLMs in Table QA2025-06-11LLM-Driven Data Generation and a Novel Soft Metric for Evaluating Text-to-SQL in Aviation MRO2025-06-11HI-SQL: Optimizing Text-to-SQL Systems through Dynamic Hint Integration2025-06-11