Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

The impact of lexical and grammatical processing on generating code from natural language

Nathanaël Beau, Benoît Crabbé

Published: 2022-02-28
Venue: Findings (ACL) 2022 5
Tasks: Code Translation, Translation, Code Generation

Abstract

Considering the seq2seq architecture of TranX for natural language to code translation, we identify four key components of importance: grammatical constraints, lexical preprocessing, input representations, and copy mechanisms. To study the impact of these components, we use a state-of-the-art architecture that relies on a BERT encoder and a grammar-based decoder, for which a formalization is provided. The paper highlights the importance of the lexical substitution component in current natural-language-to-code systems.
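To make the lexical substitution idea concrete, here is a minimal sketch, not the authors' implementation. It assumes that literals in the natural language intent are delimited by backticks (identifiers) or double quotes (strings), and it uses a hypothetical placeholder scheme (`var_0`, `str_0`, ...). The function names `substitute` and `restore` are illustrative only.

```python
import re

# Assumption: literals in the intent are wrapped in backticks or double quotes.
_QUOTED = re.compile(r'`[^`]*`|"[^"]*"')


def substitute(intent):
    """Replace quoted spans with placeholders; return (new_intent, mapping)."""
    mapping = {}  # placeholder -> original literal

    def repl(match):
        token = match.group(0)
        literal = token[1:-1]
        prefix = "var" if token[0] == "`" else "str"
        # Reuse the placeholder if the same literal of the same kind reappears.
        for key, value in mapping.items():
            if value == literal and key.startswith(prefix):
                return key
        count = sum(1 for key in mapping if key.startswith(prefix))
        key = f"{prefix}_{count}"
        mapping[key] = literal
        return key

    return _QUOTED.sub(repl, intent), mapping


def restore(code, mapping):
    """Put the original literals back into generated code in a single pass."""
    if not mapping:
        return code
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, mapping)) + r")\b")

    def repl(match):
        key = match.group(1)
        value = mapping[key]
        # String placeholders become Python string literals; identifiers are inlined.
        return repr(value) if key.startswith("str_") else value

    return pattern.sub(repl, code)


intent, mapping = substitute('sort list `xs` by key "name"')
print(intent)   # sort list var_0 by key str_0
print(restore("sorted(var_0, key=lambda d: d[str_0])", mapping))
```

In a pipeline of this kind, the model would see only the placeholder version of the intent and emit code over placeholders, so rare literals never need to be generated token by token. How the paper's systems actually detect and canonicalize literals may differ from this sketch.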

Results

Task             Dataset  Metric                Value  Model
Code Generation  CoNaLa   BLEU                  34.2   TranX + BERT w/mined
Code Generation  CoNaLa   Exact Match Accuracy  5.8    TranX + BERT w/mined
Code Generation  Django   Accuracy              81.03  TranX + BERT w/mined
Code Generation  Django   BLEU Score            79.86  TranX + BERT w/mined
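
For readers unfamiliar with the exact match metric, the following is a rough sketch of how such a score could be computed as a percentage. The whitespace normalization is an assumption made for illustration; the evaluation scripts behind the reported numbers may compare tokenized or otherwise canonicalized code.

```python
def exact_match_accuracy(predictions, references):
    """Percentage of predictions identical to their reference snippet.

    Assumption: snippets are compared after collapsing runs of whitespace.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0

    def normalize(snippet):
        return " ".join(snippet.split())

    hits = sum(
        normalize(pred) == normalize(ref)
        for pred, ref in zip(predictions, references)
    )
    return 100.0 * hits / len(references)


score = exact_match_accuracy(["x = 1", "print(a)"], ["x  =  1", "print(b)"])
print(score)  # 50.0
```

Exact match is strict: a prediction that is functionally equivalent but written differently counts as a miss, which helps explain why exact match can be far lower than BLEU on the same dataset.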

Related Papers

CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning (2025-07-18)
A Translation of Probabilistic Event Calculus into Markov Decision Processes (2025-07-17)
Towards Formal Verification of LLM-Generated Code from Natural Language Prompts (2025-07-17)
MERA Code: A Unified Framework for Evaluating Code Generation Across Tasks (2025-07-16)
Scaling Up RL: Unlocking Diverse Reasoning in LLMs via Prolonged Training (2025-07-16)
Function-to-Style Guidance of LLMs for Code Translation (2025-07-15)
The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs (2025-07-15)
Kodezi Chronos: A Debugging-First Language Model for Repository-Scale, Memory-Driven Code Understanding (2025-07-14)