Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


To Err Is Human, but Llamas Can Learn It Too

Agnes Luhtaru, Taido Purason, Martin Vainikko, Maksym Del, Mark Fishel

2024-03-08 · Grammatical Error Correction
Paper · PDF · Code (official)

Abstract

This study explores enhancing grammatical error correction (GEC) through artificial error generation (AEG) using language models (LMs). Specifically, we fine-tune Llama 2-based LMs for error generation and find that this approach yields synthetic errors akin to human errors. Next, we train GEC Llama models on these artificial errors and outperform previous state-of-the-art error correction models, with gains ranging between 0.8 and 6 F0.5 points across all tested languages (German, Ukrainian, and Estonian). Moreover, we demonstrate that generating errors by fine-tuning smaller sequence-to-sequence models and prompting large commercial LMs (GPT-3.5 and GPT-4) also results in synthetic errors that beneficially affect error correction models.
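The F0.5 gains reported in the abstract refer to the standard GEC evaluation score, which weights precision twice as heavily as recall (correcting errors wrongly is penalized more than missing some). A minimal sketch of the computation, with hypothetical edit counts chosen for illustration only:

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """F-beta from true-positive, false-positive, and false-negative edit counts.

    beta=0.5 gives the F0.5 score used for GEC; beta=1.0 gives plain F1.
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical system: 80 correct edits, 20 spurious edits, 40 missed errors.
print(round(f_beta(tp=80, fp=20, fn=40), 4))  # precision 0.8, recall ~0.667
```

With beta=0.5 the score (about 0.77 here) sits closer to precision than to recall, which is why GEC systems are typically tuned to be conservative about proposing edits.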

Results

Task                          | Dataset      | Metric | Value | Model
Grammatical Error Correction  | EstGEC-L2    | F0.5   | 69.97 | Llama + 1M BT + gold
Grammatical Error Correction  | Falko-MERLIN | F0.5   | 76.75 | Llama + 1M BT + gold
Grammatical Error Correction  | UA-GEC       | F0.5   | 74.09 | Llama + 1M BT + gold

Related Papers

End-to-End Spoken Grammatical Error Correction (2025-06-23)
IMPARA-GED: Grammatical Error Detection is Boosting Reference-free Grammatical Error Quality Estimator (2025-06-03)
Scaling and Prompting for Improved End-to-End Spoken Grammatical Error Correction (2025-05-27)
gec-metrics: A Unified Library for Grammatical Error Correction Evaluation (2025-05-26)
Exploring the Feasibility of Multilingual Grammatical Error Correction with a Single LLM up to 9B parameters: A Comparative Study of 17 Models (2025-05-09)
Enriching the Korean Learner Corpus with Multi-reference Annotations and Rubric-Based Scoring (2025-05-01)
Deep Learning Model Deployment in Multiple Cloud Providers: an Exploratory Study Using Low Computing Power Environments (2025-03-31)
Enhancing Text Editing for Grammatical Error Correction: Arabic as a Case Study (2025-03-02)