Kostiantyn Omelianchuk, Andrii Liubonko, Oleksandr Skurzhanskyi, Artem Chernodub, Oleksandr Korniienko, Igor Samokhin
In this paper, we carry out experimental research on Grammatical Error Correction (GEC), delving into the nuances of single-model systems, comparing the efficiency of ensembling and ranking methods, and exploring the application of large language models to GEC as single-model systems, as parts of ensembles, and as ranking methods. We set new state-of-the-art performance with F_0.5 scores of 72.8 on CoNLL-2014-test and 81.4 on BEA-test. To support further advancements in GEC and ensure the reproducibility of our research, we make our code, trained models, and systems' outputs publicly available.
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Grammatical Error Correction | CoNLL-2014 Shared Task | F0.5 | 72.8 | Ensemble of best 7 models + GRECO + GPT-rerank |
| Grammatical Error Correction | CoNLL-2014 Shared Task | Precision | 83.9 | Ensemble of best 7 models + GRECO + GPT-rerank |
| Grammatical Error Correction | CoNLL-2014 Shared Task | Recall | 47.5 | Ensemble of best 7 models + GRECO + GPT-rerank |
| Grammatical Error Correction | CoNLL-2014 Shared Task | F0.5 | 71.8 | Majority-voting ensemble of best 7 models |
| Grammatical Error Correction | CoNLL-2014 Shared Task | Precision | 83.7 | Majority-voting ensemble of best 7 models |
| Grammatical Error Correction | CoNLL-2014 Shared Task | Recall | 45.7 | Majority-voting ensemble of best 7 models |
| Grammatical Error Correction | BEA-2019 (test) | F0.5 | 81.4 | Majority-voting ensemble of best 7 models |
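The majority-voting ensemble in the table can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes each system's output is reduced to a set of hashable edit spans (e.g. `(start, end, replacement)` tuples such as those produced by an alignment tool like ERRANT), and keeps only edits proposed by a strict majority of systems.

```python
from collections import Counter

def majority_vote(edit_sets, min_votes=None):
    """Keep only edits proposed by at least `min_votes` systems.

    Each element of `edit_sets` is an iterable of hashable edits,
    e.g. (start, end, replacement) span tuples. By default an edit
    must be proposed by a strict majority of the systems to survive.
    """
    if min_votes is None:
        min_votes = len(edit_sets) // 2 + 1  # strict majority
    counts = Counter(e for edits in edit_sets for e in set(edits))
    return {e for e, n in counts.items() if n >= min_votes}

# Hypothetical edits from three systems on one sentence:
systems = [
    {(0, 1, "The"), (3, 4, "went")},
    {(0, 1, "The"), (3, 4, "goes")},
    {(0, 1, "The")},
]
print(majority_vote(systems))  # only the unanimous edit survives: {(0, 1, 'The')}
```

Because only widely agreed-upon edits are kept, this style of ensembling trades recall for precision, which is consistent with the high-precision/lower-recall profile in the table above.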