A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction

Shamil Chollampatt, Hwee Tou Ng

2018-01-26 | Translation, Grammatical Error Correction, Language Modelling

Abstract

We improve automatic correction of grammatical, orthographic, and collocation errors in text using a multilayer convolutional encoder-decoder neural network. The network is initialized with embeddings that make use of character N-gram information to better suit this task. When evaluated on common benchmark test data sets (CoNLL-2014 and JFLEG), our model substantially outperforms all prior neural approaches on this task as well as strong statistical machine translation-based systems with neural and task-specific features trained on the same data. Our analysis shows the superiority of convolutional neural networks over recurrent neural networks such as long short-term memory (LSTM) networks in capturing the local context via attention, and thereby improving the coverage in correcting grammatical errors. By ensembling multiple models, and incorporating an N-gram language model and edit features via rescoring, our novel method becomes the first neural approach to outperform the current state-of-the-art statistical machine translation-based approach, both in terms of grammaticality and fluency.
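The rescoring step mentioned at the end of the abstract can be pictured as picking, from a list of candidate corrections, the one with the best weighted sum of feature scores. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `Candidate` fields, the `rescore` function, the three feature names, and the weight values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One candidate correction with illustrative (hypothetical) features."""
    text: str
    model_score: float  # log-probability from the (ensembled) encoder-decoder
    lm_score: float     # log-probability from an N-gram language model
    num_edits: int      # number of edits relative to the source sentence


def rescore(candidates, weights):
    """Return the candidate with the highest weighted sum of its features."""
    def total(c):
        return (weights["model"] * c.model_score
                + weights["lm"] * c.lm_score
                + weights["edits"] * c.num_edits)
    return max(candidates, key=total)


# Made-up scores: the second candidate has a lower model score, but the
# language model prefers it strongly enough to win after rescoring.
candidates = [
    Candidate("He go to school .", model_score=-2.0, lm_score=-10.0, num_edits=0),
    Candidate("He goes to school .", model_score=-2.5, lm_score=-6.0, num_edits=1),
]
weights = {"model": 1.0, "lm": 0.5, "edits": -0.1}  # assumed, untuned weights
best = rescore(candidates, weights)
```

In a real system the weights would typically be tuned on a development set rather than set by hand, and the feature set would follow whatever the paper specifies.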

Results

Task	Dataset	Metric	Value	Model
Grammatical Error Correction	CoNLL-2014 Shared Task	F0.5	54.79	CNN Seq2Seq
Grammatical Error Correction	JFLEG	GLEU	57.47	CNN Seq2Seq
Grammatical Error Correction	CoNLL-2014 Shared Task (10 annotations)	F0.5	70.14	CNN Seq2Seq
Grammatical Error Correction	_Restricted_	GLEU	57.47	CNN Seq2Seq
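
The F0.5 metric in the table is the F-beta score with beta = 0.5, which weights precision more heavily than recall. A small sketch of the standard formula follows; the function name `f_beta` and the precision and recall values in the usage lines are illustrative assumptions, not figures from the paper.

```python
def f_beta(precision, recall, beta=0.5):
    """F-beta score: (1 + beta^2) * P * R / (beta^2 * P + R)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing is correct
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Illustrative values only: two systems with the same F1 can differ in F0.5,
# because F0.5 rewards the more precise system.
precise_system = f_beta(0.6, 0.3)  # higher precision, lower recall
recall_system = f_beta(0.3, 0.6)   # lower precision, higher recall
```

The table reports the score as a percentage, so a value of 0.5479 from this function would correspond to 54.79.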
