
FCGEC: Fine-Grained Corpus for Chinese Grammatical Error Correction

Lvxiaowei Xu, Jianwang Wu, Jiawei Peng, Jiayu Fu, Ming Cai

2022-10-22 | Grammatical Error Correction, Grammatical Error Detection

Abstract

Grammatical Error Correction (GEC) has recently been broadly applied in automatic correction and proofreading systems. However, Chinese GEC is still immature because high-quality data from native speakers is limited in both category and scale. In this paper, we present FCGEC, a fine-grained corpus for detecting, identifying, and correcting grammatical errors. FCGEC is a human-annotated corpus with multiple references, consisting of 41,340 sentences collected mainly from multiple-choice questions in public school Chinese examinations. Furthermore, we propose a Switch-Tagger-Generator (STG) baseline model to correct grammatical errors in low-resource settings. Experimental results show that STG outperforms other GEC benchmark models on FCGEC. However, a significant gap remains between benchmark models and humans, which future models are encouraged to bridge.

Results

Task | Dataset | Metric | Value | Model
Grammatical Error Correction | FCGEC | F0.5 | 45.48 | STG-Joint
Grammatical Error Correction | FCGEC | exact match | 34.1 | STG-Joint
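
The two metrics reported above can be illustrated with a minimal sketch. It assumes that edit-level precision and recall have already been computed, and that exact match counts a prediction as correct when it is identical to any one of its references. This page does not describe how the FCGEC evaluation extracts edits or handles multiple references, so the function names and the multi-reference rule here are illustrative assumptions, not the official scorer.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """F-beta score; beta < 1 weights precision more heavily than recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def exact_match(predictions: list[str], references: list[list[str]]) -> float:
    """Fraction of predictions identical to at least one of their references.

    Assumption: a sentence counts as matched if it equals any reference.
    """
    if not predictions:
        return 0.0
    hits = sum(pred in refs for pred, refs in zip(predictions, references))
    return hits / len(predictions)


if __name__ == "__main__":
    # Hypothetical inputs, not figures from the paper.
    print(f_beta(precision=0.5, recall=0.25))
    print(exact_match(["a", "b"], [["a", "x"], ["c"]]))
```

With beta set to 0.5, a system that proposes few but accurate corrections scores higher than one with the same recall and lower precision, which is the usual motivation for preferring F0.5 in GEC.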

Related Papers

End-to-End Spoken Grammatical Error Correction (2025-06-23)
IMPARA-GED: Grammatical Error Detection is Boosting Reference-free Grammatical Error Quality Estimator (2025-06-03)
Scaling and Prompting for Improved End-to-End Spoken Grammatical Error Correction (2025-05-27)
gec-metrics: A Unified Library for Grammatical Error Correction Evaluation (2025-05-26)
Exploring the Feasibility of Multilingual Grammatical Error Correction with a Single LLM up to 9B parameters: A Comparative Study of 17 Models (2025-05-09)
Detecting Spelling and Grammatical Anomalies in Russian Poetry Texts (2025-05-07)
Enriching the Korean Learner Corpus with Multi-reference Annotations and Rubric-Based Scoring (2025-05-01)
ARWI: Arabic Write and Improve (2025-04-16)