Majority-voting ensemble on best 7 models

Reported on 4 benchmarks across 1 task · 1 paper · 1 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Natural Language Processing · 4 results

Grammatical Error Correction on BEA-2019 (test)
F0.5 · 2024-04-23
81.4
SOTA
Pillars of Grammatical Error Correction: Comprehensive Inspection Of Contemporary Approaches In The Era of Large Language Models arXiv:2404.14914
Grammatical Error Correction on CoNLL-2014 Shared Task
F0.5 · 2024-04-23
71.8
best: 72.8 (Ensembles of best 7 models + GRECO + GTP-rerank)
Pillars of Grammatical Error Correction: Comprehensive Inspection Of Contemporary Approaches In The Era of Large Language Models arXiv:2404.14914
Grammatical Error Correction on CoNLL-2014 Shared Task
Precision · 2024-04-23
83.7
best: 83.9 (Ensembles of best 7 models + GRECO + GTP-rerank)
Pillars of Grammatical Error Correction: Comprehensive Inspection Of Contemporary Approaches In The Era of Large Language Models arXiv:2404.14914
Grammatical Error Correction on CoNLL-2014 Shared Task
Recall · 2024-04-23
45.7
best: 53.8 (Unsupervised GEC + cLang8)
Pillars of Grammatical Error Correction: Comprehensive Inspection Of Contemporary Approaches In The Era of Large Language Models arXiv:2404.14914
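
The three CoNLL-2014 figures above can be cross-checked against each other, since F0.5 is the weighted harmonic mean of precision and recall with precision counted twice as heavily as recall. The sketch below is only a sanity check, not the benchmark's scorer: the function name `f_beta` is hypothetical, and the inputs are the rounded percentages listed on this page, whereas an official score would be computed from raw edit counts.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Weighted harmonic mean of precision and recall.

    beta < 1 weights precision more heavily than recall; beta = 0.5 is the
    setting behind the F0.5 columns above. Inputs and output share one scale
    (here: percent).
    """
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0:
        # Both precision and recall are zero: define the score as zero.
        return 0.0
    return (1 + b2) * precision * recall / denom


# CoNLL-2014 precision and recall as listed above (rounded, in percent).
score = f_beta(83.7, 45.7)
print(f"{score:.1f}")  # prints 71.8
```

Recomputing from the listed precision (83.7) and recall (45.7) gives about 71.8, which agrees with the F0.5 entry for the same benchmark. The low recall is what holds the score down: with beta = 0.5 a system is rewarded mainly for making few wrong corrections, which fits an ensemble that keeps only the edits most of its members agree on.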