Marek Rei
We propose a sequence labeling framework with a secondary training objective, learning to predict surrounding words for every word in the dataset. This language modeling objective incentivises the system to learn general-purpose patterns of semantic and syntactic composition, which are also useful for improving accuracy on different sequence labeling tasks. The architecture was evaluated on a range of datasets, covering the tasks of error detection in learner texts, named entity recognition, chunking and POS-tagging. The novel language modeling objective provided consistent performance improvements on every benchmark, without requiring any additional annotated or unannotated data.
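The core idea above — adding a language modeling cost on top of a sequence labeling objective — can be sketched as a combined loss. This is a minimal toy illustration with numpy, not the paper's implementation: the hidden states stand in for Bi-LSTM outputs, all weights and dimensions are hypothetical, and `gamma` is an assumed name for the weight on the secondary objective.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(probs, targets):
    # mean negative log-likelihood of the target indices
    return -np.mean(np.log(probs[np.arange(len(targets)), targets] + 1e-12))

# toy dimensions (hypothetical): sequence length, hidden size, vocab, label count
T, H, V, L = 5, 8, 20, 3

# stand-ins for forward and backward Bi-LSTM hidden states at each position
h_fwd = rng.standard_normal((T, H))
h_bwd = rng.standard_normal((T, H))

W_lab  = rng.standard_normal((2 * H, L)) * 0.1  # label classifier
W_next = rng.standard_normal((H, V)) * 0.1      # forward LM head
W_prev = rng.standard_normal((H, V)) * 0.1      # backward LM head

words  = rng.integers(0, V, size=T)  # word ids in the sentence
labels = rng.integers(0, L, size=T)  # gold sequence labels

# main objective: predict the label from the concatenated hidden states
E_lab = cross_entropy(
    softmax(np.concatenate([h_fwd, h_bwd], axis=1) @ W_lab), labels)

# secondary LM objective: the forward state predicts the next word,
# the backward state predicts the previous word
E_next = cross_entropy(softmax(h_fwd[:-1] @ W_next), words[1:])
E_prev = cross_entropy(softmax(h_bwd[1:] @ W_prev), words[:-1])

gamma = 0.1  # assumed weight on the language modeling cost
E = E_lab + gamma * (E_next + E_prev)
```

Because the auxiliary targets are just the surrounding words, the extra supervision comes for free from the same annotated sentences — no additional data is needed, matching the claim in the abstract.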
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Part-Of-Speech Tagging | Penn Treebank | Accuracy | 97.43 | Bi-LSTM + LMcost |
| Grammatical Error Detection | CoNLL-2014 A1 | F0.5 | 17.86 | Bi-LSTM + LMcost (trained on FCE) |
| Grammatical Error Detection | CoNLL-2014 A2 | F0.5 | 25.88 | Bi-LSTM + LMcost (trained on FCE) |
| Grammatical Error Detection | FCE | F0.5 | 48.48 | Bi-LSTM + LMcost |