Stefan Schweter, Alan Akbik
Current state-of-the-art approaches for named entity recognition (NER) typically consider text at the sentence level and thus do not model information that crosses sentence boundaries. However, the use of transformer-based models for NER offers natural options for capturing document-level features. In this paper, we perform a comparative evaluation of document-level features in the two standard NER architectures commonly considered in the literature, namely "fine-tuning" and "feature-based LSTM-CRF". We evaluate different hyperparameters for document-level features, such as context window size and enforcing document-locality. We present experiments from which we derive recommendations for how to model document context, and report new state-of-the-art scores on several CoNLL-03 benchmark datasets. Our approach is integrated into the Flair framework to facilitate reproduction of our experiments.
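The core idea of document-level features is to surround each sentence with tokens from its neighboring sentences in the same document before passing it to the transformer. The following is a minimal, framework-free sketch of such context construction; the helper name `build_context` and the default window of 64 tokens are illustrative choices for this example, not the paper's exact implementation (see the Flair framework for the actual integration).

```python
def build_context(sentences, index, window=64):
    """Collect up to `window` context tokens on each side of sentences[index].

    `sentences` is a list of token lists belonging to one document, so the
    context never crosses document boundaries (document-locality).
    Returns (left_context, right_context) as token lists.
    """
    # Walk backwards through preceding sentences until enough tokens are gathered.
    left = []
    for sent in reversed(sentences[:index]):
        left = sent + left
        if len(left) >= window:
            break
    # Walk forwards through following sentences the same way.
    right = []
    for sent in sentences[index + 1:]:
        right = right + sent
        if len(right) >= window:
            break
    # Truncate to the window size, keeping the tokens nearest to the sentence.
    return left[-window:], right[:window]
```

The left and right context tokens are concatenated around the input sentence for the transformer forward pass, but their outputs are discarded: only the embeddings of the sentence's own tokens are used for tagging.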
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Named Entity Recognition (NER) | CoNLL 2003 (German) | F1 | 88.34 | FLERT XLM-R |
| Named Entity Recognition (NER) | CoNLL 2003 (English) | F1 | 94.09 | FLERT XLM-R |
| Named Entity Recognition (NER) | FindVehicle | F1 | 80.9 | FLERT |
| Named Entity Recognition (NER) | CoNLL 2002 (Spanish) | F1 | 90.14 | FLERT XLM-R |
| Named Entity Recognition (NER) | CoNLL 2002 (Dutch) | F1 | 95.21 | FLERT XLM-R |
| Named Entity Recognition (NER) | CoNLL 2003 (German) Revised | F1 | 92.23 | FLERT XLM-R |