Robust Lexical Features for Improved Neural Network Named-Entity Recognition

Abbas Ghaddar, Philippe Langlais

2018-06-09 | COLING 2018 8 | named-entity-recognition | Named Entity Recognition (NER)

Abstract

Neural network approaches to Named-Entity Recognition reduce the need for carefully hand-crafted features. While some features do remain in state-of-the-art systems, lexical features have been mostly discarded, with the exception of gazetteers. In this work, we show that this is unfair: lexical features are actually quite useful. We propose to embed words and entity types into a low-dimensional vector space we train from annotated data produced by distant supervision thanks to Wikipedia. From this, we compute - offline - a feature vector representing each word. When used with a vanilla recurrent neural network model, this representation yields substantial improvements. We establish a new state-of-the-art F1 score of 87.95 on ONTONOTES 5.0, while matching state-of-the-art performance with an F1 score of 91.73 on the over-studied CONLL-2003 dataset.
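
The abstract describes an offline step that turns a shared word/entity-type embedding space into one feature vector per word. A minimal sketch of that idea follows. It assumes the feature for a word is its cosine similarity to each entity-type vector, which the abstract does not state. The names `cosine`, `lexical_features`, `type_vecs`, and `word_vecs` are hypothetical, and the 2-dimensional embeddings are made-up toy values, not trained ones.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def lexical_features(word_vec, type_vecs):
    """Assumed feature: one similarity score per entity type, in sorted type order."""
    return [cosine(word_vec, type_vecs[t]) for t in sorted(type_vecs)]


# Toy embeddings living in the same space (illustrative values only).
type_vecs = {"LOC": [1.0, 0.0], "PER": [0.0, 1.0]}
word_vecs = {"paris": [2.0, 0.0], "alice": [0.0, 3.0]}

# Computed once, offline; a tagger would then look features up by word.
feature_table = {w: lexical_features(v, type_vecs) for w, v in word_vecs.items()}
```

Because the table is built ahead of time, the tagger only needs a dictionary lookup per token at training and inference time, and the resulting vector can be concatenated with the word's other input representations.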

Results

Task	Dataset	Metric	Value	Model
Named Entity Recognition (NER)	Ontonotes v5 (English)	F1	87.95	Bi-LSTM-CRF + Lexical Features
Named Entity Recognition (NER)	CoNLL 2003 (English)	F1	91.73	Bi-LSTM-CRF + Lexical Features
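
For readers unfamiliar with the metric in the table, here is a small sketch of entity-level F1. It assumes the common exact-match convention, in which a predicted entity counts as correct only if both its span and its type match a gold entity. The page does not say which scoring script produced the reported values, and the function name `f1_score` and the example spans are hypothetical.

```python
def f1_score(gold, predicted):
    """Exact-match entity-level F1 over (start, end, type) tuples, in [0, 1]."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up example: one of two predictions matches one of three gold entities.
gold = [(0, 1, "PER"), (3, 4, "LOC"), (6, 7, "ORG")]
pred = [(0, 1, "PER"), (3, 4, "ORG")]
score = f1_score(gold, pred)  # precision 1/2, recall 1/3
```

The table reports F1 as a percentage, so a value from this function would be multiplied by 100 for comparison.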
