Evaluating the Utility of Hand-crafted Features in Sequence Labelling

Minghao Wu, Fei Liu, Trevor Cohn

2018-08-28 | EMNLP 2018 10 | Named Entity Recognition (NER)

Abstract

Conventional wisdom is that hand-crafted features are redundant for deep learning models, as they already learn adequate representations of text automatically from corpora. In this work, we test this claim by proposing a new method for exploiting hand-crafted features as part of a novel hybrid learning approach, incorporating a feature auto-encoder loss component. We evaluate on the task of named entity recognition (NER), where we show that including manual features for part-of-speech, word shapes and gazetteers can improve the performance of a neural CRF model. We obtain an $F_1$ of 91.89 for the CoNLL-2003 English shared task, which significantly outperforms a collection of highly competitive baseline models. We also present an ablation study showing the importance of auto-encoding over using features as either inputs or outputs alone, and moreover show that including the auto-encoder components reduces training requirements to 60%, while retaining the same predictive accuracy.
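To make the idea of a feature auto-encoder loss more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical coarse word-shape scheme (`word_shape`) and a hypothetical objective (`hybrid_loss`) that adds weighted cross-entropy terms for reconstructing each hand-crafted feature to the tagging loss. The category names, the weights, and the function names are all illustrative assumptions.

```python
import math


def word_shape(token: str) -> str:
    """Map a token to a coarse shape category (hypothetical scheme)."""
    if token.isdigit():
        return "digit"
    if token.isalpha() and token.isupper():
        return "allcaps"
    if token.isalpha() and token.islower():
        return "lower"
    if token[:1].isupper() and token[1:].islower():
        return "initcap"
    return "other"


def cross_entropy(probs, gold_index):
    """Negative log-probability assigned to the gold category."""
    return -math.log(probs[gold_index])


def hybrid_loss(tagging_loss, feature_probs, feature_gold, weights):
    """Tagging loss plus weighted feature-reconstruction losses.

    feature_probs: predicted distribution per feature type, e.g. {"pos": [...]}
    feature_gold:  gold category index per feature type
    weights:       assumed per-feature weights (hyperparameters)
    """
    reconstruction = sum(
        weights[name] * cross_entropy(feature_probs[name], feature_gold[name])
        for name in feature_probs
    )
    return tagging_loss + reconstruction


if __name__ == "__main__":
    print(word_shape("Trevor"), word_shape("EMNLP"), word_shape("2003"))
    loss = hybrid_loss(
        tagging_loss=1.0,
        feature_probs={"shape": [0.5, 0.5]},
        feature_gold={"shape": 0},
        weights={"shape": 1.0},
    )
    print(round(loss, 4))
```

In this sketch the reconstruction terms act as auxiliary supervision: the model is penalised when its hidden representation cannot recover the hand-crafted feature values it was given, in addition to the usual penalty for wrong entity tags.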

Results

Task	Dataset	Metric	Value	Model
Named Entity Recognition (NER)	CoNLL 2003 (English)	F1	92.29	Neural-CRF+AE
Named Entity Recognition (NER)	CoNLL 2003 (English)	F1	91.87	CRF + AutoEncoder
