Bidirectional LSTM-CRF Models for Sequence Tagging

Zhiheng Huang, Wei Xu, Kai Yu

2015-08-09 | TAG, POS, Named Entity Recognition (NER), Chunking


Abstract

In this paper, we propose a variety of Long Short-Term Memory (LSTM) based models for sequence tagging. These models include LSTM networks, bidirectional LSTM (BI-LSTM) networks, LSTM with a Conditional Random Field (CRF) layer (LSTM-CRF), and bidirectional LSTM with a CRF layer (BI-LSTM-CRF). Our work is the first to apply a bidirectional LSTM-CRF (denoted BI-LSTM-CRF) model to NLP benchmark sequence tagging data sets. We show that the BI-LSTM-CRF model can efficiently use both past and future input features thanks to its bidirectional LSTM component. It can also use sentence-level tag information thanks to its CRF layer. The BI-LSTM-CRF model produces state-of-the-art (or close to state-of-the-art) accuracy on POS, chunking, and NER data sets. In addition, it is robust and depends less on word embeddings than previously observed.
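
To illustrate how a CRF layer can bring sentence-level tag information into the prediction, the following is a minimal sketch of Viterbi decoding over per-token tag scores. It is not the authors' implementation. The function name `viterbi_decode`, the list-of-lists inputs, and the toy numbers are assumptions made for illustration. The emission scores are assumed to come from a BI-LSTM, and the transition scores stand in for learned CRF parameters.

```python
from typing import List, Sequence, Tuple


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> Tuple[float, List[int]]:
    """Return (best_score, best_tag_path) for one sentence.

    emissions[t][j]   : score of tag j at position t (assumed BI-LSTM output)
    transitions[i][j] : score of moving from tag i to tag j (assumed CRF parameters)
    """
    if not emissions:
        return 0.0, []

    num_tags = len(emissions[0])
    scores = list(emissions[0])          # best score of any path ending in each tag
    backpointers: List[List[int]] = []   # best previous tag for each (position, tag)

    for t in range(1, len(emissions)):
        new_scores = []
        pointers = []
        for j in range(num_tags):
            # Pick the previous tag that maximizes path score plus transition score.
            best_prev = max(range(num_tags), key=lambda i: scores[i] + transitions[i][j])
            new_scores.append(scores[best_prev] + transitions[best_prev][j] + emissions[t][j])
            pointers.append(best_prev)
        scores = new_scores
        backpointers.append(pointers)

    # Follow the backpointers from the best final tag to recover the full path.
    best_last = max(range(num_tags), key=lambda j: scores[j])
    path = [best_last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return scores[best_last], path


# Toy example with hypothetical tags: 0 = O, 1 = B, 2 = I.
emissions = [[2.0, 0.5, 0.1], [0.2, 1.0, 1.2]]
no_crf = [[0.0] * 3 for _ in range(3)]   # all transitions equal: per-token argmax
with_crf = [[0.0, 0.0, -10.0],           # O -> I is strongly penalized
            [0.0, 0.0, 0.0],
            [0.0, 0.0, 0.0]]

print(viterbi_decode(emissions, no_crf)[1])    # [0, 2]
print(viterbi_decode(emissions, with_crf)[1])  # [0, 1]
```

In this toy case, the per-token scores alone pick the sequence O, I, while the transition penalty steers the decoder to O, B. This is the kind of tag-sequence constraint a CRF layer can encode.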

Results

Task	Dataset	Metric	Value	Model
Named Entity Recognition (NER)	FindVehicle	F1 score	49.5	BiLSTM-CRF
Chunking	Penn Treebank	F1 score	94.46	BI-LSTM-CRF (Senna) (ours)
Shallow Syntax	Penn Treebank	F1 score	94.46	BI-LSTM-CRF (Senna) (ours)

Related Papers

Dynamic Chunking for End-to-End Hierarchical Sequence Modeling (2025-07-10)
CLI-RAG: A Retrieval-Augmented Framework for Clinically Structured and Context Aware Text Generation with LLMs (2025-07-09)
CogniSQL-R1-Zero: Lightweight Reinforced Reasoning for Efficient SQL Generation (2025-07-08)
Flippi: End To End GenAI Assistant for E-Commerce (2025-07-08)
Selecting and Merging: Towards Adaptable and Scalable Named Entity Recognition with Large Language Models (2025-06-28)
Can LLMs Replace Humans During Code Chunking? (2025-06-24)
CronusVLA: Transferring Latent Motion Across Time for Multi-Frame Prediction in Manipulation (2025-06-24)
LLMs in Coding and their Impact on the Commercial Software Engineering Landscape (2025-06-19)