Application of a Hybrid Bi-LSTM-CRF model to the task of Russian Named Entity Recognition

L. T. Anh, M. Y. Arkhipov, M. S. Burtsev

2017-09-27 | Word Embeddings | Named Entity Recognition (NER)

Abstract

Named Entity Recognition (NER) is one of the most common tasks in natural language processing. The purpose of NER is to find tokens in text documents and classify them into predefined categories called tags, such as person names, quantity expressions, percentage expressions, names of locations and organizations, as well as expressions of time, currency, and others. Although a number of approaches have been proposed for this task in Russian, there is still substantial room for better solutions. In this work, we studied several deep neural network models, starting from a vanilla Bi-directional Long Short-Term Memory (Bi-LSTM) network, then supplementing it with Conditional Random Fields (CRF) as well as highway networks, and finally adding external word embeddings. All models were evaluated on three datasets: Gareev's dataset, Person-1000, and FactRuEval-2016. We found that extending the Bi-LSTM model with a CRF significantly increased the quality of predictions. Encoding input tokens with external word embeddings reduced training time and made it possible to achieve state-of-the-art results on the Russian NER task.
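
To illustrate why a CRF layer on top of per-token Bi-LSTM scores can improve predictions, here is a minimal sketch of Viterbi decoding over a linear-chain CRF. It is not the authors' implementation: the three-tag set, the emission scores (standing in for Bi-LSTM outputs), and the transition scores are all hypothetical values chosen by hand for the example.

```python
from typing import List, Sequence

# Hypothetical tag set for illustration only (not taken from the paper).
TAGS = ["O", "B-PER", "I-PER"]


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag index sequence.

    emissions[t][k]   : score of tag k at position t (e.g. from a Bi-LSTM).
    transitions[i][j] : score of moving from tag i to tag j.
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    score = list(emissions[0])          # best score of any path ending in each tag
    backpointers: List[List[int]] = []  # best previous tag for each (position, tag)
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(num_tags):
            best_prev = max(range(num_tags),
                            key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_prev] + transitions[best_prev][j]
                             + emissions[t][j])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(range(num_tags), key=lambda k: score[k])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hand-picked scores (assumptions): the second token slightly prefers I-PER,
# but the transition O -> I-PER is heavily penalized.
emissions = [[2.0, 0.5, 0.0],
             [0.0, 1.0, 1.2],
             [0.1, 0.0, 1.5]]
transitions = [[0.5, 0.0, -10.0],   # from O
               [0.0, -1.0, 1.0],    # from B-PER
               [0.0, -1.0, 0.5]]    # from I-PER

# Baseline without a CRF: pick the best tag for each token independently.
greedy = [max(range(len(TAGS)), key=lambda k: row[k]) for row in emissions]
decoded = viterbi_decode(emissions, transitions)

print([TAGS[k] for k in greedy])    # per-token argmax
print([TAGS[k] for k in decoded])   # CRF-style joint decoding
```

With these scores, the independent per-token choice yields O, I-PER, I-PER, which starts an entity with an inside tag, while joint decoding yields O, B-PER, I-PER. In a trained model the transition scores would be learned together with the network rather than set by hand.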

Related Papers

Speak2Sign3D: A Multi-modal Pipeline for English Speech to American Sign Language Animation (2025-07-09)
Flippi: End To End GenAI Assistant for E-Commerce (2025-07-08)
Computational Detection of Intertextual Parallels in Biblical Hebrew: A Benchmark Study Using Transformer-Based Language Models (2025-06-30)
Selecting and Merging: Towards Adaptable and Scalable Named Entity Recognition with Large Language Models (2025-06-28)
Including Semantic Information via Word Embeddings for Skeleton-based Action Recognition (2025-06-23)
Low-resource keyword spotting using contrastively trained transformer acoustic word embeddings (2025-06-21)
Characterizing Linguistic Shifts in Croatian News via Diachronic Word Embeddings (2025-06-16)
Learning Obfuscations Of LLM Embedding Sequences: Stained Glass Transform (2025-06-11)