TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/Hierarchical Meta-Embeddings for Code-Switching Named Enti...

Hierarchical Meta-Embeddings for Code-Switching Named Entity Recognition

Genta Indra Winata, Zhaojiang Lin, Jamin Shin, Zihan Liu, Pascale Fung

2019-09-18IJCNLP 2019 11named-entity-recognitionNamed Entity RecognitionWord EmbeddingsNamed Entity Recognition (NER)
PaperPDFCode(official)

Abstract

In countries that speak multiple main languages, mixing up different languages within a conversation is commonly called code-switching. Previous works addressing this challenge mainly focused on word-level aspects such as word embeddings. However, in many cases, languages share common subwords, especially for closely related languages, but also for languages that are seemingly irrelevant. Therefore, we propose Hierarchical Meta-Embeddings (HME) that learn to combine multiple monolingual word-level and subword-level embeddings to create language-agnostic lexical representations. On the task of Named Entity Recognition for English-Spanish code-switching data, our model achieves the state-of-the-art performance in the multilingual settings. We also show that, in cross-lingual settings, our model not only leverages closely related languages, but also learns from languages with different roots. Finally, we show that combining different subunits are crucial for capturing code-switching entities.

Results

TaskDatasetMetricValueModel
Named Entity Recognition (NER)Code-Switching English-Spanish NERF169.17HME (word + BPE + char)

Related Papers

Speak2Sign3D: A Multi-modal Pipeline for English Speech to American Sign Language Animation2025-07-09Flippi: End To End GenAI Assistant for E-Commerce2025-07-08Computational Detection of Intertextual Parallels in Biblical Hebrew: A Benchmark Study Using Transformer-Based Language Models2025-06-30Selecting and Merging: Towards Adaptable and Scalable Named Entity Recognition with Large Language Models2025-06-28Including Semantic Information via Word Embeddings for Skeleton-based Action Recognition2025-06-23Low-resource keyword spotting using contrastively trained transformer acoustic word embeddings2025-06-21Characterizing Linguistic Shifts in Croatian News via Diachronic Word Embeddings2025-06-16Learning Obfuscations Of LLM Embedding Sequences: Stained Glass Transform2025-06-11