Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Universal NER: A Gold-Standard Multilingual Named Entity Recognition Benchmark

Stephen Mayhew, Terra Blevins, Shuheng Liu, Marek Šuppa, Hila Gonen, Joseph Marvin Imperial, Börje F. Karlsson, Peiqin Lin, Nikola Ljubešić, LJ Miranda, Barbara Plank, Arij Riabi, Yuval Pinter

2023-11-15, arXiv 2023
Tags: Named Entity Recognition (NER), Multilingual Named Entity Recognition, Cross-Lingual NER

Abstract

We introduce Universal NER (UNER), an open, community-driven project to develop gold-standard NER benchmarks in many languages. The overarching goal of UNER is to provide high-quality, cross-lingually consistent annotations to facilitate and standardize multilingual NER research. UNER v1 contains 18 datasets annotated with named entities in a cross-lingually consistent schema across 12 diverse languages. In this paper, we detail the dataset creation and composition of UNER; we also provide initial modeling baselines on both in-language and cross-lingual learning settings. We release the data, code, and fitted models to the public.

Results

Task | Dataset | Metric | Value | Model
Named Entity Recognition (NER) | UNER v1 (Croatian) | F1 (micro) | 93.6 | UNER XLM-R
Named Entity Recognition (NER) | UNER v1 (Portuguese) | F1 (micro) | 90.4 | UNER XLM-R
Named Entity Recognition (NER) | UNER v1 (English) | F1 (micro) | 86 | UNER XLM-R
Named Entity Recognition (NER) | UNER v1 - PUD (Swedish) | F1 (micro) | 82.2 | UNER XLM-R
Named Entity Recognition (NER) | UNER v1 - PUD (English) | F1 (micro) | 80.1 | UNER XLM-R
Named Entity Recognition (NER) | UNER v1 (Chinese) | F1 (micro) | 89.5 | UNER XLM-R
Named Entity Recognition (NER) | UNER v1 (Serbian) | F1 (micro) | 94.7 | UNER XLM-R
Named Entity Recognition (NER) | UNER v1 (Danish) | F1 (micro) | 82.7 | UNER XLM-R
Named Entity Recognition (NER) | UNER v1 (Swedish) | F1 (micro) | 88.3 | UNER XLM-R
Named Entity Recognition (NER) | UNER v1 - PUD (Chinese) | F1 (micro) | 87.1 | UNER XLM-R
Named Entity Recognition (NER) | UNER v1 (Chinese Simplified) | F1 (micro) | 89.4 | UNER XLM-R
Named Entity Recognition (NER) | UNER v1 (Slovak) | F1 (micro) | 85.5 | UNER XLM-R
Named Entity Recognition (NER) | UNER v1 - PUD (Portuguese) | F1 (micro) | 88.8 | UNER XLM-R
Cross-Lingual | UNER v1 - PUD (German) | F1 (micro) | 78.9 | UNER XLM-R (all)
Cross-Lingual | UNER v1 (Serbian) | F1 (micro) | 95.2 | UNER XLM-R (all)
Cross-Lingual | UNER v1 - PUD (English) | F1 (micro) | 79.5 | UNER XLM-R (all)
Cross-Lingual | UNER v1 (Danish) | F1 (micro) | 83 | UNER XLM-R (all)
Cross-Lingual | UNER v1 - PUD (Swedish) | F1 (micro) | 85.3 | UNER XLM-R (all)
Cross-Lingual | UNER v1 (Cebuano) | F1 (micro) | 69.6 | UNER XLM-R (all)
Cross-Lingual | UNER v1 (Croatian) | F1 (micro) | 90.9 | UNER XLM-R (all)
Cross-Lingual | UNER v1 (Tagalog U) | F1 (micro) | 63.8 | UNER XLM-R (all)
Cross-Lingual | UNER v1 - PUD (Portuguese) | F1 (micro) | 85.1 | UNER XLM-R (all)
Cross-Lingual | UNER v1 - PUD (Russian) | F1 (micro) | 70.6 | UNER XLM-R (all)
Cross-Lingual | UNER v1 (Slovak) | F1 (micro) | 81.6 | UNER XLM-R (all)
Cross-Lingual | UNER v1 (Tagalog T) | F1 (micro) | 91.3 | UNER XLM-R (all)
Cross-Lingual | UNER v1 (Chinese Simplified) | F1 (micro) | 87.7 | UNER XLM-R (all)
Cross-Lingual | UNER v1 (Portuguese) | F1 (micro) | 82.3 | UNER XLM-R (all)
Cross-Lingual | UNER v1 (English) | F1 (micro) | 82.8 | UNER XLM-R (all)
Cross-Lingual | UNER v1 (Chinese) | F1 (micro) | 88.2 | UNER XLM-R (all)
Cross-Lingual | UNER v1 (Swedish) | F1 (micro) | 88.2 | UNER XLM-R (all)
Cross-Lingual | UNER v1 - PUD (Chinese) | F1 (micro) | 86 | UNER XLM-R (all)
Cross-Lingual Transfer | UNER v1 - PUD (German) | F1 (micro) | 78.9 | UNER XLM-R (all)
Cross-Lingual Transfer | UNER v1 (Serbian) | F1 (micro) | 95.2 | UNER XLM-R (all)
Cross-Lingual Transfer | UNER v1 - PUD (English) | F1 (micro) | 79.5 | UNER XLM-R (all)
Cross-Lingual Transfer | UNER v1 (Danish) | F1 (micro) | 83 | UNER XLM-R (all)
Cross-Lingual Transfer | UNER v1 - PUD (Swedish) | F1 (micro) | 85.3 | UNER XLM-R (all)
Cross-Lingual Transfer | UNER v1 (Cebuano) | F1 (micro) | 69.6 | UNER XLM-R (all)
Cross-Lingual Transfer | UNER v1 (Croatian) | F1 (micro) | 90.9 | UNER XLM-R (all)
Cross-Lingual Transfer | UNER v1 (Tagalog U) | F1 (micro) | 63.8 | UNER XLM-R (all)
Cross-Lingual Transfer | UNER v1 - PUD (Portuguese) | F1 (micro) | 85.1 | UNER XLM-R (all)
Cross-Lingual Transfer | UNER v1 - PUD (Russian) | F1 (micro) | 70.6 | UNER XLM-R (all)
Cross-Lingual Transfer | UNER v1 (Slovak) | F1 (micro) | 81.6 | UNER XLM-R (all)
Cross-Lingual Transfer | UNER v1 (Tagalog T) | F1 (micro) | 91.3 | UNER XLM-R (all)
Cross-Lingual Transfer | UNER v1 (Chinese Simplified) | F1 (micro) | 87.7 | UNER XLM-R (all)
Cross-Lingual Transfer | UNER v1 (Portuguese) | F1 (micro) | 82.3 | UNER XLM-R (all)
Cross-Lingual Transfer | UNER v1 (English) | F1 (micro) | 82.8 | UNER XLM-R (all)
Cross-Lingual Transfer | UNER v1 (Chinese) | F1 (micro) | 88.2 | UNER XLM-R (all)
Cross-Lingual Transfer | UNER v1 (Swedish) | F1 (micro) | 88.2 | UNER XLM-R (all)
Cross-Lingual Transfer | UNER v1 - PUD (Chinese) | F1 (micro) | 86 | UNER XLM-R (all)
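
Every row above reports micro-averaged F1. This page does not say how the score is computed, so the following is only a minimal sketch of one common convention: span-level exact match over BIO tags, with true positives, false positives, and false negatives pooled across all sentences and entity types before precision and recall are taken. The function names `extract_spans` and `micro_f1`, the handling of stray `I-` tags, and the toy tag sequences are assumptions for illustration, not the authors' evaluation script.

```python
def extract_spans(tags):
    """Return the set of (start, end, type) entity spans in one BIO tag sequence."""
    spans = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if continues:
            continue
        if etype is not None:
            spans.add((start, i, etype))
        start, etype = None, None
        if tag.startswith("B-") or tag.startswith("I-"):
            # Assumption: a stray I- tag opens a new span instead of being dropped.
            start, etype = i, tag[2:]
    return spans


def micro_f1(gold_sentences, pred_sentences):
    """Micro-averaged F1: pool span counts over all sentences, then compute P, R, F1."""
    tp = fp = fn = 0
    for gold, pred in zip(gold_sentences, pred_sentences):
        g, p = extract_spans(gold), extract_spans(pred)
        tp += len(g & p)   # exact boundary and type match
        fp += len(p - g)   # predicted spans not in gold
        fn += len(g - p)   # gold spans that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical, not taken from UNER).
gold = [["B-PER", "I-PER", "O", "B-LOC"], ["B-ORG", "O"]]
pred = [["B-PER", "I-PER", "O", "O"], ["B-LOC", "O"]]
print(round(micro_f1(gold, pred), 3))
```

In the toy data, one of two predicted spans is correct and one of three gold spans is found, so precision is 1/2 and recall is 1/3. Because counts are pooled before averaging, frequent entity types weigh more heavily than rare ones, which is what distinguishes micro from macro F1.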

Related Papers

Flippi: End To End GenAI Assistant for E-Commerce (2025-07-08)
Selecting and Merging: Towards Adaptable and Scalable Named Entity Recognition with Large Language Models (2025-06-28)
Improving Named Entity Transcription with Contextual LLM-based Revision (2025-06-12)
Better Semi-supervised Learning for Multi-domain ASR Through Incremental Retraining and Data Filtering (2025-06-05)
Dissecting Bias in LLMs: A Mechanistic Interpretability Perspective (2025-06-05)
Efficient Data Selection for Domain Adaptation of ASR Using Pseudo-Labels and Multi-Stage Filtering (2025-06-04)
EL4NER: Ensemble Learning for Named Entity Recognition via Multiple Small-Parameter Large Language Models (2025-05-29)
Label-Guided In-Context Learning for Named Entity Recognition (2025-05-29)