Boosting Entity Linking Performance by Leveraging Unlabeled Documents

Phong Le, Ivan Titov

Published: 2019-06-04 · Venue: ACL 2019 · Tasks: Entity Linking, Entity Disambiguation

Abstract

Modern entity linking systems rely on large collections of documents specifically annotated for the task (e.g., AIDA CoNLL). In contrast, we propose an approach that exploits only naturally occurring information: unlabeled documents and Wikipedia. Our approach consists of two stages. First, we construct a high-recall list of candidate entities for each mention in an unlabeled document. Second, we use the candidate lists as weak supervision to constrain our document-level entity linking model. The model treats entities as latent variables and, when estimated on a collection of unlabeled texts, learns to choose entities relying both on the local context of each mention and on coherence with other entities in the document. The resulting approach rivals fully supervised state-of-the-art systems on standard test sets. It also approaches their performance in a very challenging setting: when tested on a test set sampled from the data used to estimate the supervised systems. By comparing to Wikipedia-only training of our model, we demonstrate that modeling unlabeled documents is beneficial.
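
The second stage described in the abstract, using a candidate list as weak supervision, can be illustrated with a small sketch. This is not the authors' implementation: the function name `weak_supervision_loss`, the hinge form of the loss, the `margin` parameter, and the toy scores are all assumptions made for illustration. The idea shown is only that no single gold entity is given; the model is asked to rank its best-scoring entity inside the candidate list above every entity outside it.

```python
from typing import Dict, Sequence


def weak_supervision_loss(
    scores: Dict[str, float],
    candidates: Sequence[str],
    margin: float = 1.0,
) -> float:
    """Hypothetical hinge loss for one mention.

    `scores` maps every considered entity to a model score (assumed to
    already combine local context and document coherence). `candidates`
    is the high-recall list; the true entity is treated as latent and
    is only assumed to be somewhere in that list.
    """
    candidate_set = set(candidates)
    positives = [s for e, s in scores.items() if e in candidate_set]
    negatives = [s for e, s in scores.items() if e not in candidate_set]
    if not positives or not negatives:
        return 0.0  # nothing to contrast for this mention
    best_positive = max(positives)
    # Penalise each out-of-list entity that scores within `margin`
    # of the best in-list entity.
    return sum(max(0.0, margin + neg - best_positive) for neg in negatives)


# Toy example: "A" and "B" are in the candidate list, "C" and "D" are not.
toy_scores = {"A": 2.0, "B": 0.5, "C": 1.5, "D": 0.0}
toy_loss = weak_supervision_loss(toy_scores, ["A", "B"], margin=1.0)
print(toy_loss)
```

In this toy case only "C" violates the margin, so it alone contributes to the loss; "D" is already ranked far enough below the best candidate.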

Results

Task | Dataset | Metric | Value | Model
Entity Disambiguation | AIDA-CoNLL | In-KB Accuracy | 89.66 | Le & Titov (2019)
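
For readers unfamiliar with the metric, in-KB accuracy is usually computed only over mentions whose gold entity exists in the knowledge base. The sketch below assumes that convention; the function name `in_kb_accuracy`, the `"NIL"` marker for out-of-KB mentions, and the sample labels are hypothetical and are not taken from the paper or its evaluation code.

```python
from typing import Sequence


def in_kb_accuracy(
    gold: Sequence[str],
    predicted: Sequence[str],
    nil_label: str = "NIL",
) -> float:
    """Percentage of in-KB mentions linked to the correct entity.

    Mentions whose gold label equals `nil_label` (assumed marker for
    entities missing from the knowledge base) are skipped entirely.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = [(g, p) for g, p in zip(gold, predicted) if g != nil_label]
    if not pairs:
        return 0.0
    correct = sum(1 for g, p in pairs if g == p)
    return 100.0 * correct / len(pairs)


# Hypothetical labels: five mentions, one of which is out of the KB.
gold_labels = ["Q1", "Q2", "NIL", "Q3", "Q4"]
predictions = ["Q1", "Q2", "Q7", "Q9", "Q4"]
accuracy = in_kb_accuracy(gold_labels, predictions)
print(accuracy)
```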

Related Papers

AI's Blind Spots: Geographic Knowledge and Diversity Deficit in Generated Urban Scenario (2025-06-20)
LEMONADE: A Large Multilingual Expert-Annotated Abstractive Event Dataset for the Real World (2025-06-01)
Verify-in-the-Graph: Entity Disambiguation Enhancement for Complex Claim Verification with Interactive Graph Representation (2025-05-29)
Distilling Closed-Source LLM's Knowledge for Locally Stable and Economic Biomedical Entity Linking (2025-05-26)
RoleRAG: Enhancing LLM Role-Playing via Graph Guided Retrieval (2025-05-24)
Evaluating Design Decisions for Dual Encoder-based Entity Disambiguation (2025-05-16)
A Grounded Memory System For Smart Personal Assistants (2025-05-09)
Evaluation of LLMs on Long-tail Entity Linking in Historical Documents (2025-05-06)