Towards better substitution-based word sense induction

Asaf Amrami, Yoav Goldberg

2019-05-29 | Clustering, Word Sense Induction

Abstract

Word sense induction (WSI) is the task of unsupervised clustering of word usages within a sentence to distinguish senses. Recent work obtains strong results by clustering lexical substitutes derived from pre-trained RNN language models (ELMo). Adapting the method to BERT improves the scores even further. We extend the previous method to support a dynamic rather than a fixed number of clusters, as supported by other prominent methods, and propose a method for interpreting the resulting clusters by associating them with their most informative substitutes. We then perform extensive error analysis revealing the remaining sources of errors in the WSI task. Our code is available at https://github.com/asafamr/bertwsi.
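The pipeline described in the abstract (cluster usages by their substitutes, let the number of clusters vary, then label each cluster with its most informative substitutes) can be illustrated with a small toy sketch. This is not the authors' implementation, which lives in the repository linked above. Everything here is an assumption made for illustration: the substitute lists are hand-written rather than produced by BERT, the `threshold` value is arbitrary, and the scoring in `top_substitutes` (in-cluster frequency minus out-of-cluster frequency) is only one plausible way to rank informativeness.

```python
from collections import Counter
from itertools import combinations
import math

# Hypothetical toy data: substitutes a language model might propose
# for the target word "bank" in four different usages.
usages = {
    "u1": ["lender", "institution", "firm", "lender"],
    "u2": ["institution", "lender", "company"],
    "u3": ["shore", "riverside", "edge"],
    "u4": ["shore", "edge", "side"],
}

def cosine_distance(a, b):
    """Cosine distance between two bag-of-substitutes Counters."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return 1.0 - dot / norm if norm else 1.0

def cluster(usages, threshold=0.7):
    """Average-linkage agglomerative clustering.

    Merging stops once the closest pair of clusters is farther apart
    than `threshold`, so the number of clusters is not fixed in advance.
    """
    vectors = {u: Counter(subs) for u, subs in usages.items()}
    clusters = [[u] for u in usages]

    def linkage(c1, c2):
        total = sum(cosine_distance(vectors[a], vectors[b])
                    for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        if linkage(clusters[i], clusters[j]) > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return clusters

def top_substitutes(cluster_ids, usages, k=2):
    """Rank substitutes by in-cluster minus out-of-cluster relative frequency."""
    inside = Counter(s for u in cluster_ids for s in usages[u])
    outside = Counter(s for u in usages if u not in cluster_ids
                      for s in usages[u])
    n_in = sum(inside.values())
    n_out = sum(outside.values()) or 1
    score = {s: inside[s] / n_in - outside[s] / n_out for s in inside}
    return sorted(score, key=lambda s: (-score[s], s))[:k]

clusters = cluster(usages)
labels = [top_substitutes(c, usages) for c in clusters]
for members, label in zip(clusters, labels):
    print(members, "->", label)
```

On this toy input the financial and riverside usages end up in separate clusters without the cluster count being specified. Raising or lowering `threshold` changes how readily clusters merge.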

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Word Sense Disambiguation | SemEval 2010 WSI | AVG | 53.6 | BERT+DP |
| Word Sense Disambiguation | SemEval 2010 WSI | F-Score | 71.3 | BERT+DP |
| Word Sense Disambiguation | SemEval 2010 WSI | V-Measure | 40.4 | BERT+DP |
| Word Sense Induction | SemEval 2010 WSI | AVG | 53.6 | BERT+DP |
| Word Sense Induction | SemEval 2010 WSI | F-Score | 71.3 | BERT+DP |
| Word Sense Induction | SemEval 2010 WSI | V-Measure | 40.4 | BERT+DP |
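
The page does not define the AVG metric. Assuming it is the geometric mean of F-Score and V-Measure, a convention commonly used when reporting SemEval 2010 WSI results, the listed values are roughly consistent with one another. The short check below is a sketch under that assumption. The small gap from the reported 53.6 would then be explained by the inputs already being rounded.

```python
import math

# Values as listed in the results for BERT+DP on SemEval 2010 WSI.
f_score = 71.3
v_measure = 40.4
reported_avg = 53.6

# Assumption: AVG = sqrt(F-Score * V-Measure), i.e. the geometric mean.
avg = math.sqrt(f_score * v_measure)
print(f"geometric mean: {avg:.2f}, reported AVG: {reported_avg}")
```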

Related Papers

Tri-Learn Graph Fusion Network for Attributed Graph Clustering (2025-07-18)
Ranking Vectors Clustering: Theory and Applications (2025-07-16)
Car Object Counting and Position Estimation via Extension of the CLIP-EBC Framework (2025-07-11)
GNN-ViTCap: GNN-Enhanced Multiple Instance Learning with Vision Transformers for Whole Slide Image Classification and Captioning (2025-07-09)
Consistency and Inconsistency in $K$-Means Clustering (2025-07-08)
MC-INR: Efficient Encoding of Multivariate Scientific Simulation Data using Meta-Learning and Clustered Implicit Neural Representations (2025-07-03)
Supercm: Revisiting Clustering for Semi-Supervised Learning (2025-06-30)
Temporal Rate Reduction Clustering for Human Motion Segmentation (2025-06-26)