
AutoSense Model for Word Sense Induction

Reinald Kim Amplayo, Seung-won Hwang, Min Song

2018-11-22 · Word Sense Induction

Abstract

Word sense induction (WSI), the task of automatically discovering the multiple senses or meanings of a word, has three main challenges: domain adaptability, novel sense detection, and sense granularity flexibility. While current latent variable models are known to solve the first two challenges, they are not flexible to different word sense granularities, which vary widely among words, from aardvark with one sense to play with over 50 senses. Current models require either hyperparameter tuning or nonparametric induction of the number of senses, both of which we find to be ineffective. Thus, we aim to eliminate these requirements and solve the sense granularity problem by proposing AutoSense, a latent variable model based on two observations: (1) senses are represented as a distribution over topics, and (2) senses generate pairings between the target word and its neighboring word. These observations alleviate the problem by (a) throwing away garbage senses and (b) additionally inducing fine-grained word senses. Results show great improvements over the state-of-the-art models on popular WSI datasets. We also show that AutoSense is able to learn the appropriate sense granularity of a word. Finally, we apply AutoSense to the unsupervised author name disambiguation task, where the sense granularity problem is more evident, and show that AutoSense clearly outperforms competing models. We share our data and code here: https://github.com/rktamplayo/AutoSense.

Results

Task | Dataset | Metric | Value | Model
--- | --- | --- | --- | ---
Word Sense Disambiguation | SemEval 2010 WSI | AVG | 24.59 | AutoSense
Word Sense Disambiguation | SemEval 2010 WSI | F-Score | 61.7 | AutoSense
Word Sense Disambiguation | SemEval 2010 WSI | V-Measure | 9.8 | AutoSense
Word Sense Induction | SemEval 2010 WSI | AVG | 24.59 | AutoSense
Word Sense Induction | SemEval 2010 WSI | F-Score | 61.7 | AutoSense
Word Sense Induction | SemEval 2010 WSI | V-Measure | 9.8 | AutoSense
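
The three metrics in the table are related: the reported AVG is consistent with the geometric mean of F-Score and V-Measure, since sqrt(61.7 × 9.8) ≈ 24.59. The following is a minimal sketch of how these scores could be computed for the instances of a single target word, using the standard definitions of paired F-score and V-measure. It is not the official SemEval 2010 scorer or the authors' evaluation code, and the function names (`paired_f_score`, `v_measure`, `avg_score`) and the toy labels are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def paired_f_score(gold, pred):
    """F1 over instance pairs that share a label (standard paired F-score)."""
    idx = range(len(gold))
    gold_pairs = {(i, j) for i, j in combinations(idx, 2) if gold[i] == gold[j]}
    pred_pairs = {(i, j) for i, j in combinations(idx, 2) if pred[i] == pred[j]}
    if not gold_pairs or not pred_pairs:
        return 0.0
    common = len(gold_pairs & pred_pairs)
    precision = common / len(pred_pairs)
    recall = common / len(gold_pairs)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def _entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def _conditional_entropy(labels, given):
    """H(labels | given); both sequences are aligned by instance."""
    n = len(labels)
    joint = Counter(zip(given, labels))
    marginal = Counter(given)
    return -sum((c / n) * math.log(c / marginal[g]) for (g, _), c in joint.items())


def v_measure(gold, pred):
    """Harmonic mean of homogeneity and completeness."""
    h_gold, h_pred = _entropy(gold), _entropy(pred)
    homogeneity = 1.0 if h_gold == 0 else 1.0 - _conditional_entropy(gold, pred) / h_gold
    completeness = 1.0 if h_pred == 0 else 1.0 - _conditional_entropy(pred, gold) / h_pred
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


def avg_score(f_score, v_meas):
    """Geometric mean of the two metrics (assumed definition of AVG)."""
    return math.sqrt(f_score * v_meas)


if __name__ == "__main__":
    # Toy example (made-up labels): four instances of one target word.
    gold = [0, 0, 1, 1]   # gold-standard senses
    pred = [0, 0, 1, 2]   # induced senses; the second gold sense is over-split
    print(paired_f_score(gold, pred), v_measure(gold, pred))
    # Values from the table: rounds to 24.59, matching the AVG row.
    print(round(avg_score(61.7, 9.8), 2))
```

The toy example shows why the two metrics are usually reported together: over-splitting a sense keeps clusters pure, which V-measure rewards, while it loses same-sense pairs, which lowers paired F-score. Taking the geometric mean penalizes a model that does well on only one of them.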

Related Papers

To Word Senses and Beyond: Inducing Concepts with Contextualized Language Models (2024-06-28)
Multilingual Substitution-based Word Sense Induction (2024-05-17)
The LSCD Benchmark: a Testbed for Diachronic Word Meaning Tasks (2024-03-29)
A Systematic Comparison of Contextualized Word Embeddings for Lexical Semantic Change (2024-02-19)
Word Sense Induction with Knowledge Distillation from BERT (2023-04-20)
Words as Gatekeepers: Measuring Discipline-specific Terms and Meanings in Scholarly Publications (2022-12-19)
Word Sense Induction with Hierarchical Clustering and Mutual Information Maximization (2022-10-11)
RuDSI: graph-based word sense induction dataset for Russian (2022-09-28)