Word Usage Similarity Estimation with Sentence Representations and Automatic Substitutes
Aina Garí Soler, Marianna Apidianaki, Alexandre Allauzen
Abstract
Usage similarity estimation addresses the semantic proximity of word instances in different contexts. We apply contextualized (ELMo and BERT) word and sentence embeddings to this task, and propose supervised models that leverage these representations for prediction. Our models are further assisted by lexical substitute annotations automatically assigned to word instances by context2vec, a neural model that relies on a bidirectional LSTM. We perform an extensive comparison of existing word and sentence representations on benchmark datasets addressing both graded and binary similarity. The best performing models outperform previous methods in both settings.
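The core idea behind the unsupervised baseline in this line of work can be sketched as follows: extract a contextualized vector for the target word in each of its two sentential contexts, then score their proximity with cosine similarity (higher cosine ≈ more similar usage). This is a minimal illustration, not the paper's full supervised model; the vectors below are hypothetical stand-ins for ELMo/BERT embeddings.

```python
import numpy as np

def usage_similarity(vec_a: np.ndarray, vec_b: np.ndarray) -> float:
    """Cosine similarity between two contextualized vectors of the same
    target word in two different sentences (graded-similarity setting)."""
    return float(np.dot(vec_a, vec_b) /
                 (np.linalg.norm(vec_a) * np.linalg.norm(vec_b)))

# Hypothetical toy vectors for the word "bank" in three contexts
# (in practice these come from a contextualized encoder such as BERT).
v_river_1 = np.array([0.2, 0.9, 0.1])   # "sat on the bank of the river"
v_river_2 = np.array([0.3, 0.8, 0.0])   # "fished from the grassy bank"
v_money   = np.array([-0.9, 0.1, 0.4])  # "deposited money at the bank"

# Same-sense contexts should score higher than cross-sense contexts.
print(usage_similarity(v_river_1, v_river_2))
print(usage_similarity(v_river_1, v_money))
```

For the binary setting, a threshold on the cosine score turns this graded prediction into a same-usage/different-usage decision.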