How does BERT capture semantics? A closer look at polysemous words
David Yenicelik, Florian Schmidt, Yannic Kilcher
2020-11-01 · EMNLP (BlackboxNLP) 2020
Topics: Word Similarity, Semantic Similarity, Word Embeddings, Word Sense Induction, Semanticity Prediction, Word Sense Disambiguation
Abstract
The recent paradigm shift to contextual word embeddings has seen tremendous success across a wide range of downstream tasks. However, little is known about how the emergent relation of context and semantics manifests geometrically. We investigate polysemous words as one particularly prominent instance of semantic organization. Our rigorous quantitative analysis of linear separability and cluster organization in embedding vectors produced by BERT shows that semantics do not surface as isolated clusters but form seamless structures, tightly coupled with sentiment and syntax.
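To make the kind of analysis the abstract describes concrete, here is a minimal sketch (not the authors' released code) of a linear-separability probe on BERT's contextual embeddings of a polysemous word. The probe word "bank", the example sentences, the choice of bert-base-uncased, and the logistic-regression probe are all illustrative assumptions, not details taken from the paper.

```python
# A minimal sketch of a linear-separability probe: extract BERT's contextual
# embeddings of a polysemous word and test whether its senses are linearly
# separable. Requires `transformers`, `torch`, and `scikit-learn`.
import torch
from transformers import BertTokenizer, BertModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

# Illustrative sentences using "bank" in two senses:
# financial institution (label 0) vs. riverside (label 1).
sentences = [
    ("She deposited the check at the bank.", 0),
    ("The bank approved our mortgage application.", 0),
    ("He works as a teller at the bank downtown.", 0),
    ("The bank raised its interest rates again.", 0),
    ("We had a picnic on the bank of the river.", 1),
    ("Reeds grew thickly along the muddy bank.", 1),
    ("The canoe drifted toward the opposite bank.", 1),
    ("Fishermen lined the bank at dawn.", 1),
]

def embed_word(sentence, word="bank"):
    """Return the contextual embedding of `word` from BERT's last layer."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (seq_len, 768)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    idx = tokens.index(word)  # position of the probe word
    return hidden[idx].numpy()

X = [embed_word(s) for s, _ in sentences]
y = [label for _, label in sentences]

# A linear probe: high cross-validated accuracy indicates that the two
# senses occupy linearly separable regions of the embedding space.
clf = LogisticRegression(max_iter=1000)
scores = cross_val_score(clf, X, y, cv=4)
print(f"linear separability (CV accuracy): {scores.mean():.2f}")
```

Coarse sense pairs such as this one tend to separate well; the paper's point is that across finer-grained senses the embeddings form seamless, interleaved structures rather than isolated clusters, which probes of this kind can be used to quantify.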