Nils Reimers, Iryna Gurevych
BERT (Devlin et al., 2018) and RoBERTa (Liu et al., 2019) have set a new state-of-the-art performance on sentence-pair regression tasks like semantic textual similarity (STS). However, both require that the two sentences are fed into the network together, which causes a massive computational overhead: finding the most similar pair in a collection of 10,000 sentences requires about 50 million inference computations (~65 hours) with BERT. The construction of BERT makes it unsuitable for semantic similarity search as well as for unsupervised tasks like clustering. In this publication, we present Sentence-BERT (SBERT), a modification of the pretrained BERT network that uses siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine similarity. This reduces the effort for finding the most similar pair from 65 hours with BERT / RoBERTa to about 5 seconds with SBERT, while maintaining the accuracy of BERT. We evaluate SBERT and SRoBERTa on common STS tasks and transfer learning tasks, where they outperform other state-of-the-art sentence embedding methods.
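The quadratic cost quoted above comes from scoring every pair: with n = 10,000 sentences, a BERT cross-encoder must run n·(n−1)/2 = 49,995,000 forward passes, whereas SBERT encodes each sentence once and compares the resulting vectors with cheap cosine similarity. Below is a minimal sketch of that search, assuming the `sentence-transformers` package released with the paper and its pretrained `bert-base-nli-mean-tokens` SBERT model; the example sentences are illustrative.

```python
# Sketch: cosine-similarity search over SBERT embeddings.
# Assumes the `sentence-transformers` package that accompanies the paper.
import numpy as np
from sentence_transformers import SentenceTransformer

sentences = [
    "A man is eating food.",
    "A man is eating a piece of bread.",
    "The girl is carrying a baby.",
]

model = SentenceTransformer("bert-base-nli-mean-tokens")
embeddings = model.encode(sentences)  # one forward pass per sentence

# Normalize rows so that dot products equal cosine similarities.
normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
sims = normed @ normed.T  # all pairwise cosine similarities at once

# Most similar distinct pair: mask the diagonal, take the argmax.
np.fill_diagonal(sims, -1.0)
i, j = np.unravel_index(np.argmax(sims), sims.shape)
print(f"Most similar pair: {sentences[i]!r} / {sentences[j]!r} (cos = {sims[i, j]:.3f})")
```

The key design point is that the expensive transformer runs only n times; the O(n²) pair comparison collapses into a single matrix multiplication over fixed-size vectors, which is what turns 65 hours into seconds.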
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Semantic Textual Similarity | STS12 | Spearman Correlation | 0.7453 | SRoBERTa-NLI-large |
| Semantic Textual Similarity | STS13 | Spearman Correlation | 0.7846 | SBERT-NLI-large |
| Semantic Textual Similarity | STS14 | Spearman Correlation | 0.7490 | SBERT-NLI-large |
| Semantic Textual Similarity | STS15 | Spearman Correlation | 0.8185 | SRoBERTa-NLI-large |
| Semantic Textual Similarity | STS16 | Spearman Correlation | 0.7682 | SRoBERTa-NLI-large |
| Semantic Textual Similarity | STS Benchmark | Spearman Correlation | 0.8615 | SRoBERTa-NLI-STSb-large |
| Semantic Textual Similarity | STS Benchmark | Spearman Correlation | 0.8479 | SBERT-STSb-base |
| Semantic Textual Similarity | STS Benchmark | Spearman Correlation | 0.8445 | SBERT-STSb-large |
| Semantic Textual Similarity | STS Benchmark | Spearman Correlation | 0.7900 | SBERT-NLI-large |
| Semantic Textual Similarity | STS Benchmark | Spearman Correlation | 0.7777 | SRoBERTa-NLI-base |
| Semantic Textual Similarity | STS Benchmark | Spearman Correlation | 0.7703 | SBERT-NLI-base |
| Semantic Textual Similarity | SICK | Spearman Correlation | 0.7462 | SentenceBERT |
| Semantic Textual Similarity | SICK | Spearman Correlation | 0.7446 | SRoBERTa-NLI-base |
| Semantic Textual Similarity | SICK | Spearman Correlation | 0.7429 | SRoBERTa-NLI-large |
| Semantic Textual Similarity | SICK | Spearman Correlation | 0.7375 | SBERT-NLI-large |
| Semantic Textual Similarity | SICK | Spearman Correlation | 0.7291 | SBERT-NLI-base |
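The Spearman correlations above come from comparing the cosine similarity of the two sentence embeddings against the gold human rating for every pair in a benchmark. Here is a minimal sketch of that evaluation protocol; the sentence pairs and gold labels are illustrative placeholders, whereas the real benchmarks (STS12-16, STS Benchmark, SICK) supply thousands of annotated pairs.

```python
# Sketch: Spearman correlation between embedding cosine similarity
# and human STS ratings, the metric reported in the table above.
import numpy as np
from scipy.stats import spearmanr
from sentence_transformers import SentenceTransformer

pairs = [
    ("A plane is taking off.", "An air plane is taking off."),
    ("A man is playing a flute.", "A man is playing a guitar."),
    ("A woman is slicing an onion.", "Someone is boiling water."),
]
gold = [5.0, 2.0, 0.5]  # placeholder human similarity ratings (0-5 scale)

model = SentenceTransformer("bert-base-nli-mean-tokens")
a = model.encode([p[0] for p in pairs])
b = model.encode([p[1] for p in pairs])

# Cosine similarity for each sentence pair.
cos = np.sum(a * b, axis=1) / (np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))

# Rank correlation between predicted similarities and gold labels.
rho, _ = spearmanr(cos, gold)
print(f"Spearman rho: {rho:.4f}")
```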