Scaling Sentence Embeddings with Large Language Models

Ting Jiang, Shaohan Huang, Zhongzhi Luan, Deqing Wang, Fuzhen Zhuang

2023-07-31

Sentence Embedding, Sentence Embeddings, Semantic Textual Similarity, Contrastive Learning, STS

Abstract

Large language models (LLMs) have recently garnered significant interest. With in-context learning, LLMs achieve impressive results on various natural language tasks. However, the application of LLMs to sentence embeddings remains an area of ongoing research. In this work, we propose an in-context learning-based method aimed at improving sentence embedding performance. Our approach involves adapting the previous prompt-based representation method for autoregressive models, constructing a demonstration set that enables LLMs to perform in-context learning, and scaling up the LLMs to different model sizes. Extensive experiments show that in-context learning enables LLMs to generate high-quality sentence embeddings without any fine-tuning, and that it helps LLMs achieve performance comparable to current contrastive learning methods. When scaling model size, we find that scaling beyond tens of billions of parameters harms performance on semantic textual similarity (STS) tasks. However, the largest model outperforms its smaller counterparts and achieves a new state-of-the-art result on transfer tasks. We also fine-tune LLMs with a current contrastive learning approach; the 2.7B OPT model, incorporating our prompt-based method, surpasses the performance of the 4.8B ST5 and achieves new state-of-the-art results on STS tasks. Our code is available at https://github.com/kongds/scaling_sentemb.
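
As a rough illustration of the prompt-based representation and the in-context demonstration described above, the following sketch builds the text that would be fed to an autoregressive model. The template wording, the function name `build_prompt`, and the demonstration pair are assumptions made for illustration and are not quoted from this page; the authors' repository is the reference for the actual prompts.

```python
from typing import Optional, Tuple

# Hypothetical template: the wording is an assumption, not taken from this page.
# The prompt ends with an open quote so the model's next token would be the
# "one word" that summarizes the sentence.
TEMPLATE = 'This sentence : "{sentence}" means in one word: "'


def build_prompt(sentence: str, demo: Optional[Tuple[str, str]] = None) -> str:
    """Wrap a sentence in the template, optionally preceded by one demonstration."""
    prompt = TEMPLATE.format(sentence=sentence)
    if demo is None:
        return prompt
    demo_sentence, demo_word = demo
    # The demonstration reuses the template with its one-word answer filled in.
    demo_text = TEMPLATE.format(sentence=demo_sentence) + demo_word + '". '
    return demo_text + prompt


if __name__ == "__main__":
    # Illustrative inputs only; the demonstration pair is made up.
    print(build_prompt("A dog runs across the field."))
    print(build_prompt("A dog runs across the field.",
                       demo=("The cat is sleeping on the sofa.", "Rest")))
```

In a setup like this, the sentence embedding would presumably be read from the model's hidden state at the final prompt token, and no fine-tuning is needed for the in-context variant; that extraction step is omitted here because it depends on the model library in use.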

Results

Task	Dataset	Metric	Value	Model
Semantic Textual Similarity	STS14	Spearman Correlation	0.8585	PromptEOL+CSE+LLaMA-30B
Semantic Textual Similarity	STS14	Spearman Correlation	0.8534	PromptEOL+CSE+OPT-13B
Semantic Textual Similarity	STS14	Spearman Correlation	0.848	PromptEOL+CSE+OPT-2.7B
Semantic Textual Similarity	STS15	Spearman Correlation	0.9004	PromptEOL+CSE+LLaMA-30B
Semantic Textual Similarity	STS15	Spearman Correlation	0.8952	PromptEOL+CSE+OPT-13B
Semantic Textual Similarity	STS15	Spearman Correlation	0.8951	PromptEOL+CSE+OPT-2.7B
Semantic Textual Similarity	SICK	Spearman Correlation	0.8238	PromptEOL+CSE+LLaMA-30B
Semantic Textual Similarity	SICK	Spearman Correlation	0.8206	PromptEOL+CSE+OPT-13B
Semantic Textual Similarity	SICK	Spearman Correlation	0.8129	PromptEOL+CSE+OPT-2.7B
Semantic Textual Similarity	STS13	Spearman Correlation	0.9025	PromptEOL+CSE+LLaMA-30B
Semantic Textual Similarity	STS13	Spearman Correlation	0.9024	PromptEOL+CSE+OPT-13B
Semantic Textual Similarity	STS13	Spearman Correlation	0.8964	PromptEOL+CSE+OPT-2.7B
Semantic Textual Similarity	STS Benchmark	Spearman Correlation	0.8914	PromptEOL+CSE+LLaMA-30B
Semantic Textual Similarity	STS Benchmark	Spearman Correlation	0.8856	PromptEOL+CSE+OPT-13B
Semantic Textual Similarity	STS Benchmark	Spearman Correlation	0.8833	PromptEOL+CSE+OPT-2.7B
Semantic Textual Similarity	STS12	Spearman Correlation	0.802	PromptEOL+CSE+OPT-13B
Semantic Textual Similarity	STS12	Spearman Correlation	0.7972	PromptEOL+CSE+LLaMA-30B
Semantic Textual Similarity	STS12	Spearman Correlation	0.7949	PromptEOL+CSE+OPT-2.7B
Semantic Textual Similarity	STS16	Spearman Correlation	0.8627	PromptEOL+CSE+LLaMA-30B
Semantic Textual Similarity	STS16	Spearman Correlation	0.8591	PromptEOL+CSE+OPT-2.7B
Semantic Textual Similarity	STS16	Spearman Correlation	0.859	PromptEOL+CSE+OPT-13B
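
The metric in the table above is Spearman correlation, the Pearson correlation between the ranks of the predicted scores and the ranks of the human similarity judgments. The sketch below computes it in plain Python. It assumes predictions are cosine similarities between the two sentence embeddings of each pair, which is the usual STS convention but is not stated on this page; the function names and the toy numbers are illustrative and are not results from the paper.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


if __name__ == "__main__":
    # Toy embeddings and gold scores, made up for illustration.
    pairs = [([1.0, 0.0], [1.0, 0.1]), ([1.0, 0.0], [0.5, 0.5]), ([1.0, 0.0], [0.0, 1.0])]
    gold = [4.8, 3.0, 0.5]
    predicted = [cosine(u, v) for u, v in pairs]
    print(spearman(predicted, gold))
```

Because only ranks matter, the metric is insensitive to the scale of the similarity scores, which is why cosine values in [-1, 1] can be compared directly against human ratings on a different scale.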
