AnglE-optimized Text Embeddings

Xianming Li, Jing Li

2023-09-22

Sentiment Analysis, Semantic Textual Similarity, Large Language Model, STS, Language Modelling

Abstract

High-quality text embedding is pivotal in improving semantic textual similarity (STS) tasks, which are crucial components in Large Language Model (LLM) applications. However, a common challenge that existing text embedding models face is vanishing gradients, primarily due to their reliance on the cosine function in the optimization objective, which has saturation zones. To address this issue, this paper proposes a novel angle-optimized text embedding model called AnglE. The core idea of AnglE is to introduce angle optimization in a complex space. This approach mitigates the adverse effects of the saturation zones of the cosine function, which can impede gradients and hinder optimization. To set up a comprehensive STS evaluation, we experimented on existing short-text STS datasets and a newly collected long-text STS dataset from GitHub Issues. Furthermore, we examine domain-specific STS scenarios with limited labeled data and explore how AnglE works with LLM-annotated data. Extensive experiments were conducted on various tasks, including short-text STS, long-text STS, and domain-specific STS. The results show that AnglE outperforms the state-of-the-art (SOTA) STS models that ignore the cosine saturation zone. These findings demonstrate the ability of AnglE to generate high-quality text embeddings and the usefulness of angle optimization in STS.
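The saturation problem mentioned in the abstract can be illustrated with a minimal sketch. It assumes only the standard identity d/dθ cos(θ) = -sin(θ); it is not the AnglE objective itself, and the function name `cosine_grad_wrt_angle` is hypothetical. The point is that when two embeddings are nearly parallel or nearly opposite, the cosine barely changes with the angle, so a cosine-based loss passes back very little gradient.

```python
import math


def cosine_grad_wrt_angle(theta: float) -> float:
    """Derivative of cos(theta) with respect to theta, i.e. -sin(theta)."""
    return -math.sin(theta)


# Print the cosine and the gradient magnitude at several angles (in degrees).
# Near 0 and 180 degrees the cosine is close to +1 or -1 and the gradient
# magnitude approaches zero; around 90 degrees it is largest.
for deg in (1, 45, 90, 135, 179):
    theta = math.radians(deg)
    grad = abs(cosine_grad_wrt_angle(theta))
    print(f"{deg:3d} deg: cos={math.cos(theta):+.4f}  |grad|={grad:.4f}")
```

The flat regions near 0 and 180 degrees are the saturation zones; optimizing the angle directly, as the abstract describes, is one way to avoid relying on the cosine's gradient there.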

Results

Task	Dataset	Metric	Value	Model
Semantic Textual Similarity	STS14	Spearman Correlation	0.8689	AnglE-LLaMA-13B
Semantic Textual Similarity	STS14	Spearman Correlation	0.8579	AnglE-LLaMA-7B-v2
Semantic Textual Similarity	STS14	Spearman Correlation	0.8549	AnglE-LLaMA-7B
Semantic Textual Similarity	STS15	Spearman Correlation	0.8956	AnglE-LLaMA-13B
Semantic Textual Similarity	STS15	Spearman Correlation	0.8943	AnglE-LLaMA-7B-v2
Semantic Textual Similarity	STS13	Spearman Correlation	0.9058	AnglE-LLaMA-7B
Semantic Textual Similarity	STS13	Spearman Correlation	0.9056	AnglE-LLaMA-7B-v2
Semantic Textual Similarity	STS Benchmark	Spearman Correlation	0.8969	AnglE-LLaMA-13B
Semantic Textual Similarity	STS Benchmark	Spearman Correlation	0.8897	AnglE-LLaMA-7B
Semantic Textual Similarity	STS Benchmark	Spearman Correlation	0.8897	AnglE-LLaMA-7B-v2
Semantic Textual Similarity	STS12	Spearman Correlation	0.7868	AnglE-LLaMA-7B
Semantic Textual Similarity	STS12	Spearman Correlation	0.7868	AnglE-LLaMA-13B
Semantic Textual Similarity	STS16	Spearman Correlation	0.87	AnglE-LLaMA-7B-v2
Semantic Textual Similarity	STS16	Spearman Correlation	0.87	AnglE-LLaMA-13B
Semantic Textual Similarity	STS16	Spearman Correlation	0.8691	AnglE-LLaMA-7B
Semantic Textual Similarity	SICK-R	Spearman Correlation	0.8094	AnglE-LLaMA-7B
Semantic Textual Similarity	MTEB	Spearman Correlation	84.54	AnglE-UAE
Sentiment Analysis	CR	Accuracy	93.54	AnglE-LLaMA-7B
Sentiment Analysis	MR	Accuracy	91.09	AnglE-LLaMA-7B
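
Most rows above report Spearman correlation, which compares the ranking of predicted similarity scores with the ranking of human-annotated scores. The following is a minimal sketch of that metric in plain Python, assuming average ranks for ties; the helper names and the toy scores are hypothetical and are not taken from the paper or its evaluation code.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data (hypothetical): model similarities and gold scores for four pairs.
predicted = [0.9, 0.1, 0.5, 0.7]
gold = [5.0, 0.5, 2.0, 4.0]
print(spearman(predicted, gold))
```

Because the metric depends only on rank order, a model can score well without its similarity values matching the scale of the gold annotations.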
