Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


AnglE-optimized Text Embeddings

Xianming Li, Jing Li

2023-09-22
Sentiment Analysis, Semantic Textual Similarity, Large Language Model, STS, Language Modelling

Abstract

High-quality text embeddings are pivotal for semantic textual similarity (STS) tasks, which are crucial components of Large Language Model (LLM) applications. However, existing text embedding models commonly suffer from vanishing gradients, primarily because their optimization objectives rely on the cosine function, which has saturation zones. To address this issue, this paper proposes a novel angle-optimized text embedding model called AnglE. The core idea of AnglE is to introduce angle optimization in a complex space. This approach mitigates the adverse effects of the cosine saturation zones, which can impede gradient flow and hinder optimization. To set up a comprehensive STS evaluation, we experimented on existing short-text STS datasets and a newly collected long-text STS dataset from GitHub Issues. Furthermore, we examine domain-specific STS scenarios with limited labeled data and explore how AnglE works with LLM-annotated data. Extensive experiments were conducted on short-text STS, long-text STS, and domain-specific STS tasks. The results show that AnglE outperforms state-of-the-art (SOTA) STS models that ignore the cosine saturation zone. These findings demonstrate the ability of AnglE to generate high-quality text embeddings and the usefulness of angle optimization in STS.
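
The saturation problem described in the abstract can be illustrated numerically. The sketch below is not the authors' implementation: it only shows that the derivative of cos(θ) shrinks toward zero near θ = 0 and θ = π, and then shows one way to measure an angle difference after reading a real vector as complex numbers. The helper `angle_difference` and its convention (first half of the vector as real parts, second half as imaginary parts) are assumptions made for illustration.

```python
import numpy as np


def cosine_grad_wrt_angle(theta):
    """Derivative of cos(theta) with respect to theta, i.e. -sin(theta)."""
    return -np.sin(theta)


def angle_difference(u, v):
    """Hypothetical helper, not taken from the paper's code.

    Reads the first half of each vector as real parts and the second half
    as imaginary parts, divides the two complex vectors element-wise, and
    returns the mean absolute angle (in radians) of the quotient.
    Assumes no complex element of v is zero.
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    half = u.shape[-1] // 2
    z = u[..., :half] + 1j * u[..., half:]
    w = v[..., :half] + 1j * v[..., half:]
    return np.abs(np.angle(z / w)).mean(axis=-1)


# Gradient magnitude of the cosine near its saturation zones and at pi/2.
thetas = np.array([0.01, np.pi / 2, np.pi - 0.01])
grads = cosine_grad_wrt_angle(thetas)
print(np.abs(grads))  # small, close to 1, small

# Angle difference for identical vectors and for a 90-degree rotation.
u = np.array([1.0, 0.0, 0.0, 1.0])   # complex view: [1, 1j]
v = np.array([0.0, 1.0, -1.0, 0.0])  # complex view: [-1j, 1]
same = angle_difference(u, u)
rotated = angle_difference(u, v)
print(same, rotated)
```

The gradient of the cosine is near zero when two embeddings are almost aligned or almost opposite, which is exactly where a cosine-based objective provides little training signal. An angle measured directly does not flatten out in those regions.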

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Semantic Textual Similarity | STS14 | Spearman Correlation | 0.8689 | AnglE-LLaMA-13B |
| Semantic Textual Similarity | STS14 | Spearman Correlation | 0.8579 | AnglE-LLaMA-7B-v2 |
| Semantic Textual Similarity | STS14 | Spearman Correlation | 0.8549 | AnglE-LLaMA-7B |
| Semantic Textual Similarity | STS15 | Spearman Correlation | 0.8956 | AnglE-LLaMA-13B |
| Semantic Textual Similarity | STS15 | Spearman Correlation | 0.8943 | AnglE-LLaMA-7B-v2 |
| Semantic Textual Similarity | STS13 | Spearman Correlation | 0.9058 | AnglE-LLaMA-7B |
| Semantic Textual Similarity | STS13 | Spearman Correlation | 0.9056 | AnglE-LLaMA-7B-v2 |
| Semantic Textual Similarity | STS Benchmark | Spearman Correlation | 0.8969 | AnglE-LLaMA-13B |
| Semantic Textual Similarity | STS Benchmark | Spearman Correlation | 0.8897 | AnglE-LLaMA-7B |
| Semantic Textual Similarity | STS Benchmark | Spearman Correlation | 0.8897 | AnglE-LLaMA-7B-v2 |
| Semantic Textual Similarity | STS12 | Spearman Correlation | 0.7868 | AnglE-LLaMA-7B |
| Semantic Textual Similarity | STS12 | Spearman Correlation | 0.7868 | AnglE-LLaMA-13B |
| Semantic Textual Similarity | STS16 | Spearman Correlation | 0.87 | AnglE-LLaMA-7B-v2 |
| Semantic Textual Similarity | STS16 | Spearman Correlation | 0.87 | AnglE-LLaMA-13B |
| Semantic Textual Similarity | STS16 | Spearman Correlation | 0.8691 | AnglE-LLaMA-7B |
| Semantic Textual Similarity | SICK-R | Spearman Correlation | 0.8094 | AnglE-LLaMA-7B |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 84.54 | AnglE-UAE |
| Sentiment Analysis | CR | Accuracy | 93.54 | AnglE-LLaMA-7B |
| Sentiment Analysis | MR | Accuracy | 91.09 | AnglE-LLaMA-7B |
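
Most rows above report Spearman correlation, which compares the ranking of predicted similarity scores with the ranking of gold labels. The sketch below shows how such a score can be computed with NumPy alone. The function names and the toy scores are made up for illustration and are not taken from the paper or its evaluation code.

```python
import numpy as np


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    # Replace the ranks of each group of tied values with their average.
    for value in np.unique(values):
        mask = values == value
        ranks[mask] = ranks[mask].mean()
    return ranks


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    return float(np.corrcoef(average_ranks(a), average_ranks(b))[0, 1])


# Toy data (hypothetical): gold similarity labels and model cosine scores.
gold = [5.0, 3.2, 1.0, 4.1]
predicted = [0.91, 0.55, 0.10, 0.80]
score = spearman(gold, predicted)
print(score)
```

Because only ranks matter, a model scores well whenever it orders sentence pairs the same way the annotators did, regardless of the absolute scale of its similarity values.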

Related Papers

Visual-Language Model Knowledge Distillation Method for Image Quality Assessment (2025-07-21)
DENSE: Longitudinal Progress Note Generation with Temporal Modeling of Heterogeneous Clinical Notes Across Hospital Visits (2025-07-18)
AdaptiSent: Context-Aware Adaptive Attention for Multimodal Aspect-Based Sentiment Analysis (2025-07-17)
SemCSE: Semantic Contrastive Sentence Embeddings Using LLM-Generated Summaries For Scientific Abstracts (2025-07-17)
GeoReg: Weight-Constrained Few-Shot Regression for Socio-Economic Estimation using LLM (2025-07-17)
The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations (2025-07-17)
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities (2025-07-17)
Rethinking the Embodied Gap in Vision-and-Language Navigation: A Holistic Study of Physical and Visual Disparities (2025-07-17)