Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

MTEB: Massive Text Embedding Benchmark

Niklas Muennighoff, Nouamane Tazi, Loïc Magne, Nils Reimers

2022-10-13
Tasks: Text Classification, Reranking, Benchmarking, Text Summarization, Text Retrieval, Text Clustering, Text Pair Classification, Semantic Textual Similarity, Information Retrieval, Text Reranking, STS

Abstract

Text embeddings are commonly evaluated on a small set of datasets from a single task not covering their possible applications to other tasks. It is unclear whether state-of-the-art embeddings on semantic textual similarity (STS) can be equally well applied to other tasks like clustering or reranking. This makes progress in the field difficult to track, as various models are constantly being proposed without proper evaluation. To solve this problem, we introduce the Massive Text Embedding Benchmark (MTEB). MTEB spans 8 embedding tasks covering a total of 58 datasets and 112 languages. Through the benchmarking of 33 models on MTEB, we establish the most comprehensive benchmark of text embeddings to date. We find that no particular text embedding method dominates across all tasks. This suggests that the field has yet to converge on a universal text embedding method and scale it up sufficiently to provide state-of-the-art results on all embedding tasks. MTEB comes with open-source code and a public leaderboard at https://github.com/embeddings-benchmark/mteb.
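The STS evaluation the abstract refers to can be illustrated with a minimal sketch: score each sentence pair by the cosine similarity of its two embeddings, then report the Spearman correlation between those scores and human similarity ratings. The toy vectors, the gold ratings, and all function names below are hypothetical, and this is not taken from the MTEB codebase; it only shows the general shape of the metric, using the standard library alone.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical 2-d "embeddings" for four sentence pairs (illustration only).
pairs = [
    ([1.0, 0.0], [1.0, 0.1]),   # nearly identical direction
    ([1.0, 0.0], [0.5, 0.5]),   # partly aligned
    ([1.0, 0.0], [0.0, 1.0]),   # orthogonal
    ([1.0, 0.0], [-1.0, 0.2]),  # nearly opposite
]
gold = [4.8, 3.0, 1.5, 0.2]     # hypothetical human similarity ratings

predicted = [cosine(u, v) for u, v in pairs]
score = spearman(predicted, gold)
print(f"Spearman correlation: {score:.4f}")
```

Because Spearman correlation depends only on rank order, a model is rewarded for ordering pairs the way annotators do, regardless of the absolute scale of its similarity scores.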

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Semantic Textual Similarity | MTEB | Spearman Correlation | 82.63 | ST5-XXL |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 81.83 | ST5-Large |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 81.66 | ST5-XL |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 81.14 | ST5-Base |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 80.73 | MPNet-multilingual |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 80.53 | SGPT-5.8B-nli |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 80.28 | MPNet |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 79.8 | MiniLM-L12 |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 79.12 | SimCSE-BERT-sup |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 78.92 | MiniLM-L6 |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 78.6 | Ada Similarity |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 78.38 | GTR-XXL |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 78.19 | GTR-Large |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 78.1 | SGPT-5.8B-msmarco |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 77.8 | GTR-XL |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 77.74 | SGPT-BLOOM-7.1B-msmarco |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 77.07 | GTR-Base |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 76.83 | SGPT-2.7B-msmarco |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 76.47 | coCondenser-msmarco |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 75.74 | SGPT-1.3B-msmarco |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 74.71 | SGPT-125M-nli |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 74.33 | SimCSE-BERT-unsup |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 73.41 | SGPT-125M-msmarco |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 70.8 | LaBSE |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 62.47 | Komninos |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 61.85 | Glove |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 61.02 | SPECTER |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 55.32 | LASER2 |
| Semantic Textual Similarity | MTEB | Spearman Correlation | 54.36 | BERT |
| Text Clustering | MTEB | V-Measure | 43.71 | ST5-XXL |
| Text Clustering | MTEB | V-Measure | 43.69 | MPNet |
| Text Clustering | MTEB | V-Measure | 42.42 | GTR-XXL |
| Text Clustering | MTEB | V-Measure | 42.35 | MiniLM-L6 |
| Text Clustering | MTEB | V-Measure | 42.34 | ST5-XL |
| Text Clustering | MTEB | V-Measure | 41.81 | MiniLM-L12 |
| Text Clustering | MTEB | V-Measure | 41.65 | ST5-Large |
| Text Clustering | MTEB | V-Measure | 41.6 | GTR-Large |
| Text Clustering | MTEB | V-Measure | 41.51 | GTR-XL |
| Text Clustering | MTEB | V-Measure | 41.1 | Contriever |
| Text Clustering | MTEB | V-Measure | 40.35 | SGPT-5.8B-msmarco |
| Text Clustering | MTEB | V-Measure | 40.21 | ST5-Base |
| Text Clustering | MTEB | V-Measure | 39.92 | SGPT-1.3B-msmarco |
| Text Clustering | MTEB | V-Measure | 39.83 | SGPT-2.7B-msmarco |
| Text Clustering | MTEB | V-Measure | 38.93 | SGPT-BLOOM-7.1B-msmarco |
| Text Clustering | MTEB | V-Measure | 38.63 | GTR-Base |
| Text Clustering | MTEB | V-Measure | 38.4 | MPNet-multilingual |
| Text Clustering | MTEB | V-Measure | 37.64 | coCondenser-msmarco |
| Text Clustering | MTEB | V-Measure | 37.52 | Ada Similarity |
| Text Clustering | MTEB | V-Measure | 37.14 | MiniLM-L12-multilingual |
| Text Clustering | MTEB | V-Measure | 36.98 | SGPT-5.8B-nli |
| Text Clustering | MTEB | V-Measure | 35.79 | SGPT-125M-msmarco |
| Text Clustering | MTEB | V-Measure | 34.06 | SPECTER |
| Text Clustering | MTEB | V-Measure | 33.43 | SimCSE-BERT-sup |
| Text Clustering | MTEB | V-Measure | 30.95 | SGPT-125M-nli |
| Text Clustering | MTEB | V-Measure | 30.12 | BERT |
| Text Clustering | MTEB | V-Measure | 29.55 | LaBSE |
| Text Clustering | MTEB | V-Measure | 29.04 | SimCSE-BERT-unsup |
| Text Clustering | MTEB | V-Measure | 27.73 | Glove |
| Text Clustering | MTEB | V-Measure | 26.57 | Komninos |
| Text Clustering | MTEB | V-Measure | 15.28 | LASER2 |
| Text Summarization | MTEB | Spearman Correlation | 31.57 | MPNet-multilingual |
| Text Summarization | MTEB | Spearman Correlation | 31.39 | ST5-Base |
| Text Summarization | MTEB | Spearman Correlation | 31.15 | SimCSE-BERT-unsup |
| Text Summarization | MTEB | Spearman Correlation | 30.81 | MiniLM-L6 |
| Text Summarization | MTEB | Spearman Correlation | 30.67 | MiniLM-L12-multilingual |
| Text Summarization | MTEB | Spearman Correlation | 30.64 | GTR-XXL |
| Text Summarization | MTEB | Spearman Correlation | 30.49 | Komninos |
| Text Summarization | MTEB | Spearman Correlation | 30.36 | Contriever |
| Text Summarization | MTEB | Spearman Correlation | 30.26 | SGPT-125M-nli |
| Text Summarization | MTEB | Spearman Correlation | 30.21 | GTR-XL |
| Text Summarization | MTEB | Spearman Correlation | 30.08 | ST5-XXL |
| Text Summarization | MTEB | Spearman Correlation | 29.91 | ST5-XL |
| Text Summarization | MTEB | Spearman Correlation | 29.82 | BERT |
| Text Summarization | MTEB | Spearman Correlation | 29.67 | GTR-Base |
| Text Summarization | MTEB | Spearman Correlation | 29.64 | ST5-Large |
| Text Summarization | MTEB | Spearman Correlation | 29.5 | coCondenser-msmarco |
| Text Summarization | MTEB | Spearman Correlation | 28.87 | Glove |
| Text Summarization | MTEB | Spearman Correlation | 27.9 | MiniLM-L12 |
| Text Summarization | MTEB | Spearman Correlation | 27.66 | SPECTER |
| Text Summarization | MTEB | Spearman Correlation | 27.49 | MPNet |
| Text Summarization | MTEB | Spearman Correlation | 26.94 | Ada Similarity |
| Text Summarization | MTEB | Spearman Correlation | 26.8 | LASER2 |
| Text Summarization | MTEB | Spearman Correlation | 25.44 | SGPT-1.3B-msmarco |
| Text Summarization | MTEB | Spearman Correlation | 24.99 | SGPT-BLOOM-7.1B-msmarco |
| Text Summarization | MTEB | Spearman Correlation | 24.75 | SGPT-5.8B-msmarco |
| Text Summarization | MTEB | Spearman Correlation | 23.31 | SimCSE-BERT-sup |
| Text Classification | MTEB | Accuracy | 73.42 | ST5-XXL |
| Text Classification | MTEB | Accuracy | 72.84 | ST5-XL |
| Text Classification | MTEB | Accuracy | 72.31 | ST5-Large |
| Text Classification | MTEB | Accuracy | 70.44 | Ada Similarity |
| Text Classification | MTEB | Accuracy | 70.14 | SGPT-5.8B-nli |
| Text Classification | MTEB | Accuracy | 69.81 | ST5-Base |
| Text Classification | MTEB | Accuracy | 68.13 | SGPT-5.8B-msmarco |
| Text Classification | MTEB | Accuracy | 67.91 | MPNet-multilingual |
| Text Classification | MTEB | Accuracy | 67.41 | GTR-XXL |
| Text Classification | MTEB | Accuracy | 67.32 | SimCSE-BERT-sup |
| Text Classification | MTEB | Accuracy | 67.14 | GTR-Large |
| Text Classification | MTEB | Accuracy | 67.13 | SGPT-2.7B-msmarco |
| Text Classification | MTEB | Accuracy | 67.11 | GTR-XL |
| Text Classification | MTEB | Accuracy | 66.68 | Contriever |
| Text Classification | MTEB | Accuracy | 66.52 | SGPT-1.3B-msmarco |
| Text Classification | MTEB | Accuracy | 66.19 | SGPT-BLOOM-7.1B-msmarco |
| Text Classification | MTEB | Accuracy | 65.25 | GTR-Base |
| Text Classification | MTEB | Accuracy | 65.07 | MPNet |
| Text Classification | MTEB | Accuracy | 64.71 | coCondenser-msmarco |
| Text Classification | MTEB | Accuracy | 64.3 | MiniLM-L12-multilingual |
| Text Classification | MTEB | Accuracy | 63.21 | MiniLM-L12 |
| Text Classification | MTEB | Accuracy | 63.06 | MiniLM-L6 |
| Text Classification | MTEB | Accuracy | 62.71 | LaBSE |
| Text Classification | MTEB | Accuracy | 62.5 | SimCSE-BERT-unsup |
| Text Classification | MTEB | Accuracy | 61.66 | BERT |
| Text Classification | MTEB | Accuracy | 61.46 | SGPT-125M-nli |
| Text Classification | MTEB | Accuracy | 60.72 | SGPT-125M-msmarco |
| Text Classification | MTEB | Accuracy | 57.65 | Komninos |
| Text Classification | MTEB | Accuracy | 57.29 | Glove |
| Text Classification | MTEB | Accuracy | 53.65 | LASER2 |
| Text Classification | MTEB | Accuracy | 52.37 | SPECTER |
| Information Retrieval | MTEB | nDCG@10 | 50.25 | SGPT-5.8B-msmarco |
| Retrieval | MTEB | nDCG@10 | 50.25 | SGPT-5.8B-msmarco |
| Retrieval | MTEB | nDCG@10 | 48.48 | GTR-XXL |
| Retrieval | MTEB | nDCG@10 | 48.21 | SGPT-BLOOM-7.1B-msmarco |
| Retrieval | MTEB | nDCG@10 | 47.96 | GTR-XL |
| Retrieval | MTEB | nDCG@10 | 47.42 | GTR-Large |
| Retrieval | MTEB | nDCG@10 | 46.54 | SGPT-2.7B-msmarco |
| Retrieval | MTEB | nDCG@10 | 44.67 | GTR-Base |
| Retrieval | MTEB | nDCG@10 | 44.49 | SGPT-1.3B-msmarco |
| Retrieval | MTEB | nDCG@10 | 43.81 | MPNet |
| Retrieval | MTEB | nDCG@10 | 42.69 | MiniLM-L12 |
| Retrieval | MTEB | nDCG@10 | 42.24 | ST5-XXL |
| Retrieval | MTEB | nDCG@10 | 41.95 | MiniLM-L6 |
| Retrieval | MTEB | nDCG@10 | 41.88 | Contriever |
| Retrieval | MTEB | nDCG@10 | 38.47 | ST5-XL |
| Retrieval | MTEB | nDCG@10 | 37.04 | SGPT-125M-msmarco |
| Retrieval | MTEB | nDCG@10 | 36.71 | ST5-Large |
| Retrieval | MTEB | nDCG@10 | 35.34 | MPNet-multilingual |
| Retrieval | MTEB | nDCG@10 | 33.63 | ST5-Base |
| Retrieval | MTEB | nDCG@10 | 32.96 | coCondenser-msmarco |
| Retrieval | MTEB | nDCG@10 | 32.45 | MiniLM-L12-multilingual |
| Retrieval | MTEB | nDCG@10 | 32.34 | SGPT-5.8B-nli |
| Retrieval | MTEB | nDCG@10 | 21.82 | SimCSE-BERT-sup |
| Retrieval | MTEB | nDCG@10 | 21.62 | Glove |
| Retrieval | MTEB | nDCG@10 | 21.22 | Komninos |
| Retrieval | MTEB | nDCG@10 | 20.9 | SGPT-125M-nli |
| Retrieval | MTEB | nDCG@10 | 20.29 | SimCSE-BERT-unsup |
| Retrieval | MTEB | nDCG@10 | 18.99 | LaBSE |
| Retrieval | MTEB | nDCG@10 | 15.88 | SPECTER |
| Retrieval | MTEB | nDCG@10 | 10.59 | BERT |
| Retrieval | MTEB | nDCG@10 | 7.93 | LASER2 |
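
The abstract's claim that no single method dominates can be checked against the results table above. The following sketch copies the scores of three models (each one the top row for at least one task) from that table and finds the leader per task. The dictionary layout and variable names are illustrative choices, not part of MTEB.

```python
# Scores copied from the results table above for three models.
scores = {
    "Semantic Textual Similarity": {
        "ST5-XXL": 82.63, "MPNet-multilingual": 80.73, "SGPT-5.8B-msmarco": 78.1,
    },
    "Text Clustering": {
        "ST5-XXL": 43.71, "MPNet-multilingual": 38.4, "SGPT-5.8B-msmarco": 40.35,
    },
    "Text Summarization": {
        "ST5-XXL": 30.08, "MPNet-multilingual": 31.57, "SGPT-5.8B-msmarco": 24.75,
    },
    "Text Classification": {
        "ST5-XXL": 73.42, "MPNet-multilingual": 67.91, "SGPT-5.8B-msmarco": 68.13,
    },
    "Retrieval": {
        "ST5-XXL": 42.24, "MPNet-multilingual": 35.34, "SGPT-5.8B-msmarco": 50.25,
    },
}

# Best-scoring model per task (higher is better for every metric listed).
leaders = {task: max(by_model, key=by_model.get) for task, by_model in scores.items()}
distinct_leaders = set(leaders.values())

for task, model in leaders.items():
    print(f"{task}: {model} ({scores[task][model]})")
print(f"Distinct leaders across {len(leaders)} tasks: {len(distinct_leaders)}")
```

ST5-XXL leads on STS, clustering, and classification, but SGPT-5.8B-msmarco leads retrieval by about 8 points over ST5-XXL, and MPNet-multilingual leads summarization. Three different models top the five task groups listed here.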
