Semantic Textual Similarity on STS Benchmark

Metric: Pearson Correlation (higher is better)
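
As a rough illustration of the metric, the sketch below computes the Pearson correlation between a model's predicted similarity scores and the gold annotations. The function name `pearson_correlation` and the sample scores are hypothetical and chosen only for illustration; this is not the benchmark's official scoring script.

```python
import math


def pearson_correlation(predicted, gold):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(predicted) != len(gold):
        raise ValueError("sequences must have the same length")
    n = len(predicted)
    if n < 2:
        raise ValueError("need at least two score pairs")

    mean_p = sum(predicted) / n
    mean_g = sum(gold) / n

    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(predicted, gold))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_g = sum((g - mean_g) ** 2 for g in gold)

    if var_p == 0 or var_g == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_g)


# Hypothetical scores on a 0-5 similarity scale (illustrative values only).
gold = [0.0, 1.5, 2.5, 4.0, 5.0]
predicted = [0.3, 1.2, 2.9, 3.8, 4.6]

r = pearson_correlation(predicted, gold)
print(f"Pearson r = {r:.3f}")
```

The coefficient ranges from -1 to 1, so a score such as 0.929 indicates predictions that track the gold scores almost linearly.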

Results

#	Model	Pearson Correlation	Extra Data	Paper	Date	Code
1	MT-DNN-SMART	0.929	No	SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization	2019-11-08	Code
2	StructBERTRoBERTa ensemble	0.928	No	StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding	2019-08-13	-
3	MNet-Sim	0.927	No	MNet-Sim: A Multi-layered Semantic Similarity Network to Evaluate Sentence Similarity	2021-11-09	-
4	T5-11B	0.925	No	Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer	2019-10-23	Code
5	ALBERT	0.925	Yes	ALBERT: A Lite BERT for Self-supervised Learning of Language Representations	2019-09-26	Code
6	XLNet (single model)	0.925	No	XLNet: Generalized Autoregressive Pretraining for Language Understanding	2019-06-19	Code
7	RoBERTa	0.922	No	RoBERTa: A Robustly Optimized BERT Pretraining Approach	2019-07-26	Code
8	ELECTRA	0.921	No	-	-	-
9	RoBERTa-large 355M (MLP quantized vector-wise, fine-tuned)	0.919	No	LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale	2022-08-15	Code
10	PSQ (Chen et al., 2020)	0.919	No	A Statistical Framework for Low-bitwidth Training of Deep Neural Networks	2020-10-27	Code
11	RoBERTa-large 355M + Entailment as Few-shot Learner	0.918	No	Entailment as Few-Shot Learner	2021-04-29	Code
12	ERNIE 2.0 Large	0.912	No	ERNIE 2.0: A Continual Pre-training Framework for Language Understanding	2019-07-29	Code
13	Q-BERT (Shen et al., 2020)	0.911	No	Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT	2019-09-12	-
14	Q8BERT (Zafrir et al., 2019)	0.911	No	Q8BERT: Quantized 8Bit BERT	2019-10-14	Code
15	ELECTRA (no tricks)	0.91	No	-	-	-
16	DistilBERT 66M	0.907	No	DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter	2019-10-02	Code
17	T5-3B	0.906	No	Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer	2019-10-23	Code
18	MLM+ del-word	0.905	No	CLEAR: Contrastive Learning for Sentence Representation	2020-12-31	-
19	RealFormer	0.9011	No	RealFormer: Transformer Likes Residual Attention	2020-12-21	Code
20	T5-Large	0.899	No	Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer	2019-10-23	Code
21	SpanBERT	0.899	No	SpanBERT: Improving Pre-training by Representing and Predicting Spans	2019-07-24	Code
22	T5-Base	0.894	No	Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer	2019-10-23	Code
23	ERNIE 2.0 Base	0.876	No	ERNIE 2.0: A Continual Pre-training Framework for Language Understanding	2019-07-29	Code
24	Charformer-Tall	0.873	No	Charformer: Fast Character Transformers via Gradient-based Subword Tokenization	2021-06-23	Code
25	T5-Small	0.856	No	Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer	2019-10-23	Code
26	ERNIE	0.832	No	ERNIE: Enhanced Language Representation with Informative Entities	2019-05-17	Code
27	24hBERT	0.82	No	How to Train BERT with an Academic Budget	2021-04-15	Code
28	TinyBERT-4 14.5M	0.799	No	TinyBERT: Distilling BERT for Natural Language Understanding	2019-09-23	Code
29	USE_T	0.782	No	Universal Sentence Encoder	2018-03-29	Code