Natural Language Inference on WNLI
Metric: Accuracy (higher is better)
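The metric is plain classification accuracy: WNLI is a binary entailment task, so a score of 95.9 means 95.9% of the test pairs were labeled correctly. A minimal sketch of the computation (the helper name is illustrative, not from any particular library):

```python
def accuracy(preds, labels):
    """Fraction of predictions that match the gold labels.

    For WNLI each item is a binary entailment judgment (0 or 1),
    and the leaderboard reports this fraction as a percentage.
    """
    if len(preds) != len(labels) or not labels:
        raise ValueError("preds and labels must be non-empty and equal length")
    return sum(p == g for p, g in zip(preds, labels)) / len(labels)

# Example: 3 of 4 predictions correct -> 0.75 (reported as 75.0 on the board)
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # 0.75
```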
Results
| # | Model | Accuracy | Extra Data | Paper | Date | Code |
|---|-------|----------|------------|-------|------|------|
| 1 | Turing NLR v5 XXL 5.4B (fine-tuned) | 95.9 | No | Toward Efficient Language Model Pretraining and ... | 2022-12-04 | - |
| 2 | DeBERTa | 94.5 | No | DeBERTa: Decoding-enhanced BERT with Disentangle... | 2020-06-05 | Code |
| 3 | T5-XXL 11B | 93.2 | No | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 4 | XLNet | 92.5 | No | XLNet: Generalized Autoregressive Pretraining fo... | 2019-06-19 | Code |
| 5 | ALBERT | 91.8 | No | ALBERT: A Lite BERT for Self-supervised Learning... | 2019-09-26 | Code |
| 6 | T5-XL 3B | 89.7 | No | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 7 | StructBERT RoBERTa ensemble | 89.7 | No | StructBERT: Incorporating Language Structures in... | 2019-08-13 | - |
| 8 | HNN ensemble | 89.0 | No | A Hybrid Neural Network Model for Commonsense Re... | 2019-07-27 | Code |
| 9 | RoBERTa (ensemble) | 89.0 | No | RoBERTa: A Robustly Optimized BERT Pretraining A... | 2019-07-26 | Code |
| 10 | T5-Large 770M | 85.6 | No | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 11 | HNN | 83.6 | No | A Hybrid Neural Network Model for Commonsense Re... | 2019-07-27 | Code |
| 12 | T5-Base 220M | 78.8 | No | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 13 | BERTwiki 340M (fine-tuned on WSCR) | 74.7 | No | A Surprisingly Robust Trick for Winograd Schema ... | 2019-05-15 | Code |
| 14 | FLAN 137B (zero-shot) | 74.6 | No | Finetuned Language Models Are Zero-Shot Learners | 2021-09-03 | Code |
| 15 | BERT-large 340M (fine-tuned on WSCR) | 71.9 | No | A Surprisingly Robust Trick for Winograd Schema ... | 2019-05-15 | Code |
| 16 | BERT-base 110M (fine-tuned on WSCR) | 70.5 | No | A Surprisingly Robust Trick for Winograd Schema ... | 2019-05-15 | Code |
| 17 | FLAN 137B (few-shot, k=4) | 70.4 | No | Finetuned Language Models Are Zero-Shot Learners | 2021-09-03 | Code |
| 18 | T5-Small 60M | 69.2 | No | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 19 | ERNIE 2.0 Large | 67.8 | No | ERNIE 2.0: A Continual Pre-training Framework fo... | 2019-07-29 | Code |
| 20 | SqueezeBERT | 65.1 | No | SqueezeBERT: What can computer vision teach NLP ... | 2020-06-19 | Code |
| 21 | BERT-large 340M | 65.1 | No | BERT: Pre-training of Deep Bidirectional Transfo... | 2018-10-11 | Code |
| 22 | RWKV-4-Raven-14B | 49.3 | No | RWKV: Reinventing RNNs for the Transformer Era | 2023-05-22 | Code |
| 23 | DistilBERT 66M | 44.4 | No | DistilBERT, a distilled version of BERT: smaller... | 2019-10-02 | Code |