Natural Language Inference on WNLI
Metric: Accuracy (higher is better)
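The metric is plain classification accuracy: WNLI is a binary entailment task, so a score of 95.9 means 95.9% of the test pairs were labeled correctly. A minimal sketch of the computation (the helper name is illustrative, not from any particular library):

```python
def accuracy(preds, labels):
    """Fraction of predictions that match the gold labels.

    For WNLI each item is a binary entailment judgment (0 or 1),
    and the leaderboard reports this fraction as a percentage.
    """
    if len(preds) != len(labels) or not labels:
        raise ValueError("preds and labels must be non-empty and equal length")
    return sum(p == g for p, g in zip(preds, labels)) / len(labels)

# Example: 3 of 4 predictions correct -> 0.75 (reported as 75.0 on the board)
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # 0.75
```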
Results
| # | Model | Accuracy | Extra Data | Paper | Date | Code |
|---|-------|----------|------------|-------|------|------|
| 1 | Turing NLR v5 XXL 5.4B (fine-tuned) | 95.9 | No | Toward Efficient Language Model Pretraining and ... | 2022-12-04 | - |
| 2 | DeBERTa | 94.5 | No | DeBERTa: Decoding-enhanced BERT with Disentangle... | 2020-06-05 | Code |
| 3 | T5-XXL 11B | 93.2 | No | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 4 | XLNet | 92.5 | No | XLNet: Generalized Autoregressive Pretraining fo... | 2019-06-19 | Code |
| 5 | ALBERT | 91.8 | No | ALBERT: A Lite BERT for Self-supervised Learning... | 2019-09-26 | Code |
| 6 | T5-XL 3B | 89.7 | No | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 7 | StructBERT RoBERTa ensemble | 89.7 | No | StructBERT: Incorporating Language Structures in... | 2019-08-13 | - |
| 8 | HNN ensemble | 89.0 | No | A Hybrid Neural Network Model for Commonsense Re... | 2019-07-27 | Code |
| 9 | RoBERTa (ensemble) | 89.0 | No | RoBERTa: A Robustly Optimized BERT Pretraining A... | 2019-07-26 | Code |
| 10 | T5-Large 770M | 85.6 | No | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 11 | HNN | 83.6 | No | A Hybrid Neural Network Model for Commonsense Re... | 2019-07-27 | Code |
| 12 | T5-Base 220M | 78.8 | No | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 13 | BERTwiki 340M (fine-tuned on WSCR) | 74.7 | No | A Surprisingly Robust Trick for Winograd Schema ... | 2019-05-15 | Code |
| 14 | FLAN 137B (zero-shot) | 74.6 | No | Finetuned Language Models Are Zero-Shot Learners | 2021-09-03 | Code |
| 15 | BERT-large 340M (fine-tuned on WSCR) | 71.9 | No | A Surprisingly Robust Trick for Winograd Schema ... | 2019-05-15 | Code |
| 16 | BERT-base 110M (fine-tuned on WSCR) | 70.5 | No | A Surprisingly Robust Trick for Winograd Schema ... | 2019-05-15 | Code |
| 17 | FLAN 137B (few-shot, k=4) | 70.4 | No | Finetuned Language Models Are Zero-Shot Learners | 2021-09-03 | Code |
| 18 | T5-Small 60M | 69.2 | No | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 19 | ERNIE 2.0 Large | 67.8 | No | ERNIE 2.0: A Continual Pre-training Framework fo... | 2019-07-29 | Code |
| 20 | SqueezeBERT | 65.1 | No | SqueezeBERT: What can computer vision teach NLP ... | 2020-06-19 | Code |
| 21 | BERT-large 340M | 65.1 | No | BERT: Pre-training of Deep Bidirectional Transfo... | 2018-10-11 | Code |
| 22 | RWKV-4-Raven-14B | 49.3 | No | RWKV: Reinventing RNNs for the Transformer Era | 2023-05-22 | Code |
| 23 | DistilBERT 66M | 44.4 | No | DistilBERT, a distilled version of BERT: smaller... | 2019-10-02 | Code |