Natural Language Inference on ANLI test
Metric: A3, accuracy (%) on the ANLI round 3 test set (higher is better)
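As a rough illustration of how a score like those listed here could be computed, the sketch below assumes A3 is plain classification accuracy over the ANLI round 3 test examples, reported as a percentage. The function name `accuracy_percent` and the toy labels are hypothetical and not taken from any listed paper.

```python
from typing import Sequence


def accuracy_percent(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Return the percentage of predictions that exactly match the gold labels."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty test set")
    # Count exact label matches, then scale to a percentage.
    correct = sum(p == g for p, g in zip(predictions, gold))
    return 100.0 * correct / len(gold)


# Toy example (hypothetical labels, not real ANLI data).
gold_labels = ["entailment", "neutral", "contradiction", "neutral"]
model_preds = ["entailment", "contradiction", "contradiction", "neutral"]
score = accuracy_percent(model_preds, gold_labels)
print(f"A3-style accuracy: {score:.1f}")
```

Individual papers may differ in prompting setup, number of shots, or evaluation subset, so scores in the table are not necessarily computed under identical conditions.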
Results
| # | Model | A3 | Extra Data | Paper | Date | Code available |
|---|-------|----|------------|-------|------|----------------|
| 1 | T5-3B (explanation prompting) | 74.8 | No | - | - | - |
| 2 | PaLM 540B (Self Improvement, Self Consistency) | 67.9 | No | Large Language Models Can Self-Improve | 2022-10-20 | - |
| 3 | PaLM 540B (Self Improvement, CoT Prompting) | 67.3 | No | Large Language Models Can Self-Improve | 2022-10-20 | - |
| 4 | PaLM 2-L (one-shot) | 67.1 | No | PaLM 2 Technical Report | 2023-05-17 | Yes |
| 5 | PaLM 540B (Self Improvement, Standard-Prompting) | 66.9 | No | Large Language Models Can Self-Improve | 2022-10-20 | - |
| 6 | PaLM 540B (Self Consistency) | 63.4 | No | Large Language Models Can Self-Improve | 2022-10-20 | - |
| 7 | PaLM 540B (CoT Prompting) | 60.6 | No | Large Language Models Can Self-Improve | 2022-10-20 | - |
| 8 | T0-11B (explanation prompting) | 59.9 | No | - | - | - |
| 9 | PaLM 540B (Standard-Prompting) | 55.8 | No | Large Language Models Can Self-Improve | 2022-10-20 | - |
| 10 | PaLM 2-M (one-shot) | 54.5 | No | PaLM 2 Technical Report | 2023-05-17 | Yes |
| 11 | ChatGPT | 54.1 | No | A Systematic Study and Comprehensive Evaluation ... | 2023-05-29 | Yes |
| 12 | PaLM 2-S (one-shot) | 53.2 | No | PaLM 2 Technical Report | 2023-05-17 | Yes |
| 13 | XLNet (Large) | 49.4 | Yes | XLNet: Generalized Autoregressive Pretraining fo... | 2019-06-19 | Yes |
| 14 | ALUM (RoBERTa-LARGE) | 48.4 | Yes | Adversarial Training for Large Neural Language M... | 2020-04-20 | Yes |
| 15 | InfoBERT (RoBERTa) | 47.7 | Yes | InfoBERT: Improving Robustness of Language Model... | 2020-10-05 | Yes |
| 16 | RoBERTa (Large) | 44.4 | Yes | RoBERTa: A Robustly Optimized BERT Pretraining A... | 2019-07-26 | Yes |
| 17 | T0-3B (CoT fine-tuned) | 41.9 | No | The CoT Collection: Improving Zero-shot and Few-... | 2023-05-23 | Yes |
| 18 | GPT-3 | 40.2 | Yes | Language Models are Few-Shot Learners | 2020-05-28 | Yes |
| 19 | Flipped-3B | 37.73 | No | Guess the Instruction! Flipped Learning Makes La... | 2022-10-06 | Yes |
| 20 | KiC-770M | 37.6 | No | Knowledge-in-Context: Towards Knowledgeable Semi... | 2022-10-28 | - |
| 21 | Bloomberg GPT (one-shot) | 37.33 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Yes |
| 22 | GPT-NeoX (one-shot) | 36.17 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Yes |
| 23 | BLOOM 176B (one-shot) | 35.17 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Yes |
| 24 | OPT 66B (one-shot) | 34.92 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Yes |
| 25 | RoE-3B | 31.22 | No | Exploring the Benefits of Training Expert Langua... | 2023-02-07 | Yes |