Common Sense Reasoning on ARC (Easy)
Metric: Accuracy (higher is better)
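The leaderboard metric is plain accuracy over ARC (Easy) multiple-choice questions: the percentage of questions where the model's chosen answer matches the gold label. A minimal sketch (the prediction and answer lists are hypothetical, not taken from any model below):

```python
def accuracy(predictions, answers):
    """Percentage of positions where the predicted choice equals the gold label."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must be the same length")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return 100.0 * correct / len(answers)

# Hypothetical example: 3 of 4 answer choices correct -> 75.0
print(accuracy(["A", "B", "C", "D"], ["A", "B", "C", "A"]))
```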
Results
| # | Model | Accuracy | Extra Data | Paper | Date | Code |
|---|-------|----------|------------|-------|------|------|
| 1 | ST-MoE-32B 269B (fine-tuned) | 95.2 | No | ST-MoE: Designing Stable and Transferable Sparse... | 2022-02-17 | Yes |
| 2 | LLaMA 3 8B+MoSLoRA (fine-tuned) | 90.5 | No | Mixture-of-Subspaces in Low-Rank Adaptation | 2024-06-16 | Yes |
| 3 | PaLM 2-L (1-shot) | 89.7 | No | PaLM 2 Technical Report | 2023-05-17 | Yes |
| 4 | PaLM 2-M (1-shot) | 88.0 | No | PaLM 2 Technical Report | 2023-05-17 | Yes |
| 5 | LLaMA-3 8B + MixLoRA | 86.5 | No | MixLoRA: Enhancing Large Language Models Fine-Tu... | 2024-04-22 | Yes |
| 6 | Camelidae-8×34B | 86.2 | No | Parameter-Efficient Sparsity Crafting from Dense... | 2024-01-05 | Yes |
| 7 | PaLM 2-S (1-shot) | 85.6 | No | PaLM 2 Technical Report | 2023-05-17 | Yes |
| 8 | LLaMA 65B + CFG (0-shot) | 84.2 | No | Stay on topic with Classifier-Free Guidance | 2023-06-30 | - |
| 9 | GAL 120B (0-shot) | 83.8 | No | Galactica: A Large Language Model for Science | 2022-11-16 | Yes |
| 10 | LLaMA-2 13B + MixLoRA | 83.5 | No | MixLoRA: Enhancing Large Language Models Fine-Tu... | 2024-04-22 | Yes |
| 11 | LLaMA 30B + CFG (0-shot) | 83.2 | No | Stay on topic with Classifier-Free Guidance | 2023-06-30 | - |
| 12 | Mixtral 8x7B (0-shot) | 83.1 | No | Mixtral of Experts | 2024-01-08 | Yes |
| 13 | FLAN 137B (few-shot, k=14) | 80.7 | No | Finetuned Language Models Are Zero-Shot Learners | 2021-09-03 | Yes |
| 14 | Mistral 7B (0-shot) | 80.5 | No | Mixtral of Experts | 2024-01-08 | Yes |
| 15 | LLaMA 33B (0-shot) | 80.0 | No | LLaMA: Open and Efficient Foundation Language Mo... | 2023-02-27 | Yes |
| 16 | Mistral 7B (0-shot) | 80.0 | No | Mistral 7B | 2023-10-10 | Yes |
| 17 | FLAN 137B (0-shot) | 79.6 | No | Finetuned Language Models Are Zero-Shot Learners | 2021-09-03 | Yes |
| 18 | LLaMA 13B + CFG (0-shot) | 79.1 | No | Stay on topic with Classifier-Free Guidance | 2023-06-30 | - |
| 19 | LLaMA 65B (0-shot) | 78.9 | No | LLaMA: Open and Efficient Foundation Language Mo... | 2023-02-27 | Yes |
| 20 | LLaMA-2 7B + MixLoRA | 77.7 | No | MixLoRA: Enhancing Large Language Models Fine-Tu... | 2024-04-22 | Yes |
| 21 | phi-1.5-web 1.3B (0-shot) | 76.1 | No | Textbooks Are All You Need II: phi-1.5 technical... | 2023-09-11 | Yes |
| 22 | BLOOM 176B (1-shot) | 75.93 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Yes |
| 23 | ST-MoE-L 4.1B (fine-tuned) | 75.4 | No | ST-MoE: Designing Stable and Transferable Sparse... | 2022-02-17 | Yes |
| 24 | GLaM 64B/64E (5-shot) | 74.8 | No | GLaM: Efficient Scaling of Language Models with ... | 2021-12-13 | - |
| 25 | LLaMA 13B (0-shot) | 74.8 | No | LLaMA: Open and Efficient Foundation Language Mo... | 2023-02-27 | Yes |
| 26 | BloombergGPT 50B (1-shot) | 73.99 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Yes |
| 27 | LLaMA 7B (0-shot) | 72.8 | No | LLaMA: Open and Efficient Foundation Language Mo... | 2023-02-27 | Yes |
| 28 | Pythia 12B (5-shot) | 71.5 | No | Pythia: A Suite for Analyzing Large Language Mod... | 2023-04-03 | Yes |
| 29 | OPT 66B (1-shot) | 71.25 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Yes |
| 30 | GPT-3 175B (1-shot) | 71.2 | No | Language Models are Few-Shot Learners | 2020-05-28 | Yes |
| 31 | OPT-175B | 71.04 | No | SparseGPT: Massive Language Models Can Be Accura... | 2023-01-02 | Yes |
| 32 | GPT-NeoX 20B (1-shot) | 70.79 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Yes |
| 33 | Pythia 12B (0-shot) | 70.2 | No | Pythia: A Suite for Analyzing Large Language Mod... | 2023-04-03 | Yes |
| 34 | UL2 20B (chain-of-thought + self-consistency) | 69.8 | No | UL2: Unifying Language Learning Paradigms | 2022-05-10 | Yes |
| 35 | Mamba-2.8B (0-shot) | 69.7 | No | Mamba: Linear-Time Sequence Modeling with Select... | 2023-12-01 | Yes |
| 36 | SparseGPT 175B (50% sparsity) | 69.65 | No | SparseGPT: Massive Language Models Can Be Accura... | 2023-01-02 | Yes |
| 37 | GPT-3 (0-shot) | 68.8 | No | Galactica: A Large Language Model for Science | 2022-11-16 | Yes |
| 38 | GPT-3 175B (0-shot) | 68.8 | No | Language Models are Few-Shot Learners | 2020-05-28 | Yes |
| 39 | SparseGPT 175B (4:8 sparsity) | 68.35 | No | SparseGPT: Massive Language Models Can Be Accura... | 2023-01-02 | Yes |
| 40 | GLaM 64B/64E (0-shot) | 68.0 | No | GLaM: Efficient Scaling of Language Models with ... | 2021-12-13 | - |
| 41 | SparseGPT 175B (2:4 sparsity) | 67.08 | No | SparseGPT: Massive Language Models Can Be Accura... | 2023-01-02 | Yes |
| 42 | LLaMA 7B + CFG (0-shot) | 58.9 | No | Stay on topic with Classifier-Free Guidance | 2023-06-30 | - |
| 43 | BLOOM (5-shot) | 40.7 | No | Galactica: A Large Language Model for Science | 2022-11-16 | Yes |
| 44 | UL2 20B (chain-of-thought) | 38.4 | No | UL2: Unifying Language Learning Paradigms | 2022-05-10 | Yes |
| 45 | OPT (5-shot) | 37.4 | No | Galactica: A Large Language Model for Science | 2022-11-16 | Yes |
| 46 | UL2 20B (0-shot) | 32.2 | No | UL2: Unifying Language Learning Paradigms | 2022-05-10 | Yes |
| 47 | OPT 175B (50% sparsity) | 28.03 | No | SparseGPT: Massive Language Models Can Be Accura... | 2023-01-02 | Yes |