Common Sense Reasoning on BIG-bench
Metric: Accuracy (higher is better)
Results
| # | Model | Accuracy | SOTA | Extra Data | Paper | Date | Code |
|---|-------|----------|------|------------|-------|------|------|
| 1 | Orca 2-13B | 86.86 | Yes | No | Orca 2: Teaching Small Language Models How to Reason | 2023-11-18 | - |
| 2 | Chinchilla-70B (few-shot, k=5) | 85.7 | Yes | No | Training Compute-Optimal Large Language Models | 2022-03-29 | Code |
| 3 | Orca 2-7B | 84.31 | | No | Orca 2: Teaching Small Language Models How to Reason | 2023-11-18 | - |
| 4 | Chinchilla-70B (few-shot, k=5) | 75 | | No | Training Compute-Optimal Large Language Models | 2022-03-29 | Code |
| 5 | Chinchilla-70B (few-shot, k=5) | 73 | | No | Training Compute-Optimal Large Language Models | 2022-03-29 | Code |
| 6 | Gopher-280B (few-shot, k=5) | 69.7 | Yes | No | Scaling Language Models: Methods, Analysis & Insights from Training Gopher | 2021-12-08 | Code |
| 7 | Chinchilla-70B (few-shot, k=5) | 68.8 | | No | Training Compute-Optimal Large Language Models | 2022-03-29 | Code |
| 8 | Gopher-280B (few-shot, k=5) | 68.2 | | No | Scaling Language Models: Methods, Analysis & Insights from Training Gopher | 2021-12-08 | Code |
| 9 | Chinchilla-70B (few-shot, k=5) | 67.7 | | No | Training Compute-Optimal Large Language Models | 2022-03-29 | Code |
| 10 | Gopher-280B (few-shot, k=5) | 56.8 | | No | Scaling Language Models: Methods, Analysis & Insights from Training Gopher | 2021-12-08 | Code |
| 11 | Gopher-280B (few-shot, k=5) | 52.5 | | No | Scaling Language Models: Methods, Analysis & Insights from Training Gopher | 2021-12-08 | Code |
| 12 | Gopher-280B (few-shot, k=5) | 50.9 | | No | Scaling Language Models: Methods, Analysis & Insights from Training Gopher | 2021-12-08 | Code |
| 13 | Chinchilla-70B (few-shot, k=5) | 13.1 | | No | Training Compute-Optimal Large Language Models | 2022-03-29 | Code |
| 14 | Gopher-280B (few-shot, k=5) | 11.7 | | No | Scaling Language Models: Methods, Analysis & Insights from Training Gopher | 2021-12-08 | Code |