Multi-Task Learning on MGSM
Metric: Average (%) (higher is better)
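The page does not say how the Average (%) figure is aggregated. As a minimal sketch, assuming it is the unweighted mean of per-language accuracies on MGSM, the computation would look roughly like the following. The function name `macro_average` and the example scores are hypothetical and are not taken from any of the listed papers.

```python
from statistics import mean


def macro_average(per_language_accuracy: dict[str, float]) -> float:
    """Unweighted mean of per-language accuracies (in percent), rounded to one decimal.

    Assumption: every language counts equally, regardless of how many
    problems it contributes. The leaderboard itself does not confirm this.
    """
    if not per_language_accuracy:
        raise ValueError("need at least one language score")
    return round(mean(per_language_accuracy.values()), 1)


# Hypothetical per-language accuracies (%), for illustration only.
example_scores = {"en": 60.0, "de": 50.0, "sw": 40.0}
print(macro_average(example_scores))
```

If the reported figure were instead weighted by the number of problems per language, the mean would need per-language counts as well; the sketch above deliberately ignores that case.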
Results
| # | Model | Average (%) | Augmentations | Paper | Date | Code |
|---|-------|-------------|---------------|-------|------|------|
| 1 | PaLM 2 (few-shot, k=8, SC) | 87 | No | PaLM 2 Technical Report | 2023-05-17 | Code |
| 2 | PaLM 2 (8-shot, CoT) | 72.2 | No | PaLM 2 Technical Report | 2023-05-17 | Code |
| 3 | Flan-PaLM 540B (8-shot, fine-tuned, CoT + SC) | 72 | No | Scaling Instruction-Finetuned Language Models | 2022-10-20 | Code |
| 4 | Flan-U-PaLM 540B (CoT) | 60.4 | No | Scaling Instruction-Finetuned Language Models | 2022-10-20 | Code |
| 5 | Flan-PaLM 540B (8-shot, fine-tuned, CoT) | 57 | No | Scaling Instruction-Finetuned Language Models | 2022-10-20 | Code |
| 6 | PaLM 540B | 55 | No | PaLM: Scaling Language Modeling with Pathways | 2022-04-05 | Code |
| 7 | U-PaLM 540B (CoT) | 49.9 | No | Transcending Scaling Laws with 0.1% Extra Compute | 2022-10-20 | - |
| 8 | text-davinci-003 | 36 | No | Scaling Instruction-Finetuned Language Models | 2022-10-20 | Code |
| 9 | code-davinci-002 | 35 | No | Scaling Instruction-Finetuned Language Models | 2022-10-20 | Code |
| 10 | text-davinci-002 | 23.7 | No | Scaling Instruction-Finetuned Language Models | 2022-10-20 | Code |
| 11 | Flan-PaLM 540B (8-shot, fine-tuned) | 21.2 | No | Scaling Instruction-Finetuned Language Models | 2022-10-20 | Code |
| 12 | GPT-3 Davinci 175B | 5.7 | No | Scaling Instruction-Finetuned Language Models | 2022-10-20 | Code |
Entries flagged SOTA on the leaderboard: PaLM 2 (few-shot, k=8, SC), 87 (2023-05-17); Flan-PaLM 540B (8-shot, fine-tuned, CoT + SC), 72 (2022-10-20); PaLM 540B, 55 (2022-04-05).