Mathematical Reasoning on MAWPS
Metric: Accuracy (%) (higher is better)
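The page does not say how accuracy is scored. A minimal sketch, assuming the common convention for math word problems (a prediction counts as correct when its final numeric answer matches the gold answer within a small tolerance), is shown here. The function name `accuracy_percent`, the tolerance values, and the sample data are illustrative assumptions, not taken from any of the listed papers.

```python
import math


def accuracy_percent(predicted, gold, rel_tol=1e-4):
    """Return the percentage of problems answered correctly.

    Assumption: an answer is correct if it is numerically close to the
    gold answer. A prediction of None (no parsable answer) counts as wrong.
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot score an empty evaluation set")
    correct = sum(
        p is not None and math.isclose(p, g, rel_tol=rel_tol, abs_tol=1e-9)
        for p, g in zip(predicted, gold)
    )
    return 100.0 * correct / len(gold)


# Hypothetical toy data: two of four answers match the gold values.
gold = [8.0, 12.5, 3.0, 40.0]
predicted = [8.0, 12.5, 4.0, None]
score = accuracy_percent(predicted, gold)
print(f"Accuracy: {score:.1f}%")
```

Individual papers may score differently (for example, by equation match rather than answer match), so numbers in the table are only comparable to the extent that their evaluation protocols agree.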
Results
| # | Model | Accuracy (%) | Extra Data | SOTA | Paper | Date | Code |
|---|-------|--------------|------------|------|-------|------|------|
| 1 | OpenMath-CodeLlama-70B (w/ code) | 95.7 | Yes | SOTA | OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset | 2024-02-15 | Code |
| 2 | MsAT-DeductReasoner | 94.3 | No | SOTA | Learning Multi-Step Reasoning by Solving Arithmetic Tasks | 2023-06-02 | Code |
| 3 | ATHENA (roberta-large) | 93 | No | - | ATHENA: Mathematical Reasoning with Thought Expansion | 2023-11-02 | Code |
| 4 | Multi-view | 92.3 | Yes | SOTA | Multi-View Reasoning: Consistent Contrastive Learning for Math Word Problem | 2022-10-21 | Code |
| 5 | Exp-Tree | 92.3 | No | - | An Expression Tree Decoding Strategy for Mathematical Equation Generation | 2023-10-14 | Code |
| 6 | ATHENA (roberta-base) | 92.2 | No | - | ATHENA: Mathematical Reasoning with Thought Expansion | 2023-11-02 | Code |
| 7 | Roberta-DeductReasoner | 92 | No | SOTA | Learning to Reason Deductively: Math Word Problem Solving as Complex Relation Extraction | 2022-03-19 | Code |
| 8 | DeBERTa (PM + VM) | 91 | Yes | - | Math Word Problem Solving by Generating Linguistic Variants of Problem Statements | 2023-06-24 | Code |
| 9 | EPT | 88.7 | No | - | No paper | - | Code |
| 10 | Graph2Tree with RoBERTa | 88.7 | No | SOTA | Are NLP Models really able to Solve Simple Math Word Problems? | 2021-03-12 | Code |
| 11 | GTS with RoBERTa | 88.5 | No | - | Are NLP Models really able to Solve Simple Math Word Problems? | 2021-03-12 | Code |
| 12 | GEO | 85.1 | No | - | No paper | - | - |
| 13 | EPT-X | 84.57 | No | - | No paper | - | Code |
| 14 | EPT | 84.51 | No | - | No paper | - | Code |
| 15 | Graph2Tree | 83.7 | No | - | No paper | - | Code |
| 16 | LLaMA 2-Chat | 82.4 | No | - | Llama 2: Open Foundation and Fine-Tuned Chat Models | 2023-07-18 | Code |
| 17 | GPT-3.5 turbo (175B) | 80.3 | No | - | Math Word Problem Solving by Generating Linguistic Variants of Problem Statements | 2023-06-24 | Code |
| 18 | Toolformer | 44 | No | - | No paper | - | - |
| 19 | GPT-3 (175B) | 19.8 | No | - | No paper | - | - |
| 20 | Toolformer (disabled) | 15 | No | - | No paper | - | - |
| 21 | GPT-J | 9.9 | No | - | Math Word Problem Solving by Generating Linguistic Variants of Problem Statements | 2023-06-24 | Code |
| 22 | GPT-J + CC | 9.3 | No | - | No paper | - | - |
| 23 | OPT (66B) | 7.9 | No | - | No paper | - | - |
| 24 | GPT-3 text-curie-001 (13B) | 4.09 | No | - | Math Word Problem Solving by Generating Linguistic Variants of Problem Statements | 2023-06-24 | Code |
| 25 | GPT-3 text-babbage-001 (6.7B) | 2.76 | No | - | Math Word Problem Solving by Generating Linguistic Variants of Problem Statements | 2023-06-24 | Code |