Math Word Problem Solving on SVAMP
Metric: Execution Accuracy (higher is better)
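As a rough illustration of the metric, the sketch below assumes that execution accuracy is the percentage of problems for which the model's predicted arithmetic expression, once evaluated, matches the gold numeric answer. That reading, the helper names `evaluate` and `execution_accuracy`, the tolerance, and the toy inputs are all assumptions made for illustration; they are not taken from this page or from any official SVAMP evaluation script.

```python
import ast
import operator

# Hypothetical helpers for illustration only; not an official SVAMP scorer.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expr: str) -> float:
    """Evaluate an arithmetic expression using only + - * / and parentheses."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return float(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError(f"unsupported expression: {expr!r}")

    return walk(ast.parse(expr, mode="eval"))


def execution_accuracy(predictions, answers, tol=1e-4):
    """Percentage of predicted expressions that evaluate to the gold answer."""
    if not answers or len(predictions) != len(answers):
        raise ValueError("predictions and answers must be non-empty and aligned")
    correct = 0
    for expr, gold in zip(predictions, answers):
        try:
            value = evaluate(expr)
        except (ValueError, SyntaxError, ZeroDivisionError):
            continue  # expressions that cannot be executed count as wrong
        if abs(value - gold) <= tol:
            correct += 1
    return 100.0 * correct / len(answers)


# Toy, made-up inputs: two expressions evaluate correctly, two fail to execute.
score = execution_accuracy(["8 - 3", "2 * (4 + 1)", "7 / 0", "1 +"], [5, 10, 3, 1])
print(score)
```

Scores are expressed on a 0 to 100 scale here to match the values in the results table. Systems that emit free-form text or full programs rather than a single expression would need a different answer-extraction step.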
Results
| # | Model | Execution Accuracy | Extra Data | Paper | Date | Code |
|---|-------|--------------------|------------|-------|------|------|
| 1 | GPT-4 (Teaching-Inspired) | 93.9 | No | Teaching-Inspired Integrated Prompting Framework... | 2024-10-10 | Code |
| 2 | GPT-4 (Model Selection) | 93.7 | No | Automatic Model Selection with Large Language Mo... | 2023-05-23 | Code |
| 3 | Qwen2(CoT + Code Interpreter) | 92.3 | No | - | - | - |
| 4 | GPT-4 (PHP) | 91.9 | No | Progressive-Hint Prompting Improves Reasoning in... | 2023-04-19 | Code |
| 5 | OpenMath-CodeLlama-70B (w/ code) | 87.8 | Yes | OpenMathInstruct-1: A 1.8 Million Math Instructi... | 2024-02-15 | Code |
| 6 | MathCoder-L-70B | 84.9 | Yes | MathCoder: Seamless Code Integration in LLMs for... | 2023-10-05 | Code |
| 7 | PoT_Eng (self-consistency @ 5) | 83.7 | No | - | - | Code |
| 8 | CoT_Eng (self-consistency @ 5) | 82.5 | No | - | - | Code |
| 9 | MMOS-CODE-34B(0-shot) | 80.6 | Yes | An Empirical Study of Data Ability Boundary in L... | 2024-02-23 | Code |
| 10 | MMOS-DeepSeekMath-7B(0-shot) | 79.3 | Yes | An Empirical Study of Data Ability Boundary in L... | 2024-02-23 | Code |
| 11 | MMOS-CODE-7B(0-shot) | 76.4 | Yes | An Empirical Study of Data Ability Boundary in L... | 2024-02-23 | Code |
| 12 | LLaMA 2-Chat | 69.2 | No | Llama 2: Open Foundation and Fine-Tuned Chat Mod... | 2023-07-18 | Code |
| 13 | DeBERTa | 63.5 | No | Math Word Problem Solving by Generating Linguist... | 2023-06-24 | Code |
| 14 | PaLM (zero-shot, CoT) | 62.1 | No | Large Language Models are Zero-Shot Reasoners | 2022-05-24 | Code |
| 15 | PaLM (zero-shot) | 58.8 | No | Large Language Models are Zero-Shot Reasoners | 2022-05-24 | Code |
| 16 | SYRELM (Vicuna 13B) | 56.65 | Yes | Frugal LMs Trained to Invoke Symbolic Solvers Ac... | 2023-12-09 | Code |
| 17 | ATHENA (roberta-large) | 54.8 | No | ATHENA: Mathematical Reasoning with Thought Expa... | 2023-11-02 | Code |
| 18 | MsAT-DeductReasoner | 48.9 | No | Learning Multi-Step Reasoning by Solving Arithme... | 2023-06-02 | Code |
| 19 | Roberta-DeductReasoner | 47.3 | No | Learning to Reason Deductively: Math Word Proble... | 2022-03-19 | Code |
| 20 | ATHENA (roberta-base) | 45.6 | No | ATHENA: Mathematical Reasoning with Thought Expa... | 2023-11-02 | Code |
| 21 | Graph2Tree with RoBERTa | 43.8 | Yes | Are NLP Models really able to Solve Simple Math ... | 2021-03-12 | Code |
| 22 | GTS with RoBERTa | 41 | Yes | Are NLP Models really able to Solve Simple Math ... | 2021-03-12 | Code |
| 23 | LSTM Seq2Seq with RoBERTa | 40.3 | Yes | Are NLP Models really able to Solve Simple Math ... | 2021-03-12 | Code |
| 24 | SYRELM (GPT-J) | 40.1 | Yes | Frugal LMs Trained to Invoke Symbolic Solvers Ac... | 2023-12-09 | Code |
| 25 | Transformer with RoBERTa | 38.9 | Yes | Are NLP Models really able to Solve Simple Math ... | 2021-03-12 | Code |