Metric: Accuracy (higher is better)
| # | Model | Accuracy (%) | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | GPT-4 DUP | 94.2 | No | Achieving >97% on GSM8K: Deeply Understanding th... | 2024-04-23 | Code |
| 2 | DeBERTa | 63.5 | No | Math Word Problem Solving by Generating Linguist... | 2023-06-24 | Code |
| 3 | Graph2Tree with RoBERTa | 43.8 | Yes | Are NLP Models really able to Solve Simple Math ... | 2021-03-12 | Code |
| 4 | GTS with RoBERTa | 41.0 | Yes | Are NLP Models really able to Solve Simple Math ... | 2021-03-12 | Code |
| 5 | LSTM Seq2Seq with RoBERTa | 40.3 | Yes | Are NLP Models really able to Solve Simple Math ... | 2021-03-12 | Code |
| 6 | Transformer with RoBERTa | 38.9 | Yes | Are NLP Models really able to Solve Simple Math ... | 2021-03-12 | Code |