General Knowledge on TheoremQA
Metric: Accuracy (higher is better)
Results
| # | Model | Accuracy | Extra Data | Paper | Date |
|---|---|---|---|---|---|
| 1 | GPT-4 (PoT) | 52.4 | No | TheoremQA: A Theorem-driven Question Answering d... | 2023-05-21 |
| 2 | GPT-4 (CoT) | 43.8 | No | TheoremQA: A Theorem-driven Question Answering d... | 2023-05-21 |
| 3 | GPT-3.5-turbo (PoT) | 35.6 | No | TheoremQA: A Theorem-driven Question Answering d... | 2023-05-21 |
| 4 | DART-Math-DSMath-7B-Uniform (0-shot CoT, w/o code) | 32.5 | Yes | DART-Math: Difficulty-Aware Rejection Tuning for... | 2024-06-18 |
| 5 | DART-Math-DSMath-7B-Prop2Diff (0-shot CoT, w/o code) | 32.2 | Yes | DART-Math: Difficulty-Aware Rejection Tuning for... | 2024-06-18 |
| 6 | PaLM-2-unicorn (CoT) | 31.8 | No | TheoremQA: A Theorem-driven Question Answering d... | 2023-05-21 |
| 7 | GPT-3.5-turbo (CoT) | 30.2 | No | TheoremQA: A Theorem-driven Question Answering d... | 2023-05-21 |
| 8 | DART-Math-Llama3-70B-Prop2Diff (0-shot CoT, w/o code) | 28.2 | Yes | DART-Math: Difficulty-Aware Rejection Tuning for... | 2024-06-18 |
| 9 | DART-Math-Llama3-70B-Uniform (0-shot CoT, w/o code) | 27.4 | Yes | DART-Math: Difficulty-Aware Rejection Tuning for... | 2024-06-18 |
| 10 | Claude-v1 (PoT) | 25.9 | No | TheoremQA: A Theorem-driven Question Answering d... | 2023-05-21 |
| 11 | Claude-v1 (CoT) | 24.9 | No | TheoremQA: A Theorem-driven Question Answering d... | 2023-05-21 |
| 12 | code-davinci-002 | 23.9 | No | TheoremQA: A Theorem-driven Question Answering d... | 2023-05-21 |
| 13 | Claude-instant (CoT) | 23.6 | No | TheoremQA: A Theorem-driven Question Answering d... | 2023-05-21 |
| 14 | text-davinci-003 | 22.8 | No | TheoremQA: A Theorem-driven Question Answering d... | 2023-05-21 |
| 15 | PaLM-2-bison (CoT) | 21 | No | TheoremQA: A Theorem-driven Question Answering d... | 2023-05-21 |
| 16 | DART-Math-Llama3-8B-Prop2Diff (0-shot CoT, w/o code) | 19.4 | Yes | DART-Math: Difficulty-Aware Rejection Tuning for... | 2024-06-18 |
| 17 | DART-Math-Mistral-7B-Prop2Diff (0-shot CoT, w/o code) | 17 | Yes | DART-Math: Difficulty-Aware Rejection Tuning for... | 2024-06-18 |
| 18 | DART-Math-Mistral-7B-Uniform (0-shot CoT, w/o code) | 16.4 | Yes | DART-Math: Difficulty-Aware Rejection Tuning for... | 2024-06-18 |
| 19 | DART-Math-Llama3-8B-Uniform (0-shot CoT, w/o code) | 15.4 | Yes | DART-Math: Difficulty-Aware Rejection Tuning for... | 2024-06-18 |