EPT-X

Reported on 16 benchmarks across 4 tasks

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Knowledge Base: 8 results

Mathematical Question Answering on PEN
Accuracy (%): 69.59
Mathematical Question Answering on MAWPS
Accuracy (%): 84.57
best: 95.7 (OpenMath-CodeLlama-70B (w/ code))
Mathematical Question Answering on ALG514
Accuracy (%): 67.07
best: 83 (MixedSP)
Mathematical Question Answering on DRAW-1K
Accuracy (%): 56
best: 63.5 (EPT)
Mathematical Reasoning on PEN
Accuracy (%): 69.59
Mathematical Reasoning on MAWPS
Accuracy (%): 84.57
best: 95.7 (OpenMath-CodeLlama-70B (w/ code))
Mathematical Reasoning on ALG514
Accuracy (%): 67.07
best: 83 (MixedSP)
Mathematical Reasoning on DRAW-1K
Accuracy (%): 56
best: 63.5 (EPT)

Question Answering on PEN
Accuracy (%): 69.59
Question Answering on MAWPS
Accuracy (%): 84.57
best: 95.7 (OpenMath-CodeLlama-70B (w/ code))
Question Answering on ALG514
Accuracy (%): 67.07
best: 83 (MixedSP)
Question Answering on DRAW-1K
Accuracy (%): 56
best: 63.5 (EPT)

Math Word Problem Solving on PEN
Accuracy (%): 69.59
Math Word Problem Solving on MAWPS
Accuracy (%): 84.57
best: 95.7 (OpenMath-CodeLlama-70B (w/ code))
Math Word Problem Solving on ALG514
Accuracy (%): 67.07
best: 83 (MixedSP)
Math Word Problem Solving on DRAW-1K
Accuracy (%): 56
best: 63.5 (EPT)