GPT-3 (175B)

Reported on 4 benchmarks across 4 tasks

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Knowledge Base (2 results)

Mathematical Question Answering on MAWPS
Accuracy (%)
19.8
best: 95.7 (OpenMath-CodeLlama-70B (w/ code))
Mathematical Reasoning on MAWPS
Accuracy (%)
19.8
best: 95.7 (OpenMath-CodeLlama-70B (w/ code))

Natural Language Processing (1 result)

Question Answering on MAWPS
Accuracy (%)
19.8
best: 95.7 (OpenMath-CodeLlama-70B (w/ code))

Reasoning (1 result)

Math Word Problem Solving on MAWPS
Accuracy (%)
19.8
best: 95.7 (OpenMath-CodeLlama-70B (w/ code))