DeepSeek-r1

Reported on 1 benchmark across 1 task · 1 paper · 1 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Knowledge Base · 1 result

Mathematical Reasoning on AIME24
Acc · 2025-01-22
79.8
best: 94.4 (Xolver)
SOTA
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning · arXiv:2501.12948