Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Mathematical Reasoning on AIME24

Metric: Acc (higher is better)
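The page does not say how Acc is computed. A minimal sketch, assuming it is the percentage of problems whose final answer exactly matches the reference answer; the function name `accuracy_percent` and the toy answers are hypothetical and not taken from any listed paper:

```python
def accuracy_percent(predictions, references):
    """Percentage of problems whose predicted answer equals the reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one problem is required")
    # Exact match on the final answer is an assumption; real evaluations
    # may normalize answers or average over several sampled runs.
    correct = sum(p == r for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)


# Hypothetical answers for a 4-problem toy set (not real AIME data).
refs = [204, 25, 809, 116]
preds = [204, 25, 73, 116]
score = accuracy_percent(preds, refs)  # 3 of 4 answers match
print(round(score, 1))
```

Under the same assumption, and taking AIME24 to be the 30 problems of the 2024 AIME I and II, 17 correct answers would give 56.7. Scores that do not correspond to a whole number of problems out of 30 are presumably averaged over multiple runs, which this sketch does not model.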

Results

# | Model                | Acc  | Extra Data | Paper                                                | Date       | Code
1 | Xolver               | 94.4 | No         | Xolver: Multi-Agent Reasoning with Holistic Expe...  | 2025-06-17 | Code
2 | DeepSeek-r1          | 79.8 | No         | DeepSeek-R1: Incentivizing Reasoning Capability ...  | 2025-01-22 | Code
3 | Openai-o1            | 74.4 | No         | -                                                    | -          | -
4 | Openai-o1-mini       | 70   | No         | -                                                    | -          | -
5 | Search-o1            | 56.7 | No         | Search-o1: Agentic Search-Enhanced Large Reasoni...  | 2025-01-09 | Code
6 | s1-32B               | 56.7 | No         | s1: Simple test-time scaling                         | 2025-01-31 | Code
7 | Openai-o1-preview    | 44.6 | No         | -                                                    | -          | -
8 | Qwen2.5-72B-Instruct | 23.3 | No         | Qwen2.5 Technical Report                             | 2024-12-19 | Code
9 | Claude3.5-Sonnet     | 16   | No         | -                                                    | -          | -