GSM8K on GSM8K
Metric: Accuracy (higher is better)
Results

| # | Model | Accuracy | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | Xolver | 98.1 | No | Xolver: Multi-Agent Reasoning with Holistic Expe... | 2025-06-17 | Code |
| 2 | AlphaLLM (with MCTS) | 92 | No | Toward Self-Improvement of LLMs via Imagination,... | 2024-04-18 | Code |