Code Generation on LiveCodeBench

Metric: Acc (accuracy; higher is better)

#  Model         Acc   Extra Data  Paper                                                Date        Code
1  Xolver        91.6  No          Xolver: Multi-Agent Reasoning with Holistic Expe...  2025-06-17  Code
2  LPW (GPT-4o)  59.3  No          Planning-Driven Programming: A Large Language Mo...  2024-11-21  Code
3  Search-o1     33    Yes         Search-o1: Agentic Search-Enhanced Large Reasoni...  2025-01-09  Code