TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

SotA/Miscellaneous/Automated Theorem Proving/miniF2F-test

Automated Theorem Proving on miniF2F-test

Metric: cumulative (higher is better)

LeaderboardDataset
Loading chart...

Results

Submit a result
#Model↕cumulative▼Extra DataPaperDate↕Code
1Kimina-Prover-Preview80.74YesKimina-Prover Preview: Towards Large Formal Reas...2025-04-15Code
2ProofAug66NoEfficient Neural Theorem Proving via Fine-graine...2025-01-30Code
3DeepSeek-Prover-V1.563.5YesDeepSeek-Prover-V1.5: Harnessing Proof Assistant...2024-08-15Code
4Subgoal-XL56.1YesSubgoalXL: Subgoal-based Expert Learning for The...2024-08-20Code
5DeepSeek-Prover52YesDeepSeek-Prover: Advancing Theorem Proving in LL...2024-05-23-
6Lyra + GPT-447.1NoLyra: Orchestrating Dual Correction in Automated...2023-09-27Code
7LEGO-Prover ChatGPT47.1NoLEGO-Prover: Neural Theorem Proving with Growing...2023-10-01Code
8Decomposing the Enigma45.5NoDecomposing the Enigma: Subgoal-based Demonstrat...2023-05-25Code
9Evariste41YesHyperTree Proof Search for Neural Theorem Proving2022-05-23-
10Evariste-7d40.6NoHyperTree Proof Search for Neural Theorem Proving2022-05-23-
11Evariste-1d38.9NoHyperTree Proof Search for Neural Theorem Proving2022-05-23-
12DSP (540B Minerva informal)38.9NoDraft, Sketch, and Prove: Guiding Formal Theorem...2022-10-21Code
13Lean Expert Iteration36.6YesFormal Mathematics Statement Curriculum Learning2022-02-03Code
14GPT-f36.6NoHyperTree Proof Search for Neural Theorem Proving2022-05-23-
15Thor + expert iteration on autoformalised theorems35.2Yes---
16COPRA + GPT-4-turbo30.7NoAn In-Context Learning Agent for Formal Theorem-...2023-10-06Code
17Thor29.9NoThor: Wielding Hammers to Integrate Language Mod...2022-05-22-
18Lean GPT-f29.2NoMiniF2F: a cross-system benchmark for formal Oly...2021-08-31Code
19MMOS-DeepSeekMath-7B28.3NoAn Empirical Study of Data Ability Boundary in L...2024-02-23Code
20ReProver26.5No---
21LLEMMA-7b26.2NoLlemma: An Open Language Model For Mathematics2023-10-16Code
22LLEMMA-34b25.8NoLlemma: An Open Language Model For Mathematics2023-10-16Code
23PACT (reproduced by Thor)24.6NoProof Artifact Co-training for Theorem Proving w...2021-02-11Code
24COPRA + GPT-423.3NoAn In-Context Learning Agent for Formal Theorem-...2023-10-06Code
25Sledgehammer + heuristics20.9NoDraft, Sketch, and Prove: Guiding Formal Theorem...2022-10-21Code
26Lean tidy18NoMiniF2F: a cross-system benchmark for formal Oly...2021-08-31Code
27COPRA + GPT-3.511.9NoAn In-Context Learning Agent for Formal Theorem-...2023-10-06Code
28Sledgehammer10.4NoThor: Wielding Hammers to Integrate Language Mod...2022-05-22-
29Metamath GPT-f1.6NoMiniF2F: a cross-system benchmark for formal Oly...2021-08-31Code