Mathematical Proofs on miniF2F-test
Metric: cumulative (higher is better)
Results
| # | Model | cumulative | SOTA | Extra Data | Paper | Date | Code |
|---|-------|------------|------|------------|-------|------|------|
| 1 | Kimina-Prover-Preview | 80.74 | SOTA | Yes | Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning | 2025-04-15 | Yes |
| 2 | ProofAug | 66 | SOTA | No | Efficient Neural Theorem Proving via Fine-grained Proof Structure Analysis | 2025-01-30 | Yes |
| 3 | DeepSeek-Prover-V1.5 | 63.5 | SOTA | Yes | DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search | 2024-08-15 | Yes |
| 4 | Subgoal-XL | 56.1 | - | Yes | SubgoalXL: Subgoal-based Expert Learning for Theorem Proving | 2024-08-20 | Yes |
| 5 | DeepSeek-Prover | 52 | SOTA | Yes | DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data | 2024-05-23 | - |
| 6 | Lyra + GPT-4 | 47.1 | SOTA | No | Lyra: Orchestrating Dual Correction in Automated Theorem Proving | 2023-09-27 | Yes |
| 7 | LEGO-Prover ChatGPT | 47.1 | - | No | LEGO-Prover: Neural Theorem Proving with Growing Libraries | 2023-10-01 | Yes |
| 8 | Decomposing the Enigma | 45.5 | SOTA | No | Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving | 2023-05-25 | Yes |
| 9 | Evariste | 41 | SOTA | Yes | HyperTree Proof Search for Neural Theorem Proving | 2022-05-23 | - |
| 10 | Evariste-7d | 40.6 | - | No | HyperTree Proof Search for Neural Theorem Proving | 2022-05-23 | - |
| 11 | Evariste-1d | 38.9 | - | No | HyperTree Proof Search for Neural Theorem Proving | 2022-05-23 | - |
| 12 | DSP (540B Minerva informal) | 38.9 | - | No | Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs | 2022-10-21 | Yes |
| 13 | Lean Expert Iteration | 36.6 | SOTA | Yes | Formal Mathematics Statement Curriculum Learning | 2022-02-03 | Yes |
| 14 | GPT-f | 36.6 | - | No | HyperTree Proof Search for Neural Theorem Proving | 2022-05-23 | - |
| 15 | Thor + expert iteration on autoformalised theorems | 35.2 | - | Yes | - | - | - |
| 16 | COPRA + GPT-4-turbo | 30.7 | - | No | An In-Context Learning Agent for Formal Theorem-Proving | 2023-10-06 | Yes |
| 17 | Thor | 29.9 | - | No | Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers | 2022-05-22 | - |
| 18 | Lean GPT-f | 29.2 | SOTA | No | MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics | 2021-08-31 | Yes |
| 19 | MMOS-DeepSeekMath-7B | 28.3 | - | No | An Empirical Study of Data Ability Boundary in LLMs' Math Reasoning | 2024-02-23 | Yes |
| 20 | ReProver | 26.5 | - | No | - | - | - |
| 21 | LLEMMA-7b | 26.2 | - | No | Llemma: An Open Language Model For Mathematics | 2023-10-16 | Yes |
| 22 | LLEMMA-34b | 25.8 | - | No | Llemma: An Open Language Model For Mathematics | 2023-10-16 | Yes |
| 23 | PACT (reproduced by Thor) | 24.6 | SOTA | No | Proof Artifact Co-training for Theorem Proving with Language Models | 2021-02-11 | Yes |
| 24 | COPRA + GPT-4 | 23.3 | - | No | An In-Context Learning Agent for Formal Theorem-Proving | 2023-10-06 | Yes |
| 25 | Sledgehammer + heuristics | 20.9 | - | No | Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs | 2022-10-21 | Yes |
| 26 | Lean tidy | 18 | - | No | MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics | 2021-08-31 | Yes |
| 27 | COPRA + GPT-3.5 | 11.9 | - | No | An In-Context Learning Agent for Formal Theorem-Proving | 2023-10-06 | Yes |
| 28 | Sledgehammer | 10.4 | - | No | Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers | 2022-05-22 | - |
| 29 | Metamath GPT-f | 1.6 | - | No | MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics | 2021-08-31 | Yes |
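The SOTA marker on some entries above is not explained on the page. It appears consistent with a running best over time: an entry is marked when its cumulative score is strictly higher than every score dated earlier. The sketch below checks that reading against the dated entries. The rule itself, the tie-breaking choice (within one date, highest score first), and the names `ENTRIES` and `running_sota` are assumptions made for illustration, not something the leaderboard documents. The two entries without a date are left out.

```python
from datetime import date

# (model, cumulative score, date) copied from the dated leaderboard entries.
ENTRIES = [
    ("Kimina-Prover-Preview", 80.74, "2025-04-15"),
    ("ProofAug", 66, "2025-01-30"),
    ("DeepSeek-Prover-V1.5", 63.5, "2024-08-15"),
    ("Subgoal-XL", 56.1, "2024-08-20"),
    ("DeepSeek-Prover", 52, "2024-05-23"),
    ("Lyra + GPT-4", 47.1, "2023-09-27"),
    ("LEGO-Prover ChatGPT", 47.1, "2023-10-01"),
    ("Decomposing the Enigma", 45.5, "2023-05-25"),
    ("Evariste", 41, "2022-05-23"),
    ("Evariste-7d", 40.6, "2022-05-23"),
    ("Evariste-1d", 38.9, "2022-05-23"),
    ("DSP (540B Minerva informal)", 38.9, "2022-10-21"),
    ("Lean Expert Iteration", 36.6, "2022-02-03"),
    ("GPT-f", 36.6, "2022-05-23"),
    ("COPRA + GPT-4-turbo", 30.7, "2023-10-06"),
    ("Thor", 29.9, "2022-05-22"),
    ("Lean GPT-f", 29.2, "2021-08-31"),
    ("MMOS-DeepSeekMath-7B", 28.3, "2024-02-23"),
    ("LLEMMA-7b", 26.2, "2023-10-16"),
    ("LLEMMA-34b", 25.8, "2023-10-16"),
    ("PACT (reproduced by Thor)", 24.6, "2021-02-11"),
    ("COPRA + GPT-4", 23.3, "2023-10-06"),
    ("Sledgehammer + heuristics", 20.9, "2022-10-21"),
    ("Lean tidy", 18, "2021-08-31"),
    ("COPRA + GPT-3.5", 11.9, "2023-10-06"),
    ("Sledgehammer", 10.4, "2022-05-22"),
    ("Metamath GPT-f", 1.6, "2021-08-31"),
]


def running_sota(entries):
    """Return, oldest first, the models that strictly beat the best score so far.

    Entries are visited by date; within one date the highest score goes first
    (an assumed tie-break), so only the top entry of a date can set a new best.
    """
    ordered = sorted(entries, key=lambda e: (date.fromisoformat(e[2]), -e[1]))
    best = float("-inf")
    frontier = []
    for model, score, _day in ordered:
        if score > best:  # strict: a tie with the current best does not count
            frontier.append(model)
            best = score
    return frontier


if __name__ == "__main__":
    for model in running_sota(ENTRIES):
        print(model)
```

Run as a script, this prints ten models, from PACT (reproduced by Thor) through Kimina-Prover-Preview, which are the same ten entries that carry the SOTA marker. LEGO-Prover ChatGPT is excluded because it only ties the 47.1 set four days earlier by Lyra + GPT-4.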