Automated Theorem Proving on miniF2F-test

Metric: Pass@1 (higher is better)
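
The page does not define Pass@1, so the following is only a minimal sketch under the common convention: the percentage of benchmark theorems for which a single sampled proof attempt is accepted by the proof checker. When several independent attempts are sampled per theorem, the usual estimate for k = 1 is the mean of the per-theorem success rates. The function name `pass_at_1` and the sample data are hypothetical, and individual entries on this leaderboard may follow different evaluation protocols.

```python
from typing import Sequence, Tuple


def pass_at_1(results: Sequence[Tuple[int, int]]) -> float:
    """Estimate Pass@1 as a percentage.

    Each item is (n_samples, n_verified) for one theorem: how many proof
    attempts were sampled and how many of them the proof checker accepted.
    For k = 1 the per-theorem estimate is simply n_verified / n_samples.
    """
    if not results:
        raise ValueError("need at least one theorem")
    rates = []
    for n_samples, n_verified in results:
        if n_samples <= 0 or not 0 <= n_verified <= n_samples:
            raise ValueError("invalid counts for a theorem")
        rates.append(n_verified / n_samples)
    return 100.0 * sum(rates) / len(rates)


# Hypothetical outcomes for four theorems, four attempts each.
example = [(4, 4), (4, 1), (4, 0), (4, 3)]
print(f"Pass@1 = {pass_at_1(example):.1f}")
```

With exactly one attempt per theorem this reduces to the plain fraction of theorems proved, expressed as a percentage.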

Results

#	Model	Pass@1	Extra Data	SOTA	Paper	Date	Code
1	Kimina-Prover-Preview	52.94	Yes	Yes	Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning	2025-04-15	Code
2	ProofAug	36.5	No	Yes	Efficient Neural Theorem Proving via Fine-grained Proof Structure Analysis	2025-01-30	Code
3	Thor + expert iteration on autoformalised theorems	35.2	Yes	No	-	-	-
4	COPRA + GPT-4-turbo	30.7	No	Yes	An In-Context Learning Agent for Formal Theorem-Proving	2023-10-06	Code
5	DeepSeek-Prover	30	Yes	No	DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data	2024-05-23	-
6	Thor	29.9	No	Yes	Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers	2022-05-22	-
7	Lean Expert Iteration	29.6	Yes	Yes	Formal Mathematics Statement Curriculum Learning	2022-02-03	Code
8	MMOS-DeepSeekMath-7B	28.3	No	No	An Empirical Study of Data Ability Boundary in LLMs' Math Reasoning	2024-02-23	Code
9	Lean GPT-f	24.6	No	No	MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics	2021-08-31	Code
10	PACT (reproduced by Thor)	24.6	No	Yes	Proof Artifact Co-training for Theorem Proving with Language Models	2021-02-11	Code
11	COPRA + GPT-4	23.3	No	No	An In-Context Learning Agent for Formal Theorem-Proving	2023-10-06	Code
12	Sledgehammer + heuristics	20.9	No	No	Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs	2022-10-21	Code
13	Lean tidy	18	No	No	MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics	2021-08-31	Code
14	COPRA + GPT-3.5	11.9	No	No	An In-Context Learning Agent for Formal Theorem-Proving	2023-10-06	Code
15	Sledgehammer	10.4	No	No	Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers	2022-05-22	-
16	Metamath GPT-f	1.3	No	No	MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics	2021-08-31	Code