TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/HyperTree Proof Search for Neural Theorem Proving

HyperTree Proof Search for Neural Theorem Proving

Guillaume Lample, Marie-Anne Lachaux, Thibaut Lavril, Xavier Martinet, Amaury Hayat, Gabriel Ebner, Aurélien Rodriguez, Timothée Lacroix

2022-05-23Automated Theorem Proving
PaperPDF

Abstract

We propose an online training procedure for a transformer-based automated theorem prover. Our approach leverages a new search algorithm, HyperTree Proof Search (HTPS), inspired by the recent success of AlphaZero. Our model learns from previous proof searches through online training, allowing it to generalize to domains far from the training distribution. We report detailed ablations of our pipeline's main components by studying performance on three environments of increasing complexity. In particular, we show that with HTPS alone, a model trained on annotated proofs manages to prove 65.4% of a held-out set of Metamath theorems, significantly outperforming the previous state of the art of 56.5% by GPT-f. Online training on these unproved theorems increases accuracy to 82.6%. With a similar computational budget, we improve the state of the art on the Lean-based miniF2F-curriculum dataset from 31% to 42% proving accuracy.

Results

TaskDatasetMetricValueModel
Automated Theorem ProvingMetamath set.mmPass@3272.4Evariste
Automated Theorem ProvingminiF2F-validPass@6458.6Evariste
Automated Theorem ProvingminiF2F-validPass@6447.5Evariste-7d
Automated Theorem ProvingminiF2F-validPass@6447.3GPT-f
Automated Theorem ProvingminiF2F-validPass@6446.7Evariste-1d
Automated Theorem ProvingminiF2F-curriculumPass@6442.5Evariste-7d
Automated Theorem ProvingminiF2F-curriculumPass@6433.6Evariste-1d
Automated Theorem ProvingminiF2F-curriculumPass@6432.1Evariste
Automated Theorem ProvingminiF2F-curriculumPass@6430.6GPT-f
Automated Theorem ProvingminiF2F-testPass@6441Evariste
Automated Theorem ProvingminiF2F-testcumulative41Evariste
Automated Theorem ProvingminiF2F-testPass@6440.6Evariste-7d
Automated Theorem ProvingminiF2F-testcumulative40.6Evariste-7d
Automated Theorem ProvingminiF2F-testPass@6438.9Evariste-1d
Automated Theorem ProvingminiF2F-testcumulative38.9Evariste-1d
Automated Theorem ProvingminiF2F-testPass@6436.6GPT-f
Automated Theorem ProvingminiF2F-testcumulative36.6GPT-f
Mathematical ProofsMetamath set.mmPass@3272.4Evariste
Mathematical ProofsminiF2F-validPass@6458.6Evariste
Mathematical ProofsminiF2F-validPass@6447.5Evariste-7d
Mathematical ProofsminiF2F-validPass@6447.3GPT-f
Mathematical ProofsminiF2F-validPass@6446.7Evariste-1d
Mathematical ProofsminiF2F-curriculumPass@6442.5Evariste-7d
Mathematical ProofsminiF2F-curriculumPass@6433.6Evariste-1d
Mathematical ProofsminiF2F-curriculumPass@6432.1Evariste
Mathematical ProofsminiF2F-curriculumPass@6430.6GPT-f
Mathematical ProofsminiF2F-testPass@6441Evariste
Mathematical ProofsminiF2F-testcumulative41Evariste
Mathematical ProofsminiF2F-testPass@6440.6Evariste-7d
Mathematical ProofsminiF2F-testcumulative40.6Evariste-7d
Mathematical ProofsminiF2F-testPass@6438.9Evariste-1d
Mathematical ProofsminiF2F-testcumulative38.9Evariste-1d
Mathematical ProofsminiF2F-testPass@6436.6GPT-f
Mathematical ProofsminiF2F-testcumulative36.6GPT-f

Related Papers

CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization2025-07-08Prover Agent: An Agent-based Framework for Formal Mathematical Proofs2025-06-24Towards Advanced Mathematical Reasoning for LLMs via First-Order Logic Theorem Proving2025-06-20MATP-BENCH: Can MLLM Be a Good Automated Theorem Prover for Multimodal Problems?2025-06-06Safe: Enhancing Mathematical Reasoning in Large Language Models via Retrospective Step-aware Formal Verification2025-06-05LeanExplore: A search engine for Lean 4 declarations2025-06-04Rewarding the Unlikely: Lifting GRPO Beyond Distribution Sharpening2025-06-03ProofNet++: A Neuro-Symbolic System for Formal Proof Verification with Self-Correction2025-05-30