TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/Efficient Neural Theorem Proving via Fine-grained Proof St...

Efficient Neural Theorem Proving via Fine-grained Proof Structure Analysis

Haoxiong Liu, Jiacheng Sun, Zhenguo Li, Andrew C Yao

2025-01-30MathAutomated Theorem Proving
PaperPDFCode(official)

Abstract

The synergy between deep learning models and traditional automation tools plays a pivotal role in developing robust neural theorem provers (NTPs). However, for proof synthesis with LLMs, previous work applies automation tools either only when the model explicitly calls the method, or only at a single granularity level, failing to fully exploit the power of built-in tactics and off-the-shelf automated theorem provers. In this work, we propose ProofAug, a novel theorem proving method that enjoys superior sample efficiency through equipping proof-generation LLMs with automation methods in different granularities via fine-grained structure analysis of model-generated proof proposals. Furthermore, ProofAug serves as a versatile plug-and-play module that seamlessly integrates with any tree-search algorithm, enabling our construction of an efficient recursive proving (ERP) module to further enhance performance. The superiority of our method is validated on the miniF2F-test benchmark using the open-source deepseek-math-7b-base model and the Isabelle proof assistant. Notably, by additionally employing a mixed prompting strategy, we achieve a cumulative pass rate of 66.0% after curation of the dataset (61.9% for the original version), setting a new SOTA across all proof languages with a total sample budget of only 2100. Our code is available at https://github.com/haoxiongliu/ProofAug.

Results

TaskDatasetMetricValueModel
Automated Theorem ProvingminiF2F-testPass@136.5ProofAug
Automated Theorem ProvingminiF2F-testPass@10052.5ProofAug
Automated Theorem ProvingminiF2F-testcumulative66ProofAug
Mathematical ProofsminiF2F-testPass@136.5ProofAug
Mathematical ProofsminiF2F-testPass@10052.5ProofAug
Mathematical ProofsminiF2F-testcumulative66ProofAug

Related Papers

VAR-MATH: Probing True Mathematical Reasoning in Large Language Models via Symbolic Multi-Instance Benchmarks2025-07-17QuestA: Expanding Reasoning Capacity in LLMs via Question Augmentation2025-07-17Scaling Up RL: Unlocking Diverse Reasoning in LLMs via Prolonged Training2025-07-16Temperature and Persona Shape LLM Agent Consensus With Minimal Accuracy Gains in Qualitative Coding2025-07-15Personalized Exercise Recommendation with Semantically-Grounded Knowledge Tracing2025-07-15Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination2025-07-14A Practical Two-Stage Recipe for Mathematical LLMs: Maximizing Accuracy with SFT and Efficiency with Reinforcement Learning2025-07-11Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs2025-07-10