TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/Decomposing the Enigma: Subgoal-based Demonstration Learni...

Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving

Xueliang Zhao, Wenda Li, Lingpeng Kong

2023-05-25Automated Theorem Proving
PaperPDFCode(official)

Abstract

Large language models~(LLMs) present an intriguing avenue of exploration in the domain of formal theorem proving. Nonetheless, the full utilization of these models, particularly in terms of demonstration formatting and organization, remains an underexplored area. In an endeavor to enhance the efficacy of LLMs, we introduce a subgoal-based demonstration learning framework, consisting of two primary elements: Firstly, drawing upon the insights of subgoal learning from the domains of reinforcement learning and robotics, we propose the construction of distinct subgoals for each demonstration example and refine these subgoals in accordance with the pertinent theories of subgoal learning. Secondly, we build upon recent advances in diffusion models to predict the optimal organization, simultaneously addressing two intricate issues that persist within the domain of demonstration organization: subset selection and order determination. Through the integration of subgoal-based learning methodologies, we have successfully increased the prevailing proof accuracy from 38.9\% to 44.3\% on the miniF2F benchmark. Furthermore, the adoption of diffusion models for demonstration organization can lead to an additional enhancement in accuracy to 45.5\%, or a $5\times$ improvement in sampling efficiency compared with the long-standing state-of-the-art method. Our code is available at \url{https://github.com/HKUNLP/subgoal-theorem-prover}.

Results

TaskDatasetMetricValueModel
Automated Theorem ProvingminiF2F-testPass@10045.5Decomposing the Enigma
Automated Theorem ProvingminiF2F-testcumulative45.5Decomposing the Enigma
Mathematical ProofsminiF2F-testPass@10045.5Decomposing the Enigma
Mathematical ProofsminiF2F-testcumulative45.5Decomposing the Enigma

Related Papers

CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization2025-07-08Prover Agent: An Agent-based Framework for Formal Mathematical Proofs2025-06-24Towards Advanced Mathematical Reasoning for LLMs via First-Order Logic Theorem Proving2025-06-20MATP-BENCH: Can MLLM Be a Good Automated Theorem Prover for Multimodal Problems?2025-06-06Safe: Enhancing Mathematical Reasoning in Large Language Models via Retrospective Step-aware Formal Verification2025-06-05LeanExplore: A search engine for Lean 4 declarations2025-06-04Rewarding the Unlikely: Lifting GRPO Beyond Distribution Sharpening2025-06-03ProofNet++: A Neuro-Symbolic System for Formal Proof Verification with Self-Correction2025-05-30