TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/Proof Artifact Co-training for Theorem Proving with Langua...

Proof Artifact Co-training for Theorem Proving with Language Models

Jesse Michael Han, Jason Rute, Yuhuai Wu, Edward W. Ayers, Stanislas Polu

2021-02-11ICLR 2022 4Automated Theorem ProvingImitation LearningLanguage Modelling
PaperPDFCode(official)Code(official)Code(official)Code

Abstract

Labeled data for imitation learning of theorem proving in large libraries of formalized mathematics is scarce as such libraries require years of concentrated effort by human specialists to be built. This is particularly challenging when applying large Transformer language models to tactic prediction, because the scaling of performance with respect to model size is quickly disrupted in the data-scarce, easily-overfitted regime. We propose PACT ({\bf P}roof {\bf A}rtifact {\bf C}o-{\bf T}raining), a general methodology for extracting abundant self-supervised data from kernel-level proof terms for co-training alongside the usual tactic prediction objective. We apply this methodology to Lean, an interactive proof assistant which hosts some of the most sophisticated formalized mathematics to date. We instrument Lean with a neural theorem prover driven by a Transformer language model and show that PACT improves theorem proving success rate on a held-out suite of test theorems from 32\% to 48\%.

Results

TaskDatasetMetricValueModel
Automated Theorem ProvingminiF2F-testPass@124.6PACT (reproduced by Thor)
Automated Theorem ProvingminiF2F-testcumulative24.6PACT (reproduced by Thor)
Mathematical ProofsminiF2F-testPass@124.6PACT (reproduced by Thor)
Mathematical ProofsminiF2F-testcumulative24.6PACT (reproduced by Thor)

Related Papers

Visual-Language Model Knowledge Distillation Method for Image Quality Assessment2025-07-21The Imitation Game: Turing Machine Imitator is Length Generalizable Reasoner2025-07-17Supervised Fine Tuning on Curated Data is Reinforcement Learning (and can be improved)2025-07-17Making Language Model a Hierarchical Classifier and Generator2025-07-17VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning2025-07-17The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations2025-07-17Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities2025-07-17Assay2Mol: large language model-based drug design using BioAssay context2025-07-16