Proof Artifact Co-training for Theorem Proving with Language Models

Jesse Michael Han, Jason Rute, Yuhuai Wu, Edward W. Ayers, Stanislas Polu

2021-02-11ICLR 2022 4Automated Theorem Proving Imitation Learning Language Modelling

Paper PDF Code(official)Code(official)Code(official)Code

Abstract

Labeled data for imitation learning of theorem proving in large libraries of formalized mathematics is scarce as such libraries require years of concentrated effort by human specialists to be built. This is particularly challenging when applying large Transformer language models to tactic prediction, because the scaling of performance with respect to model size is quickly disrupted in the data-scarce, easily-overfitted regime. We propose PACT ({\bf P}roof {\bf A}rtifact {\bf C}o-{\bf T}raining), a general methodology for extracting abundant self-supervised data from kernel-level proof terms for co-training alongside the usual tactic prediction objective. We apply this methodology to Lean, an interactive proof assistant which hosts some of the most sophisticated formalized mathematics to date. We instrument Lean with a neural theorem prover driven by a Transformer language model and show that PACT improves theorem proving success rate on a held-out suite of test theorems from 32\% to 48\%.

Results

Task	Dataset	Metric	Value	Model
Automated Theorem Proving	miniF2F-test	Pass@1	24.6	PACT (reproduced by Thor)
Automated Theorem Proving	miniF2F-test	cumulative	24.6	PACT (reproduced by Thor)
Mathematical Proofs	miniF2F-test	Pass@1	24.6	PACT (reproduced by Thor)
Mathematical Proofs	miniF2F-test	cumulative	24.6	PACT (reproduced by Thor)

Related Papers

Visual-Language Model Knowledge Distillation Method for Image Quality Assessment2025-07-21 The Imitation Game: Turing Machine Imitator is Length Generalizable Reasoner2025-07-17 Supervised Fine Tuning on Curated Data is Reinforcement Learning (and can be improved)2025-07-17 Making Language Model a Hierarchical Classifier and Generator2025-07-17 VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning2025-07-17 The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations2025-07-17 Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities2025-07-17 Assay2Mol: large language model-based drug design using BioAssay context2025-07-16