Learning to Prove Theorems via Interacting with Proof Assistants

Kaiyu Yang, Jia Deng

2019-05-21 | Mathematical Reasoning, Automated Theorem Proving, Mathematical Proofs

Abstract

Humans prove theorems by relying on substantial high-level reasoning and problem-specific insights. Proof assistants offer a formalism that resembles human mathematical reasoning, representing theorems in higher-order logic and proofs as high-level tactics. However, human experts have to construct proofs manually by entering tactics into the proof assistant. In this paper, we study the problem of using machine learning to automate the interaction with proof assistants. We construct CoqGym, a large-scale dataset and learning environment containing 71K human-written proofs from 123 projects developed with the Coq proof assistant. We develop ASTactic, a deep learning-based model that generates tactics as programs in the form of abstract syntax trees (ASTs). Experiments show that ASTactic trained on CoqGym can generate effective tactics and can be used to prove new theorems not previously provable by automated methods. Code is available at https://github.com/princeton-vl/CoqGym.
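To make "tactics as programs in the form of abstract syntax trees" concrete, here is a minimal sketch of a tactic stored as a tree and flattened back into tactic text. It is an illustration only, not the ASTactic implementation: the `Node` class, the `render` function, and the symbol names `tactic`, `term`, and `in_clause` are hypothetical and are not taken from the CoqGym code or from ASTactic's actual tactic grammar.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One AST node: a grammar symbol plus its ordered children (hypothetical)."""
    symbol: str
    children: List["Node"] = field(default_factory=list)


def render(node: Node) -> str:
    """Flatten an AST into tactic text by joining its leaf symbols left to right."""
    if not node.children:
        return node.symbol
    return " ".join(render(child) for child in node.children)


# Hypothetical tree for the Coq tactic `induction n`.
simple = Node("tactic", [Node("induction"), Node("n")])

# Hypothetical tree for `apply H in H0`, with nested non-terminals.
nested = Node("tactic", [
    Node("apply"),
    Node("term", [Node("H")]),
    Node("in_clause", [Node("in"), Node("H0")]),
])

simple_text = render(simple)
nested_text = render(nested)
```

A model that emits such a tree node by node can only produce output that follows the grammar it expands, which is one plausible motivation for generating trees rather than raw text.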

Results

Task	Dataset	Metric	Value	Model
Automated Theorem Proving	CoqGym	Percentage correct	12.2	ASTactic
Mathematical Proofs	CoqGym	Percentage correct	12.2	ASTactic

Related Papers

VAR-MATH: Probing True Mathematical Reasoning in Large Language Models via Symbolic Multi-Instance Benchmarks (2025-07-17)
A Survey of Deep Learning for Geometry Problem Solving (2025-07-16)
KisMATH: Do LLMs Have Knowledge of Implicit Structures in Mathematical Reasoning? (2025-07-15)
Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination (2025-07-14)
A Practical Two-Stage Recipe for Mathematical LLMs: Maximizing Accuracy with SFT and Efficiency with Reinforcement Learning (2025-07-11)
Integrating External Tools with Large Language Models to Improve Accuracy (2025-07-09)
Skywork-R1V3 Technical Report (2025-07-08)
CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization (2025-07-08)