HOList: An Environment for Machine Learning of Higher-Order Theorem Proving

Kshitij Bansal, Sarah M. Loos, Markus N. Rabe, Christian Szegedy, Stewart Wilcox

2019-04-05Reinforcement Learning Automated Theorem Proving Deep Learning BIG-bench Machine Learning reinforcement-learning

Paper PDF Code Code Code

Abstract

We present an environment, benchmark, and deep learning driven automated theorem prover for higher-order logic. Higher-order interactive theorem provers enable the formalization of arbitrary mathematical theories and thereby present an interesting, open-ended challenge for deep learning. We provide an open-source framework based on the HOL Light theorem prover that can be used as a reinforcement learning environment. HOL Light comes with a broad coverage of basic mathematical theorems on calculus and the formal proof of the Kepler conjecture, from which we derive a challenging benchmark for automated reasoning. We also present a deep reinforcement learning driven automated theorem prover, DeepHOL, with strong initial results on this benchmark.

Results

Task	Dataset	Metric	Value	Model
Automated Theorem Proving	HOList benchmark	Percentage correct	38.88	Tactic Dependent Loop
Automated Theorem Proving	HOList benchmark	Percentage correct	32.65	Deeper Wider WaveNet
Mathematical Proofs	HOList benchmark	Percentage correct	38.88	Tactic Dependent Loop
Mathematical Proofs	HOList benchmark	Percentage correct	32.65	Deeper Wider WaveNet

Related Papers

CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning2025-07-18 Automatic Classification and Segmentation of Tunnel Cracks Based on Deep Learning and Visual Explanations2025-07-18 VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning2025-07-17 Spectral Bellman Method: Unifying Representation and Exploration in RL2025-07-17 Aligning Humans and Robots via Reinforcement Learning from Implicit Human Feedback2025-07-17 VAR-MATH: Probing True Mathematical Reasoning in Large Language Models via Symbolic Multi-Instance Benchmarks2025-07-17 QuestA: Expanding Reasoning Capacity in LLMs via Question Augmentation2025-07-17 Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities2025-07-17