TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/HolStep: A Machine Learning Dataset for Higher-order Logic...

HolStep: A Machine Learning Dataset for Higher-order Logic Theorem Proving

Cezary Kaliszyk, François Chollet, Christian Szegedy

2017-03-01Automated Theorem ProvingBIG-bench Machine Learning
PaperPDFCode

Abstract

Large computer-understandable proofs consist of millions of intermediate logical steps. The vast majority of such steps originate from manually selected and manually guided heuristics applied to intermediate goals. So far, machine learning has generally not been used to filter or generate these steps. In this paper, we introduce a new dataset based on Higher-Order Logic (HOL) proofs, for the purpose of developing new machine learning-based theorem-proving strategies. We make this dataset publicly available under the BSD license. We propose various machine learning tasks that can be performed on this dataset, and discuss their significance for theorem proving. We also benchmark a set of simple baseline machine learning models suited for the tasks (including logistic regression, convolutional neural networks and recurrent neural networks). The results of our baseline models show the promise of applying machine learning to HOL theorem proving.

Results

TaskDatasetMetricValueModel
Automated Theorem ProvingHolStep (Unconditional)Classification Accuracy0.831D CNN
Automated Theorem ProvingHolStep (Unconditional)Classification Accuracy0.831D CNN-LSTM
Automated Theorem ProvingHolStep (Conditional)Classification Accuracy0.83Siamese 1D CNN-LSTM
Automated Theorem ProvingHolStep (Conditional)Classification Accuracy0.82Siamese 1D CNN
Mathematical ProofsHolStep (Unconditional)Classification Accuracy0.831D CNN
Mathematical ProofsHolStep (Unconditional)Classification Accuracy0.831D CNN-LSTM
Mathematical ProofsHolStep (Conditional)Classification Accuracy0.83Siamese 1D CNN-LSTM
Mathematical ProofsHolStep (Conditional)Classification Accuracy0.82Siamese 1D CNN

Related Papers

CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization2025-07-08Prover Agent: An Agent-based Framework for Formal Mathematical Proofs2025-06-24Towards Advanced Mathematical Reasoning for LLMs via First-Order Logic Theorem Proving2025-06-20MATP-BENCH: Can MLLM Be a Good Automated Theorem Prover for Multimodal Problems?2025-06-06Safe: Enhancing Mathematical Reasoning in Large Language Models via Retrospective Step-aware Formal Verification2025-06-05LeanExplore: A search engine for Lean 4 declarations2025-06-04Rewarding the Unlikely: Lifting GRPO Beyond Distribution Sharpening2025-06-03ProofNet++: A Neuro-Symbolic System for Formal Proof Verification with Self-Correction2025-05-30