Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs

Albert Q. Jiang, Sean Welleck, Jin Peng Zhou, Wenda Li, Jiacheng Liu, Mateja Jamnik, Timothée Lacroix, Yuhuai Wu, Guillaume Lample

2022-10-21 | Automated Theorem Proving | Mathematical Proofs | Language Modelling

Abstract

The formalization of existing mathematical proofs is a notoriously difficult process. Despite decades of research on automation and proof assistants, writing formal proofs remains arduous and only accessible to a few experts. While previous studies to automate formalization focused on powerful search algorithms, no attempts were made to take advantage of available informal proofs. In this work, we introduce Draft, Sketch, and Prove (DSP), a method that maps informal proofs to formal proof sketches, and uses the sketches to guide an automated prover by directing its search to easier sub-problems. We investigate two relevant setups where informal proofs are either written by humans or generated by a language model. Our experiments and ablation studies show that large language models are able to produce well-structured formal sketches that follow the same reasoning steps as the informal proofs. Guiding an automated prover with these sketches enhances its performance from 20.9% to 39.3% on a collection of mathematical competition problems.
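The three stages named in the abstract can be pictured as a small control loop. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the function names (`draft_sketch_prove`, `draft`, `sketch`, `prove_gap`), the representation of a sketch as a list of sub-goal strings, and the retry loop are all hypothetical choices made for illustration.

```python
from typing import Callable, List, Optional

# All names here are hypothetical placeholders, not the paper's API.
Drafter = Callable[[str], str]              # statement -> informal proof
Sketcher = Callable[[str, str], List[str]]  # (statement, informal proof) -> sub-goals
GapProver = Callable[[str], Optional[str]]  # sub-goal -> formal proof, or None on failure


def draft_sketch_prove(statement: str, draft: Drafter, sketch: Sketcher,
                       prove_gap: GapProver, attempts: int = 1) -> Optional[List[str]]:
    """Return one formal proof per sub-goal, or None if every attempt fails."""
    for _ in range(attempts):
        informal = draft(statement)             # Draft: human-written or model-generated
        subgoals = sketch(statement, informal)  # Sketch: formal outline with open gaps
        proofs = [prove_gap(goal) for goal in subgoals]  # Prove: close each gap separately
        # An empty sketch is treated as a failed attempt, not a vacuous success.
        if subgoals and all(p is not None for p in proofs):
            return proofs
    return None


# Toy stand-ins so the sketch runs end to end (assumptions, not real components).
def toy_draft(statement: str) -> str:
    return "Split the claim into two smaller facts."


def toy_sketch(statement: str, informal: str) -> List[str]:
    return [statement + " (part 1)", statement + " (part 2)"]


def toy_prover(goal: str) -> Optional[str]:
    return "by auto" if "part" in goal else None
```

In this toy setup the prover only ever sees the smaller sub-goals produced by the sketch, which mirrors the abstract's point that the sketch directs search toward easier sub-problems.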

Results

Task	Dataset	Metric	Value	Model
Automated Theorem Proving	miniF2F-valid	Pass@100	43.9	DSP (62B Minerva informal)
Automated Theorem Proving	miniF2F-test	Pass@100	38.9	DSP (540B Minerva informal)
Automated Theorem Proving	miniF2F-test	cumulative	38.9	DSP (540B Minerva informal)
Automated Theorem Proving	miniF2F-test	Pass@1	20.9	Sledgehammer + heuristics
Automated Theorem Proving	miniF2F-test	cumulative	20.9	Sledgehammer + heuristics
Mathematical Proofs	miniF2F-valid	Pass@100	43.9	DSP (62B Minerva informal)
Mathematical Proofs	miniF2F-test	Pass@100	38.9	DSP (540B Minerva informal)
Mathematical Proofs	miniF2F-test	cumulative	38.9	DSP (540B Minerva informal)
Mathematical Proofs	miniF2F-test	Pass@1	20.9	Sledgehammer + heuristics
Mathematical Proofs	miniF2F-test	cumulative	20.9	Sledgehammer + heuristics
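For readers unfamiliar with the Pass@k columns, the snippet below sketches one common reading of the metric: a problem counts as solved if at least one of its first k attempts produces a verified proof. That reading is an assumption here, and the function name `pass_at_k` and the toy outcome data are hypothetical; they are not taken from the paper or its evaluation code.

```python
from typing import Dict, List


def pass_at_k(outcomes: Dict[str, List[bool]], k: int) -> float:
    """Percentage of problems with at least one success among the first k attempts."""
    if not outcomes:
        raise ValueError("no problems to score")
    solved = sum(any(attempts[:k]) for attempts in outcomes.values())
    return 100.0 * solved / len(outcomes)


# Toy data (made up for illustration): per-problem success flags, one per attempt.
toy_outcomes = {
    "p1": [False, True, False],
    "p2": [False, False, False],
    "p3": [True, False, False],
    "p4": [False, False, True],
}
```

With this toy data, `pass_at_k(toy_outcomes, 1)` counts only `p3`, while `pass_at_k(toy_outcomes, 3)` also counts `p1` and `p4`, which illustrates why a larger attempt budget yields a higher score.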
