Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers

Albert Q. Jiang, Wenda Li, Szymon Tworkowski, Konrad Czechowski, Tomasz Odrzygóźdź, Piotr Miłoś, Yuhuai Wu, Mateja Jamnik

2022-05-22 | Automated Theorem Proving

Abstract

In theorem proving, the task of selecting useful premises from a large library to unlock the proof of a given conjecture is crucially important. This presents a challenge for all theorem provers, especially the ones based on language models, due to their relative inability to reason over huge volumes of premises in text form. This paper introduces Thor, a framework integrating language models and automated theorem provers to overcome this difficulty. In Thor, a class of methods called hammers that leverage the power of automated theorem provers are used for premise selection, while all other tasks are designated to language models. Thor increases a language model's success rate on the PISA dataset from $39\%$ to $57\%$, while solving $8.2\%$ of problems neither language models nor automated theorem provers are able to solve on their own. Furthermore, with a significantly smaller computational budget, Thor can achieve a success rate on the MiniF2F dataset that is on par with the best existing methods. Thor can be instantiated for the majority of popular interactive theorem provers via a straightforward protocol we provide.
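The division of labour described in the abstract can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the authors' implementation: the `<hammer>` marker, the function names `prove`, `propose_step`, `run_hammer` and `apply_step`, and the string-based proof states are all hypothetical and chosen only for illustration. The idea shown is that the language model proposes every proof step, and whenever a step would require premise selection it hands that step to the hammer instead.

```python
from typing import Callable, List, Optional

# Hypothetical marker a model could emit to mean "delegate this step to the hammer".
HAMMER = "<hammer>"


def prove(goal: str,
          propose_step: Callable[[str], str],
          run_hammer: Callable[[str], Optional[str]],
          apply_step: Callable[[str, str], Optional[str]],
          max_steps: int = 10) -> Optional[List[str]]:
    """Return the proof steps used if the goal is closed, else None.

    propose_step: stand-in for the language model (proof state -> next step).
    run_hammer:   stand-in for the hammer (proof state -> concrete step, or None).
    apply_step:   stand-in for the proof assistant; returns the next proof
                  state, or None once no goals remain.
    """
    state: Optional[str] = goal
    proof: List[str] = []
    for _ in range(max_steps):
        step = propose_step(state)
        if step == HAMMER:
            # Premise selection is delegated to the hammer.
            found = run_hammer(state)
            if found is None:
                return None  # the hammer could not close this step
            step = found
        state = apply_step(state, step)
        proof.append(step)
        if state is None:
            return proof  # no goals remain
    return None  # step budget exhausted


# Toy stand-ins, purely illustrative.
def toy_model(state: str) -> str:
    return "intro" if state == "A --> A" else HAMMER


def toy_hammer(state: str) -> Optional[str]:
    return "by (simp add: lemma_1)" if state == "A ==> A" else None


def toy_apply(state: str, step: str) -> Optional[str]:
    return "A ==> A" if step == "intro" else None
```

With these stand-ins, `prove("A --> A", toy_model, toy_hammer, toy_apply)` takes one model-proposed step and one hammer-supplied step. A real system would replace the three callables with a trained model, a hammer tool, and a proof assistant interface.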

Results

Task	Dataset	Metric	Value	Model
Automated Theorem Proving	miniF2F-test	Pass@1	29.9	Thor
Automated Theorem Proving	miniF2F-test	cumulative	29.9	Thor
Automated Theorem Proving	miniF2F-test	Pass@1	10.4	Sledgehammer
Automated Theorem Proving	miniF2F-test	cumulative	10.4	Sledgehammer
Mathematical Proofs	miniF2F-test	Pass@1	29.9	Thor
Mathematical Proofs	miniF2F-test	cumulative	29.9	Thor
Mathematical Proofs	miniF2F-test	Pass@1	10.4	Sledgehammer
Mathematical Proofs	miniF2F-test	cumulative	10.4	Sledgehammer
