Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Progressive-Hint Prompting Improves Reasoning in Large Language Models

Chuanyang Zheng, Zhengying Liu, Enze Xie, Zhenguo Li, Yu Li

2023-04-19 · Math · Math Word Problem Solving · GSM8K · Arithmetic Reasoning
Paper · PDF · Code (official)

Abstract

The performance of Large Language Models (LLMs) in reasoning tasks depends heavily on prompt design, with Chain-of-Thought (CoT) and self-consistency being critical methods that enhance this ability. However, these methods do not fully exploit the answers generated by the LLM to guide subsequent responses. This paper proposes a new prompting method, named Progressive-Hint Prompting (PHP), that enables automatic multiple interactions between users and LLMs by using previously generated answers as hints to progressively guide toward the correct answers. PHP is orthogonal to CoT and self-consistency, making it easy to combine with state-of-the-art techniques to further improve performance. We conducted extensive and comprehensive experiments on seven benchmarks. The results show that PHP significantly improves accuracy while remaining highly efficient. For instance, with text-davinci-003, we observed a 4.2% improvement on GSM8K with greedy decoding compared to Complex CoT, and a 46.17% reduction in sample paths with self-consistency. With GPT-4 and PHP, we achieve state-of-the-art performances on SVAMP (89.1% -> 91.9%), GSM8K (92% -> 95.5%), AQuA (76.4% -> 79.9%) and MATH (50.3% -> 53.9%).
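The abstract describes PHP as an automatic multi-round interaction: each round, the model's previously generated answers are appended to the prompt as hints, and the loop stops once the answer stabilizes. The following is a minimal sketch of that control flow, not the paper's implementation; `ask_llm` is a hypothetical stand-in for a real LLM call, simulated here with canned responses so the loop is runnable.

```python
# Minimal sketch of the Progressive-Hint Prompting (PHP) loop.
# `ask_llm` is a hypothetical placeholder for a real LLM API call;
# it is simulated so the control flow below is self-contained.

def ask_llm(question, hints):
    """Return a numeric answer, given the hints accumulated so far."""
    # Simulated model: answers 41 on the first try, 42 once hinted.
    return 41 if not hints else 42

def progressive_hint_prompting(question, max_rounds=10):
    hints = []
    prev = None
    for _ in range(max_rounds):
        # Append all previous answers to the prompt as progressive hints.
        hint_text = (
            f" (Hint: the answer is near {', '.join(map(str, hints))}.)"
            if hints else ""
        )
        answer = ask_llm(question + hint_text, hints)
        # PHP's stopping rule: terminate when the model repeats its
        # previous answer, i.e. the answer has stabilized.
        if answer == prev:
            return answer, hints
        hints.append(answer)
        prev = answer
    return prev, hints

answer, hints = progressive_hint_prompting("What is 6 * 7?")
print(answer, hints)  # prints: 42 [41, 42]
```

Because PHP only changes how prompts are constructed between rounds, it composes with CoT or self-consistency by swapping in a CoT prompt or a majority vote inside `ask_llm`.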

Results

Task                           | Dataset | Metric             | Value | Model
Question Answering             | MATH    | Accuracy           | 53.9  | PHP (GPT-4 model)
Question Answering             | SVAMP   | Execution Accuracy | 91.9  | GPT-4 (PHP)
Math Word Problem Solving      | MATH    | Accuracy           | 53.9  | PHP (GPT-4 model)
Math Word Problem Solving      | SVAMP   | Execution Accuracy | 91.9  | GPT-4 (PHP)
Mathematical Question Answering| MATH    | Accuracy           | 53.9  | PHP (GPT-4 model)
Mathematical Question Answering| SVAMP   | Execution Accuracy | 91.9  | GPT-4 (PHP)
Mathematical Reasoning         | MATH    | Accuracy           | 53.9  | PHP (GPT-4 model)
Mathematical Reasoning         | SVAMP   | Execution Accuracy | 91.9  | GPT-4 (PHP)

Related Papers

VAR-MATH: Probing True Mathematical Reasoning in Large Language Models via Symbolic Multi-Instance Benchmarks (2025-07-17)
QuestA: Expanding Reasoning Capacity in LLMs via Question Augmentation (2025-07-17)
GEMMAS: Graph-based Evaluation Metrics for Multi Agent Systems (2025-07-17)
Scaling Up RL: Unlocking Diverse Reasoning in LLMs via Prolonged Training (2025-07-16)
DAC: A Dynamic Attention-aware Approach for Task-Agnostic Prompt Compression (2025-07-16)
Temperature and Persona Shape LLM Agent Consensus With Minimal Accuracy Gains in Qualitative Coding (2025-07-15)
Personalized Exercise Recommendation with Semantically-Grounded Knowledge Tracing (2025-07-15)
KisMATH: Do LLMs Have Knowledge of Implicit Structures in Mathematical Reasoning? (2025-07-15)