Key-Point-Driven Data Synthesis with its Enhancement on Mathematical Reasoning

Yiming Huang, Xiao Liu, Yeyun Gong, Zhibin Gou, Yelong Shen, Nan Duan, Weizhu Chen

2024-03-04

Mathematical Reasoning, Math, Math Word Problem Solving, GSM8K

Abstract

Large language models (LLMs) have shown great potential in complex reasoning tasks, yet their performance is often hampered by the scarcity of high-quality, reasoning-focused training datasets. To address this challenge, we propose Key-Point-Driven Data Synthesis (KPDDS), a novel data synthesis framework that synthesizes question-answer pairs by leveraging key points and exemplar practices from authentic data sources. KPDDS ensures the generation of novel questions with rigorous quality control and substantial scalability. As a result, we present KPMath, an extensive synthetic dataset tailored for mathematical reasoning, comprising over 800K question-answer pairs. Utilizing KPMath and augmenting it with additional reasoning-intensive corpora, we create the comprehensive KPMath-Plus dataset. The Qwen1.5-72B model, fine-tuned on KPMath-Plus, achieves 87.0% pass@1 accuracy on GSM8K and 58.3% on MATH, surpassing competitors in the 7B to 70B range and the best commercial models such as GPT-4 across multiple math reasoning datasets.
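The pass@1 figures quoted above count a problem as solved only if the model's single sampled answer matches the reference. The page does not show the authors' evaluation code, so the following is only a minimal sketch under stated assumptions: answers are compared by exact string match after a crude normalization, and the names `normalize` and `pass_at_1` are hypothetical, not taken from the paper.

```python
from typing import Sequence


def normalize(answer: str) -> str:
    """Crude normalization (an assumption, not the paper's rule):
    trim whitespace, drop thousands separators and a trailing period."""
    return answer.strip().replace(",", "").rstrip(".")


def pass_at_1(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Percentage of problems whose single prediction matches the reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("need at least one problem to score")
    correct = sum(
        normalize(p) == normalize(r) for p, r in zip(predictions, references)
    )
    return 100.0 * correct / len(references)


# Toy data for illustration only; these are not results from the paper.
preds = ["18", "3", "70,000", "5"]
refs = ["18", "3", "70000", "8"]
score = pass_at_1(preds, refs)
print(f"{score:.1f}")  # 3 of 4 answers match, so this prints 75.0
```

Real math benchmarks usually need a more careful answer comparison (for example, treating equivalent fractions or LaTeX expressions as equal), so exact string matching here should be read as a simplification.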

Results

Dataset	Model	Accuracy (%)	Parameters (Billions)
MATH	DeepSeekMath-7B-KPMath-Plus	48.8	7
MATH	Llemma-34B-KPMath-Plus	48.6	34
MATH	Mistral-7B-KPMath-Plus	46.8	7
MATH	Llama2-13B-KPMath-Plus	41	13

The same results are listed under four tasks: Question Answering, Math Word Problem Solving, Mathematical Question Answering, and Mathematical Reasoning.
