Solving Math Word Problems via Cooperative Reasoning induced Language Models

Xinyu Zhu, Junjie Wang, Lin Zhang, Yuxiang Zhang, Ruyi Gan, Jiaxing Zhang, Yujiu Yang

2022-10-28Math Arithmetic Reasoning

Abstract

Large-scale pre-trained language models (PLMs) bring new opportunities to challenging problems, especially those that need high-level intelligence, such as the math word problem (MWPs). However, directly applying existing PLMs to MWPs can fail as the generation process lacks sufficient supervision and thus lacks fast adaptivity as humans. We notice that human reasoning has a dual reasoning framework that consists of an immediate reaction system (system 1) and a delicate reasoning system (system 2), where the entire reasoning is determined by their interaction. This inspires us to develop a cooperative reasoning-induced PLM for solving MWPs, called Cooperative Reasoning (CoRe), resulting in a human-like reasoning architecture with system 1 as the generator and system 2 as the verifier. In our approach, the generator is responsible for generating reasoning paths, and the verifiers are used to supervise the evaluation in order to obtain reliable feedback for the generator. We evaluate our CoRe framework on several mathematical reasoning datasets and achieve decent improvement over state-of-the-art methods, up to 9.6% increase over best baselines. Our codes are available at https://github.com/TianHongZXY/CoRe

Results

Task	Dataset	Metric	Value	Model
Arithmetic Reasoning	GSM8K	Accuracy	63.2	GPT-J (CoRe)
Arithmetic Reasoning	GSM8K	Parameters (Billion)	12	GPT-J (CoRe)

Related Papers

VAR-MATH: Probing True Mathematical Reasoning in Large Language Models via Symbolic Multi-Instance Benchmarks2025-07-17 QuestA: Expanding Reasoning Capacity in LLMs via Question Augmentation2025-07-17 Scaling Up RL: Unlocking Diverse Reasoning in LLMs via Prolonged Training2025-07-16 Temperature and Persona Shape LLM Agent Consensus With Minimal Accuracy Gains in Qualitative Coding2025-07-15 Personalized Exercise Recommendation with Semantically-Grounded Knowledge Tracing2025-07-15 DCR: Quantifying Data Contamination in LLMs Evaluation2025-07-15 Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination2025-07-14 A Practical Two-Stage Recipe for Mathematical LLMs: Maximizing Accuracy with SFT and Efficiency with Reinforcement Learning2025-07-11