IDOL: Indicator-oriented Logic Pre-training for Logical Reasoning

Zihang Xu, Ziqing Yang, Yiming Cui, Shijin Wang

2023-06-27
Reading Comprehension, Logical Reasoning, Machine Reading Comprehension

Abstract

In the field of machine reading comprehension (MRC), existing systems have surpassed average human performance on many tasks such as SQuAD. However, there is still a long way to go when it comes to logical reasoning. Although some methods have been put forward for it, they are either overly complicated in design or rely too heavily on external structures. In this paper, we propose IDOL (InDicator-Oriented Logic Pre-training), an easy-to-understand but highly effective further pre-training task that logically strengthens pre-trained models with the help of 6 types of logical indicators and a logically rich dataset, LGP (LoGic Pre-training). IDOL achieves state-of-the-art performance on ReClor and LogiQA, the two most representative benchmarks in logical reasoning MRC, and is shown to generalize to different pre-trained models and to other types of MRC benchmarks such as RACE and SQuAD 2.0, while keeping competitive general language understanding ability as measured on GLUE tasks. In addition, at the beginning of the era of large language models, we compare against several of them, such as ChatGPT, and find that IDOL still shows an advantage.

Results

Task	Dataset	Metric	Value	Model
Reading Comprehension	ReClor	Test	80.6	Rational Reasoner / IDOL
