Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Two is Better than Many? Binary Classification as an Effective Approach to Multi-Choice Question Answering

Deepanway Ghosal, Navonil Majumder, Rada Mihalcea, Soujanya Poria

2022-10-29 · Question Answering · Sentence Completion · Binary Classification · Science Question Answering

Paper · PDF · Code (official)

Abstract

We propose a simple refactoring of multi-choice question answering (MCQA) tasks as a series of binary classifications. The MCQA task is generally performed by scoring each (question, answer) pair, normalizing over all the pairs, and then selecting the answer from the pair that yields the highest score. For n answer choices, this is equivalent to an n-class classification setup where only one class (the true answer) is correct. We instead show that classifying (question, true answer) as positive instances and (question, false answer) as negative instances is significantly more effective across various models and datasets. We show the efficacy of our proposed approach in different tasks -- abductive reasoning, commonsense question answering, science question answering, and sentence completion. Our DeBERTa binary classification model reaches the top or close to the top performance on public leaderboards for these tasks. The source code of the proposed approach is available at https://github.com/declare-lab/TEAM.
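The refactoring described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `to_binary_instances` and `predict` are hypothetical helper names, and the scoring function is a stand-in for the paper's DeBERTa binary classifier, which would assign each (question, answer) pair an independent probability of being a true pair.

```python
# Sketch: recast one n-way MCQA example as n independent binary
# classification instances, then recover the multi-choice prediction
# by taking the argmax over per-pair scores at inference time.
from typing import Callable, List, Tuple

def to_binary_instances(
    question: str, choices: List[str], answer_idx: int
) -> List[Tuple[Tuple[str, str], int]]:
    """One n-way MCQA example -> n ((question, choice), label) pairs.

    The true answer becomes a positive instance (label 1); every
    other choice becomes a negative instance (label 0).
    """
    return [((question, c), int(i == answer_idx)) for i, c in enumerate(choices)]

def predict(
    question: str, choices: List[str], score_fn: Callable[[str, str], float]
) -> int:
    """Score each (question, choice) pair independently and pick the argmax.

    score_fn stands in for a trained binary classifier's positive-class
    score; no normalization across choices is needed.
    """
    scores = [score_fn(question, c) for c in choices]
    return max(range(len(choices)), key=lambda i: scores[i])
```

Note that, unlike the standard n-class setup, training labels here are assigned per pair, so each pair is scored in isolation and the softmax over all choices disappears.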

Results

Task                 | Dataset   | Metric   | Value | Model
---------------------|-----------|----------|-------|------------------------------------------
Question Answering   | SIQA      | Accuracy | 80.2  | DeBERTa-Large 304M
Question Answering   | SIQA      | Accuracy | 79.9  | DeBERTa-Large 304M (classification-based)
Question Answering   | PIQA      | Accuracy | 87.4  | DeBERTa-Large 304M
Question Answering   | PIQA      | Accuracy | 85.9  | DeBERTa-Large 304M (classification-based)
Sentence Completion  | HellaSwag | Accuracy | 95.6  | DeBERTa-Large 304M (classification-based)
Sentence Completion  | HellaSwag | Accuracy | 94.7  | DeBERTa-Large 304M

Related Papers

From Roots to Rewards: Dynamic Tree Reasoning with RL (2025-07-17)
Enter the Mind Palace: Reasoning and Planning for Long-term Active Embodied Question Answering (2025-07-17)
Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It (2025-07-17)
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
Describe Anything Model for Visual Question Answering on Text-rich Images (2025-07-16)
Is This Just Fantasy? Language Model Representations Reflect Human Judgments of Event Plausibility (2025-07-16)
Warehouse Spatial Question Answering with LLM Agent (2025-07-14)
An Automated Classifier of Harmful Brain Activities for Clinical Usage Based on a Vision-Inspired Pre-trained Framework (2025-07-10)