Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Two is Better than Many? Binary Classification as an Effective Approach to Multi-Choice Question Answering

Deepanway Ghosal, Navonil Majumder, Rada Mihalcea, Soujanya Poria

2022-10-29 · Question Answering · Sentence Completion · Binary Classification · Science Question Answering

Paper · PDF · Code (official)

Abstract

We propose a simple refactoring of multi-choice question answering (MCQA) tasks as a series of binary classifications. The MCQA task is generally performed by scoring each (question, answer) pair, normalizing over all the pairs, and then selecting the answer from the pair that yields the highest score. For n answer choices, this is equivalent to an n-class classification setup where only one class (the true answer) is correct. We instead show that classifying (question, true answer) as positive instances and (question, false answer) as negative instances is significantly more effective across various models and datasets. We show the efficacy of our proposed approach in different tasks -- abductive reasoning, commonsense question answering, science question answering, and sentence completion. Our DeBERTa binary classification model reaches the top or close to the top performance on public leaderboards for these tasks. The source code of the proposed approach is available at https://github.com/declare-lab/TEAM.
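The refactoring described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `to_binary_instances` and `predict` are hypothetical helper names, and the scoring function is a stand-in for the paper's DeBERTa binary classifier, which would assign each (question, answer) pair an independent probability of being a true pair.

```python
# Sketch: recast one n-way MCQA example as n independent binary
# classification instances, then recover the multi-choice prediction
# by taking the argmax over per-pair scores at inference time.
from typing import Callable, List, Tuple

def to_binary_instances(
    question: str, choices: List[str], answer_idx: int
) -> List[Tuple[Tuple[str, str], int]]:
    """One n-way MCQA example -> n ((question, choice), label) pairs.

    The true answer becomes a positive instance (label 1); every
    other choice becomes a negative instance (label 0).
    """
    return [((question, c), int(i == answer_idx)) for i, c in enumerate(choices)]

def predict(
    question: str, choices: List[str], score_fn: Callable[[str, str], float]
) -> int:
    """Score each (question, choice) pair independently and pick the argmax.

    score_fn stands in for a trained binary classifier's positive-class
    score; no normalization across choices is needed.
    """
    scores = [score_fn(question, c) for c in choices]
    return max(range(len(choices)), key=lambda i: scores[i])
```

Note that, unlike the standard n-class setup, training labels here are assigned per pair, so each pair is scored in isolation and the softmax over all choices disappears.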

Results

Task                 | Dataset   | Metric   | Value | Model
---------------------|-----------|----------|-------|------------------------------------------
Question Answering   | SIQA      | Accuracy | 80.2  | DeBERTa-Large 304M
Question Answering   | SIQA      | Accuracy | 79.9  | DeBERTa-Large 304M (classification-based)
Question Answering   | PIQA      | Accuracy | 87.4  | DeBERTa-Large 304M
Question Answering   | PIQA      | Accuracy | 85.9  | DeBERTa-Large 304M (classification-based)
Sentence Completion  | HellaSwag | Accuracy | 95.6  | DeBERTa-Large 304M (classification-based)
Sentence Completion  | HellaSwag | Accuracy | 94.7  | DeBERTa-Large 304M

Related Papers

From Roots to Rewards: Dynamic Tree Reasoning with RL (2025-07-17)
Enter the Mind Palace: Reasoning and Planning for Long-term Active Embodied Question Answering (2025-07-17)
Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It (2025-07-17)
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
Describe Anything Model for Visual Question Answering on Text-rich Images (2025-07-16)
Is This Just Fantasy? Language Model Representations Reflect Human Judgments of Event Plausibility (2025-07-16)
Warehouse Spatial Question Answering with LLM Agent (2025-07-14)
An Automated Classifier of Harmful Brain Activities for Clinical Usage Based on a Vision-Inspired Pre-trained Framework (2025-07-10)