Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

SocialIQA: Commonsense Reasoning about Social Interactions

Maarten Sap, Hannah Rashkin, Derek Chen, Ronan LeBras, Yejin Choi

2019-04-22 | Question Answering, Coreference Resolution, Common Sense Reasoning, Transfer Learning, Multiple-choice

Abstract

We introduce Social IQa, the first large-scale benchmark for commonsense reasoning about social situations. Social IQa contains 38,000 multiple-choice questions for probing emotional and social intelligence in a variety of everyday situations (e.g., Q: "Jordan wanted to tell Tracy a secret, so Jordan leaned towards Tracy. Why did Jordan do this?" A: "Make sure no one else could hear"). Through crowdsourcing, we collect commonsense questions along with correct and incorrect answers about social interactions, using a new framework that mitigates stylistic artifacts in incorrect answers by asking workers to provide the right answer to a different but related question. Empirical results show that our benchmark is challenging for existing question-answering models based on pretrained language models, compared to human performance (>20% gap). Notably, we further establish Social IQa as a resource for transfer learning of commonsense knowledge, achieving state-of-the-art performance on multiple commonsense reasoning tasks (Winograd Schemas, COPA).
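To make the multiple-choice format concrete, here is a minimal sketch of how one item and the accuracy metric might be represented. The class name, field names, and the two distractor strings are hypothetical placeholders and are not taken from the dataset. Only the context, question, and correct answer come from the abstract's example.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class MultipleChoiceItem:
    """One question with candidate answers (hypothetical schema)."""
    context: str
    question: str
    answers: List[str]
    label: int  # index of the correct answer in `answers`


def accuracy(items: Sequence[MultipleChoiceItem], predictions: Sequence[int]) -> float:
    """Percentage of items whose predicted answer index matches the label."""
    if not items or len(items) != len(predictions):
        raise ValueError("items and predictions must be non-empty and equal in length")
    correct = sum(1 for item, pred in zip(items, predictions) if pred == item.label)
    return 100.0 * correct / len(items)


example = MultipleChoiceItem(
    context="Jordan wanted to tell Tracy a secret, so Jordan leaned towards Tracy.",
    question="Why did Jordan do this?",
    answers=[
        "Make sure no one else could hear",  # correct answer from the abstract
        "<hypothetical distractor A>",       # placeholder, not from the dataset
        "<hypothetical distractor B>",       # placeholder, not from the dataset
    ],
    label=0,
)
```

A model's output for each item would be an index into `answers`, so `accuracy([example], [0])` scores a single correct prediction.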

Results

Task | Dataset | Metric | Value | Model
Question Answering | SIQA | Accuracy | 64.5 | BERT-large 340M (fine-tuned)
Question Answering | SIQA | Accuracy | 63.1 | BERT-base 110M (fine-tuned)
Question Answering | SIQA | Accuracy | 63 | GPT-1 117M (fine-tuned)
Question Answering | SIQA | Accuracy | 33.3 | Random chance baseline
Question Answering | COPA | Accuracy | 83.4 | BERT-SocialIQA 340M
Question Answering | COPA | Accuracy | 80.8 | BERT-large 340M
Coreference Resolution | Winograd Schema Challenge | Accuracy | 72.5 | BERT-SocialIQA 340M
Coreference Resolution | Winograd Schema Challenge | Accuracy | 67 | BERT-large 340M
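
The transfer-learning effect reported above can be read off as simple differences between table rows. The following sketch only redoes that arithmetic. The dictionary layout and function name are illustrative assumptions, and the numbers are copied from the results table.

```python
# Accuracy values copied from the results table (percent).
transfer_results = {
    "COPA": {"BERT-large": 80.8, "BERT-SocialIQA": 83.4},
    "Winograd Schema Challenge": {"BERT-large": 67.0, "BERT-SocialIQA": 72.5},
}


def transfer_gain(scores: dict) -> float:
    """Accuracy points gained by the Social IQa-trained model over plain BERT-large."""
    return round(scores["BERT-SocialIQA"] - scores["BERT-large"], 1)


gains = {task: transfer_gain(scores) for task, scores in transfer_results.items()}

# Margin of the best listed SIQA model over the random chance baseline.
siqa_margin_over_chance = round(64.5 - 33.3, 1)

for task, gain in gains.items():
    print(f"{task}: +{gain} accuracy points")
print(f"SIQA, BERT-large over random chance: +{siqa_margin_over_chance} accuracy points")
```

The 33.3 random chance baseline is consistent with three answer options per question, since 100 / 3 ≈ 33.3.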

Related Papers

RaMen: Multi-Strategy Multi-Modal Learning for Bundle Construction (2025-07-18)
From Roots to Rewards: Dynamic Tree Reasoning with RL (2025-07-17)
Enter the Mind Palace: Reasoning and Planning for Long-term Active Embodied Question Answering (2025-07-17)
Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It (2025-07-17)
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
Comparing Apples to Oranges: A Dataset & Analysis of LLM Humour Understanding from Traditional Puns to Topical Jokes (2025-07-17)
Disentangling coincident cell events using deep transfer learning and compressive sensing (2025-07-17)
The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations (2025-07-17)