CICERO: A Dataset for Contextualized Commonsense Inference in Dialogues

Deepanway Ghosal, Siqi Shen, Navonil Majumder, Rada Mihalcea, Soujanya Poria

2022-03-25 | ACL 2022 | Answer Selection, Answer Generation

Abstract

This paper addresses the problem of dialogue reasoning with contextualized commonsense inference. We curate CICERO, a dataset of dyadic conversations with five types of utterance-level reasoning-based inferences: cause, subsequent event, prerequisite, motivation, and emotional reaction. The dataset contains 53,105 such inferences from 5,672 dialogues. We use this dataset to solve relevant generative and discriminative tasks: generation of cause and subsequent event; generation of prerequisite, motivation, and listener's emotional reaction; and selection of plausible alternatives. Our results ascertain the value of such dialogue-centric commonsense knowledge datasets. It is our hope that CICERO will open new research avenues into commonsense-based dialogue reasoning.

Results

Task	Dataset	Metric	Value	Model
Question Answering	CICERO	Exact Match	77.68	T5-large
Question Answering	CICERO	Exact Match	77.51	Unified QA
Question Answering	CICERO	ROUGE	0.298	T5-large pre-trained on GLUCOSE
Question Answering	CICERO	ROUGE	0.2946	T5-large
Question Answering	CICERO	ROUGE	0.2878	T5-large pre-trained on COMET
Question Answering	CICERO	ROUGE	0.2837	BART
Natural Language Inference	CICERO	ROUGE	0.298	T5-large pre-trained on GLUCOSE
Natural Language Inference	CICERO	ROUGE	0.2947	T5-large
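
As a reading aid for the Exact Match rows, here is a minimal sketch of how such a score is commonly computed for answer selection. It assumes each instance has a set of candidate indices marked correct, and that a prediction counts only when it matches that set exactly. The function name `exact_match` and the toy data are hypothetical and not taken from the CICERO release; the authors' evaluation script may differ.

```python
from typing import Iterable, Sequence


def exact_match(predicted: Sequence[Iterable[int]],
                gold: Sequence[Iterable[int]]) -> float:
    """Percentage of instances whose predicted answer set equals the gold set.

    Assumption: answers are given as indices into the candidate list, and
    order within an instance does not matter.
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    # An instance is a hit only if every selected index matches, with no extras.
    hits = sum(set(p) == set(g) for p, g in zip(predicted, gold))
    return 100.0 * hits / len(gold)


# Toy data (hypothetical): three instances, the second one is partly wrong.
predicted = [[0], [1, 3], [2]]
gold = [[0], [1, 2], [2]]
print(round(exact_match(predicted, gold), 2))  # 66.67
```

Under this definition a partially correct selection, such as the second instance above, earns no credit, which makes Exact Match a stricter measure than per-candidate accuracy.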
