Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Why We Feel: Breaking Boundaries in Emotional Reasoning with Multimodal Large Language Models

Yuxiang Lin, Jingdong Sun, Zhi-Qi Cheng, Jue Wang, Haomin Liang, Zebang Cheng, Yifei Dong, Jun-Yan He, Xiaojiang Peng, Xian-Sheng Hua

2025-04-10 · Emotion Interpretation · Emotion Recognition
Paper · PDF · Code (official)

Abstract

Most existing emotion analysis emphasizes which emotion arises (e.g., happy, sad, angry) but neglects the deeper why. We propose Emotion Interpretation (EI), focusing on the causal factors, whether explicit (e.g., observable objects, interpersonal interactions) or implicit (e.g., cultural context, off-screen events), that drive emotional responses. Unlike traditional emotion recognition, EI tasks require reasoning about triggers instead of mere labeling. To facilitate EI research, we present EIBench, a large-scale benchmark encompassing 1,615 basic EI samples and 50 complex EI samples featuring multifaceted emotions. Each instance demands rationale-based explanations rather than straightforward categorization. We further propose a Coarse-to-Fine Self-Ask (CFSA) annotation pipeline, which guides Vision-Language Models (VLLMs) through iterative question-answer rounds to yield high-quality labels at scale. Extensive evaluations on open-source and proprietary large language models under four experimental settings reveal consistent performance gaps, especially in more intricate scenarios, underscoring EI's potential to enrich empathetic, context-aware AI applications. Our benchmark and methods are publicly available at https://github.com/Lum1104/EIBench, offering a foundation for advanced multimodal causal analysis and next-generation affective computing.
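To make the Coarse-to-Fine Self-Ask (CFSA) idea concrete, below is a minimal sketch of an iterative question-answer annotation loop. The query_vlm helper and prompts are hypothetical placeholders, not the paper's implementation; the official pipeline is in the linked repository and may differ in detail.

```python
# Minimal CFSA-style sketch, assuming a generic query_vlm(image_path, prompt) -> str
# helper. All names and prompts here are illustrative placeholders; see
# https://github.com/Lum1104/EIBench for the official annotation pipeline.

def query_vlm(image_path: str, prompt: str) -> str:
    """Placeholder for a call to any vision-language model API."""
    return "stub rationale"  # replace with a real VLM client call

def coarse_to_fine_self_ask(image_path: str, emotion: str, rounds: int = 3) -> str:
    """Iteratively refine an explanation of *why* the given emotion arises."""
    # Coarse pass: one open-ended question about the emotional trigger.
    rationale = query_vlm(
        image_path,
        f"The person in this image appears {emotion}. "
        "Briefly explain what in the scene might cause this emotion.",
    )
    # Fine passes: the model poses and answers its own follow-up questions,
    # grounding the rationale in explicit or implicit causal factors.
    for _ in range(rounds):
        rationale = query_vlm(
            image_path,
            "Given this draft rationale:\n"
            f"{rationale}\n"
            "Ask one clarifying question about an unexplained causal factor "
            "(object, interaction, cultural context, or off-screen event), "
            "answer it, and return an improved rationale.",
        )
    return rationale

print(coarse_to_fine_self_ask("example.jpg", "surprised"))
```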

Results

Task | Dataset | Metric | Value | Model
Emotion Interpretation | EIBench (complex) | Recall | 39.27 | ChatGPT-4o
Emotion Interpretation | EIBench (complex) | Recall | 39.16 | LLaVA-NEXT (13B)
Emotion Interpretation | EIBench (complex) | Recall | 38.71 | LLaVA-NEXT (7B)
Emotion Interpretation | EIBench (complex) | Recall | 38.1 | LLaVA-1.5 (13B)
Emotion Interpretation | EIBench (complex) | Recall | 35.37 | LLaVA-NEXT (34B)
Emotion Interpretation | EIBench (complex) | Recall | 35.1 | MiniGPT-v2
Emotion Interpretation | EIBench (complex) | Recall | 30.9 | Video-LLaVA
Emotion Interpretation | EIBench (complex) | Recall | 28 | ChatGPT-4V
Emotion Interpretation | EIBench (complex) | Recall | 27.9 | Otter
Emotion Interpretation | EIBench (complex) | Recall | 24 | Claude-3-haiku
Emotion Interpretation | EIBench (complex) | Recall | 22 | Qwen-VL-Chat
Emotion Interpretation | EIBench (complex) | Recall | 21.37 | Claude-3-sonnet
Emotion Interpretation | EIBench (complex) | Recall | 20.37 | Qwen-VL-Plus
Emotion Interpretation | EIBench | Recall | 63.24 | Claude-3-haiku
Emotion Interpretation | EIBench | Recall | 54.37 | LLaVA-1.5 (13B)
Emotion Interpretation | EIBench | Recall | 54.33 | LLaVA-NEXT (13B)
Emotion Interpretation | EIBench | Recall | 54.1 | Claude-3-sonnet
Emotion Interpretation | EIBench | Recall | 53.82 | LLaVA-NEXT (7B)
Emotion Interpretation | EIBench | Recall | 52.89 | MiniGPT-v2
Emotion Interpretation | EIBench | Recall | 49.99 | ChatGPT-4o
Emotion Interpretation | EIBench | Recall | 49.26 | Video-LLaVA
Emotion Interpretation | EIBench | Recall | 49.03 | LLaVA-NEXT (34B)
Emotion Interpretation | EIBench | Recall | 46.86 | ChatGPT-4V
Emotion Interpretation | EIBench | Recall | 42.81 | Otter
Emotion Interpretation | EIBench | Recall | 31 | Qwen-VL-Plus
Emotion Interpretation | EIBench | Recall | 26.45 | Qwen-VL-Chat
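The Recall values above are percentages (0-100). As a rough illustration of how such a score could be computed, the sketch below assumes recall is measured as the fraction of annotated causal factors recovered in a model's rationale; this is an assumption for illustration, not the benchmark's exact scoring protocol, which is available in the linked repository.

```python
# Illustrative only: keyword-overlap recall between a model's rationale and the
# annotated causal factors. EIBench's official evaluation code may differ; see
# https://github.com/Lum1104/EIBench.

def rationale_recall(predicted: str, gold_factors: list[str]) -> float:
    """Percentage of ground-truth causal factors mentioned in the prediction."""
    pred = predicted.lower()
    hits = sum(1 for factor in gold_factors if factor.lower() in pred)
    return 100.0 * hits / len(gold_factors) if gold_factors else 0.0

print(rationale_recall(
    "She smiles because her friend surprised her with a birthday cake.",
    ["surprise party", "birthday cake", "friend"],
))  # ~66.7: 2 of 3 annotated factors recovered
```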

Related Papers

Long-Short Distance Graph Neural Networks and Improved Curriculum Learning for Emotion Recognition in Conversation (2025-07-21)
Camera-based implicit mind reading by capturing higher-order semantic dynamics of human gaze within environmental context (2025-07-17)
A Robust Incomplete Multimodal Low-Rank Adaptation Approach for Emotion Recognition (2025-07-15)
Dynamic Parameter Memory: Temporary LoRA-Enhanced LLM for Long-Sequence Emotion Recognition in Conversation (2025-07-11)
CAST-Phys: Contactless Affective States Through Physiological signals Database (2025-07-08)
Exploring Remote Physiological Signal Measurement under Dynamic Lighting Conditions at Night: Dataset, Experiment, and Analysis (2025-07-06)
How to Retrieve Examples in In-context Learning to Improve Conversational Emotion Recognition using Large Language Models? (2025-06-25)
MATER: Multi-level Acoustic and Textual Emotion Representation for Interpretable Speech Emotion Recognition (2025-06-24)