Multi-Task Learning with Auxiliary Speaker Identification for Conversational Emotion Recognition
Jingye Li, Meishan Zhang, Donghong Ji, Yijiang Liu
Abstract
Conversational emotion recognition (CER) has attracted increasing interest in the natural language processing (NLP) community. Unlike vanilla emotion recognition, CER faces one major challenge: learning effective speaker-sensitive utterance representations. In this paper, we exploit speaker identification (SI) as an auxiliary task to enhance utterance representations in conversations. With this method, we can learn better speaker-aware contextual representations from the additional SI corpus. Experiments on two benchmark datasets demonstrate that the proposed architecture is highly effective for CER, obtaining new state-of-the-art results on both.
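The core idea described in the abstract, a shared utterance encoder trained jointly on the main CER objective and an auxiliary SI objective, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the encoder, layer sizes, and the mixing weight `aux_weight` are all hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(probs, labels):
    # mean negative log-likelihood of the gold labels
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

# Shared encoder (here just one linear layer) feeding two task-specific heads.
d_in, d_hid, n_emotions, n_speakers = 16, 8, 6, 4
W_shared = rng.normal(size=(d_in, d_hid))
W_emo = rng.normal(size=(d_hid, n_emotions))   # main CER head
W_spk = rng.normal(size=(d_hid, n_speakers))   # auxiliary SI head

def joint_loss(x, emo_labels, spk_labels, aux_weight=0.5):
    h = np.tanh(x @ W_shared)                  # shared utterance representation
    loss_cer = cross_entropy(softmax(h @ W_emo), emo_labels)
    loss_si = cross_entropy(softmax(h @ W_spk), spk_labels)
    # multi-task objective: main loss plus weighted auxiliary loss
    return loss_cer + aux_weight * loss_si

# Toy batch of 5 utterance vectors with emotion and speaker labels.
x = rng.normal(size=(5, d_in))
emo = rng.integers(0, n_emotions, size=5)
spk = rng.integers(0, n_speakers, size=5)
loss = joint_loss(x, emo, spk)
```

Because the SI head shares the encoder, gradients from the auxiliary loss push the shared representation to encode speaker information, which is what the paper argues benefits CER.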