Dou Hu, Yinan Bao, Lingwei Wei, Wei Zhou, Songlin Hu
Extracting generalized and robust representations is a major challenge in emotion recognition in conversations (ERC). To address this, we propose a supervised adversarial contrastive learning (SACL) framework for learning class-spread structured representations in a supervised manner. SACL applies contrast-aware adversarial training to generate worst-case samples and uses joint class-spread contrastive learning to extract structured representations. It can effectively utilize label-level feature consistency and retain fine-grained intra-class features. To avoid the negative impact of adversarial perturbations on context-dependent data, we design a contextual adversarial training (CAT) strategy to learn more diverse features from context and enhance the model's context robustness. Under the framework with CAT, we develop a sequence-based SACL-LSTM to learn label-consistent and context-robust features for ERC. Experiments on three datasets show that SACL-LSTM achieves state-of-the-art performance on ERC. Extended experiments prove the effectiveness of SACL and CAT.
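
The two ingredients named in the abstract, adversarial perturbation and label-aware contrastive learning, can be illustrated with a minimal sketch. This is not the authors' SACL objective: it assumes a fast-gradient-style perturbation (the loss gradient rescaled to a fixed L2 norm) and a generic supervised contrastive loss, and the names `fgm_perturbation`, `sup_con_loss`, `epsilon`, and `temperature` are hypothetical choices made for illustration.

```python
import math


def l2_normalize(vec):
    """Return vec scaled to unit L2 norm (unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fgm_perturbation(grad, epsilon=1.0):
    """Assumed fast-gradient-style step: gradient direction scaled to L2 norm epsilon."""
    return [epsilon * g for g in l2_normalize(grad)]


def sup_con_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss over one batch (not the paper's exact loss)."""
    feats = [l2_normalize(f) for f in features]
    n = len(feats)
    losses = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        positives = [j for j in others if labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no same-label partner contributes nothing
        # Temperature-scaled cosine similarity between the anchor and every other sample.
        sims = {j: sum(a * b for a, b in zip(feats[i], feats[j])) / temperature
                for j in others}
        # Numerically stable log-sum-exp over all other samples in the batch.
        peak = max(sims.values())
        log_denom = peak + math.log(sum(math.exp(s - peak) for s in sims.values()))
        # Average negative log-probability of picking a same-label sample.
        losses.append(-sum(sims[j] - log_denom for j in positives) / len(positives))
    return sum(losses) / len(losses) if losses else 0.0
```
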
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Emotion Recognition | EmoryNLP | Micro-F1 | 43.19 | SACL-LSTM (one seed) |
| Emotion Recognition | EmoryNLP | Weighted-F1 | 40.47 | SACL-LSTM (one seed) |
| Emotion Recognition | EmoryNLP | Micro-F1 | 42.21 | SACL-LSTM |
| Emotion Recognition | EmoryNLP | Weighted-F1 | 39.65 | SACL-LSTM |
| Emotion Recognition | CMU-MOSEI-Sentiment | Accuracy | 38.6 | SACL-LSTM |
| Emotion Recognition | CMU-MOSEI-Sentiment | Weighted-F1 | 25.95 | SACL-LSTM |
| Emotion Recognition | IEMOCAP-4 | Accuracy | 80.7 | SACL-LSTM |
| Emotion Recognition | IEMOCAP-4 | Weighted-F1 | 80.74 | SACL-LSTM |
| Emotion Recognition | MELD | Accuracy | 67.89 | SACL-LSTM (one seed) |
| Emotion Recognition | MELD | Weighted-F1 | 66.86 | SACL-LSTM (one seed) |
| Emotion Recognition | MELD | Accuracy | 67.51 | SACL-LSTM |
| Emotion Recognition | MELD | Weighted-F1 | 66.45 | SACL-LSTM |
| Emotion Recognition | IEMOCAP | Accuracy | 69.62 | SACL-LSTM (one seed) |
| Emotion Recognition | IEMOCAP | Weighted-F1 | 69.7 | SACL-LSTM (one seed) |
| Emotion Recognition | IEMOCAP | Accuracy | 69.08 | SACL-LSTM |
| Emotion Recognition | IEMOCAP | Weighted-F1 | 69.22 | SACL-LSTM |
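
For readers comparing the metrics in the table, the following is a minimal sketch of how accuracy and weighted-F1 are commonly computed, assuming each utterance carries exactly one emotion label. The helper names are hypothetical, and the functions return fractions in [0, 1], whereas the table values appear to be percentages.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of utterances whose predicted label matches the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to each class's gold support."""
    support = Counter(y_true)
    total = 0.0
    for cls, count in support.items():
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += count * f1
    return total / len(y_true)
```

In this single-label setting, micro-averaged F1 over all classes equals accuracy, because every misclassified utterance counts once as a false positive and once as a false negative.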