An Extended Variational Mode Decomposition Algorithm Developed Speech Emotion Recognition Performance

David Hason Rudd, Huan Huo, Guandong Xu

2023-12-18Pacific-Asia Conference on Knowledge Discovery and Data Mining 2023 5Sentiment Analysis Speech Emotion Recognition Emotion Recognition

Paper PDF Code

Abstract

Emotion recognition (ER) from speech signals is a robust approach since it cannot be imitated like facial expression or text based sentiment analysis. Valuable information underlying the emotions are significant for human-computer interactions enabling intelligent machines to interact with sensitivity in the real world. Previous ER studies through speech signal processing have focused exclusively on associations between different signal mode decomposition methods and hidden informative features. However, improper decomposition parameter selections lead to informative signal component losses due to mode duplicating and mixing. In contrast, the current study proposes VGG-optiVMD, an empowered variational mode decomposition algorithm, to distinguish meaningful speech features and automatically select the number of decomposed modes and optimum balancing parameter for the data fidelity constraint by assessing their effects on the VGG16 flattening output layer. Various feature vectors were employed to train the VGG16 network on different databases and assess VGG-optiVMD reproducibility and reliability. One, two, and three-dimensional feature vectors were constructed by concatenating Mel-frequency cepstral coefficients, Chromagram, Mel spectrograms, Tonnetz diagrams, and spectral centroids. Results confirmed a synergistic relationship between the fine-tuning of the signal sample rate and decomposition parameters with classification accuracy, achieving state-of-the-art 96.09% accuracy in predicting seven emotions on the Berlin EMO-DB database.

Results

Task	Dataset	Metric	Value	Model
Emotion Recognition	EMODB	1:1 Accuracy	96.09	VGG-optiVMD
Speech Emotion Recognition	EMODB	1:1 Accuracy	96.09	VGG-optiVMD

Related Papers

Long-Short Distance Graph Neural Networks and Improved Curriculum Learning for Emotion Recognition in Conversation2025-07-21 AdaptiSent: Context-Aware Adaptive Attention for Multimodal Aspect-Based Sentiment Analysis2025-07-17 Camera-based implicit mind reading by capturing higher-order semantic dynamics of human gaze within environmental context2025-07-17 AI Wizards at CheckThat! 2025: Enhancing Transformer-Based Embeddings with Sentiment for Subjectivity Detection in News Articles2025-07-15 DCR: Quantifying Data Contamination in LLMs Evaluation2025-07-15 A Robust Incomplete Multimodal Low-Rank Adaptation Approach for Emotion Recognition2025-07-15 SentiDrop: A Multi Modal Machine Learning model for Predicting Dropout in Distance Learning2025-07-14 Dynamic Parameter Memory: Temporary LoRA-Enhanced LLM for Long-Sequence Emotion Recognition in Conversation2025-07-11