Directional Sparse Filtering using Weighted Lehmer Mean for Blind Separation of Unbalanced Speech Mixtures
Karn Watcharasupat, Anh H. T. Nguyen, Ching-Hui Ooi, Andy W. H. Khong
Abstract
In blind source separation of speech signals, the inherent imbalance in the source spectrum poses a challenge for methods that rely on single-source dominance for the estimation of the mixing matrix. We propose an algorithm based on the directional sparse filtering (DSF) framework that utilizes the Lehmer mean with learnable weights to adaptively account for source imbalance. Performance evaluation in multiple real acoustic environments shows improvements in source separation over the baseline methods.
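For readers unfamiliar with the Lehmer mean, the following is a minimal sketch of its generic weighted form, L_p(x; w) = sum_k w_k x_k^p / sum_k w_k x_k^(p-1), for positive inputs. The abstract does not say which quantities are averaged or how the weights are learned, so the function name, the inputs, and the fixed weights here are illustrative assumptions and not the authors' implementation.

```python
import numpy as np

def weighted_lehmer_mean(x, w, p):
    """Generic weighted Lehmer mean of positive values x with weights w.

    Hypothetical helper for illustration only; the paper's actual
    objective and weight parameterization are not given in the abstract.
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    # Ratio of the weighted power sums of order p and p - 1.
    return np.sum(w * x**p) / np.sum(w * x**(p - 1))

# With equal weights: p = 1 is the arithmetic mean, p = 0 the harmonic mean.
x = [1.0, 2.0, 4.0]
arithmetic = weighted_lehmer_mean(x, [1, 1, 1], p=1)
harmonic = weighted_lehmer_mean(x, [1, 1, 1], p=0)
# Large positive p approaches max(x); large negative p approaches min(x).
near_max = weighted_lehmer_mean(x, [1, 1, 1], p=50)
near_min = weighted_lehmer_mean(x, [1, 1, 1], p=-50)
# Unequal weights shift the mean toward the more heavily weighted entries.
weighted = weighted_lehmer_mean(x, [2, 1, 1], p=1)
```

The exponent p moves the mean between min-like and max-like behaviour, while the weights control how much each term contributes. That is the general sense in which weights could compensate for imbalance between terms.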