Topological Deep Learning for Speech Data

Zhiwang Yu

2025-05-27

Speech Recognition, Phoneme Recognition, Topological Data Analysis, Deep Learning

Abstract

Topological data analysis (TDA) offers novel mathematical tools for deep learning. Inspired by the work of Carlsson et al., this study designs topology-aware convolutional kernels that significantly improve speech recognition networks. On the theoretical side, by investigating the action of the orthogonal group on kernels, we establish a fiber-bundle decomposition of matrix spaces, which enables new methods for generating filters. On the practical side, our proposed Orthogonal Feature (OF) layer achieves superior performance in phoneme recognition, particularly in low-noise scenarios, and demonstrates cross-domain adaptability. This work reveals the potential of TDA in neural network optimization and opens new avenues for interdisciplinary research between mathematics and deep learning.
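
To make the idea of an orthogonal group acting on kernels concrete, the following is a minimal sketch and not the paper's OF layer. The abstract does not say which action is used, so left multiplication K -> QK is an assumption chosen for illustration. The helper names `random_orthogonal` and `orbit_filters` are hypothetical.

```python
import numpy as np


def random_orthogonal(n, rng):
    """Return an n x n orthogonal matrix from the QR factorization of a Gaussian matrix."""
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q


def orbit_filters(base_kernel, num_filters, seed=0):
    """Generate filters Q @ base_kernel for several orthogonal Q.

    Assumption: the group acts by left multiplication. The source only
    states that orthogonal group actions on kernels are studied.
    """
    rng = np.random.default_rng(seed)
    n = base_kernel.shape[0]
    qs = [random_orthogonal(n, rng) for _ in range(num_filters)]
    return np.stack([q @ base_kernel for q in qs]), qs


# A simple 3x3 vertical-edge kernel as the base point of the orbit.
base = np.array([[1.0, 0.0, -1.0],
                 [2.0, 0.0, -2.0],
                 [1.0, 0.0, -1.0]])

filters, qs = orbit_filters(base, num_filters=4)

# Every filter in the orbit has the same Frobenius norm as the base kernel.
norms = np.linalg.norm(filters.reshape(len(filters), -1), axis=1)
```

Under this assumed action, each generated filter keeps the Frobenius norm of the base kernel, so the orbit stays on a sphere in the space of 3x3 matrices. That is one simple way to see why a group action can organize a family of filters.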
