Abstract
Attention-based models have so far been used with great success in the keyword spotting domain. However, in light of recent advances in deep learning, the question arises whether self-attention is truly irreplaceable for recognizing speech keywords. We therefore explore the use of gated MLPs, previously shown to be alternatives to transformers in vision tasks, for the keyword spotting task. We present a family of highly efficient MLP-based models for keyword spotting, each with fewer than 0.5 million parameters. We show that our approach achieves competitive performance on the Google Speech Commands V2-12 and V2-35 benchmarks with far fewer parameters than self-attention-based methods.
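The core idea behind gated MLPs is to replace self-attention's token mixing with a learned linear projection across the time (token) axis, gated elementwise against a parallel channel projection. Below is a minimal NumPy sketch of one such gMLP-style block; it is illustrative only, not the exact KW-MLP architecture (layer sizes, normalization placement, and the ReLU in place of GELU are simplifying assumptions).

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize over the channel (last) axis."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def gmlp_block(x, W_in, W_spatial, b_spatial, W_out):
    """One gated-MLP block (sketch).

    x         : (n_tokens, d_model) input features
    W_in      : (d_model, 2 * d_ffn) channel projection
    W_spatial : (n_tokens, n_tokens) token-mixing projection
    b_spatial : (n_tokens, 1) spatial bias
    W_out     : (d_ffn, d_model) output projection
    """
    y = layer_norm(x)
    y = np.maximum(y @ W_in, 0.0)       # channel expansion + ReLU (GELU in practice)
    u, v = np.split(y, 2, axis=-1)      # split channels for the gating branch
    v = layer_norm(v)
    v = W_spatial @ v + b_spatial       # mix information across tokens (replaces attention)
    return (u * v) @ W_out + x          # elementwise gate, project back, residual
```

Unlike attention, the token-mixing weights `W_spatial` are static (input-independent), which is what keeps the parameter and compute budget so small.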
Results
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Keyword Spotting | Google Speech Commands V2-35 | Accuracy (%) | 97.56 | KW-MLP |
Related Papers
- Enhancing Few-shot Keyword Spotting Performance through Pre-Trained Self-supervised Speech Models (2025-06-21)
- Low-resource keyword spotting using contrastively trained transformer acoustic word embeddings (2025-06-21)
- ASAP-FE: Energy-Efficient Feature Extraction Enabling Multi-Channel Keyword Spotting on Edge Processors (2025-06-17)
- GLAP: General contrastive audio-text pretraining across domains and languages (2025-06-12)
- Advances in Small-Footprint Keyword Spotting: A Comprehensive Review of Efficient Models and Algorithms (2025-06-12)
- SPBA: Utilizing Speech Large Language Model for Backdoor Attacks on Speech Classification Models (2025-06-10)
- Implementing Keyword Spotting on the MCUX947 Microcontroller with Integrated NPU (2025-06-10)
- Assessing the Impact of Anisotropy in Neural Representations of Speech: A Case Study on Keyword Spotting (2025-06-06)