Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Broadcasted Residual Learning for Efficient Keyword Spotting

Byeonggeun Kim, Simyung Chang, Jinkyu Lee, Dooyong Sung

2021-06-08 · Keyword Spotting

Abstract

Keyword spotting is an important research field because it plays a key role in device wake-up and user interaction on smart devices. However, it is challenging to minimize errors while operating efficiently on devices with limited resources, such as mobile phones. We present a broadcasted residual learning method to achieve high accuracy with small model size and computational load. Our method configures most of the residual functions as 1D temporal convolutions while still allowing 2D convolution through a broadcasted-residual connection that expands the temporal output to the frequency-temporal dimension. This residual mapping enables the network to effectively represent useful audio features with much less computation than conventional convolutional neural networks. We also propose a novel network architecture, Broadcasting-residual network (BC-ResNet), based on broadcasted residual learning and describe how to scale up the model according to the target device's resources. BC-ResNets achieve state-of-the-art 98.0% and 98.7% top-1 accuracy on Google Speech Commands datasets v1 and v2, respectively, and consistently outperform previous approaches while using fewer computations and parameters. Code is available at https://github.com/Qualcomm-AI-research/bcresnet.
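
The broadcasting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). It assumes a single-channel feature map stored as a nested list indexed `[frequency][time]`, a fixed odd-length kernel in place of learned weights, and it leaves out the 2D convolution, normalization, and activations. The function names `freq_average`, `temporal_conv1d`, and `broadcasted_residual` are hypothetical and chosen only for this illustration.

```python
def freq_average(x):
    """Collapse a [freq][time] map to a 1D temporal sequence by averaging over frequency."""
    n_freq = len(x)
    n_time = len(x[0])
    return [sum(x[f][t] for f in range(n_freq)) / n_freq for t in range(n_time)]


def temporal_conv1d(seq, kernel):
    """Zero-padded, same-length 1D cross-correlation along time (odd-length kernel assumed)."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(seq) + [0.0] * pad
    return [
        sum(k * padded[t + i] for i, k in enumerate(kernel))
        for t in range(len(seq))
    ]


def broadcasted_residual(x, kernel):
    """Add a 1D temporal feature back onto every frequency row of the 2D input.

    Simplification (assumption): the temporal branch is a single fixed-kernel
    convolution, and the 2D frequency-wise convolution is omitted.
    """
    temporal = temporal_conv1d(freq_average(x), kernel)
    # Broadcast: the same temporal value is added at each frequency index.
    return [[value + temporal[t] for t, value in enumerate(row)] for row in x]


if __name__ == "__main__":
    features = [[1.0, 2.0, 3.0],
                [3.0, 4.0, 5.0]]  # 2 frequency bins, 3 time steps
    print(broadcasted_residual(features, [1.0, 1.0, 1.0]))
```

The point of the design is that the convolution runs on a sequence of length T rather than on an F × T map, so its cost does not grow with the number of frequency bins, while the residual sum still yields a full frequency-temporal output.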

Results

Task | Dataset | Metric | Value | Model
Keyword Spotting | Google Speech Commands | Google Speech Commands V1 12 | 98 | BC-ResNet-8
Keyword Spotting | Google Speech Commands | Google Speech Commands V2 12 | 98.7 | BC-ResNet-8

Related Papers

Enhancing Few-shot Keyword Spotting Performance through Pre-Trained Self-supervised Speech Models (2025-06-21)
Low-resource keyword spotting using contrastively trained transformer acoustic word embeddings (2025-06-21)
ASAP-FE: Energy-Efficient Feature Extraction Enabling Multi-Channel Keyword Spotting on Edge Processors (2025-06-17)
GLAP: General contrastive audio-text pretraining across domains and languages (2025-06-12)
Advances in Small-Footprint Keyword Spotting: A Comprehensive Review of Efficient Models and Algorithms (2025-06-12)
SPBA: Utilizing Speech Large Language Model for Backdoor Attacks on Speech Classification Models (2025-06-10)
Implementing Keyword Spotting on the MCUX947 Microcontroller with Integrated NPU (2025-06-10)
Assessing the Impact of Anisotropy in Neural Representations of Speech: A Case Study on Keyword Spotting (2025-06-06)