
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


End-to-End Audio Strikes Back: Boosting Augmentations Towards An Efficient Audio Classification Network

Avi Gazneli, Gadi Zimerman, Tal Ridnik, Gilad Sharir, Asaf Noy

2022-04-25
Tasks: Keyword Spotting, Environmental Sound Classification, Sound Classification, Audio Classification, Classification

Abstract

While efficient architectures and a plethora of augmentations for end-to-end image classification tasks have been suggested and heavily investigated, state-of-the-art techniques for audio classifications still rely on numerous representations of the audio signal together with large architectures, fine-tuned from large datasets. By utilizing the inherited lightweight nature of audio and novel audio augmentations, we were able to present an efficient end-to-end network with strong generalization ability. Experiments on a variety of sound classification sets demonstrate the effectiveness and robustness of our approach, by achieving state-of-the-art results in various settings. Public code is available at: https://github.com/Alibaba-MIIL/AudioClassfication

Results

Task                  Dataset                 Metric                        Value   Model
Keyword Spotting      Google Speech Commands  Google Speech Commands V2 35  98.15   EAT-S
Audio Classification  ESC-50                  Accuracy (5-fold)             96.3    EAT-M
Audio Classification  ESC-50                  Top-1 Accuracy                96.3    EAT-M
Audio Classification  ESC-50                  Accuracy (5-fold)             95.25   EAT-S
Audio Classification  ESC-50                  Top-1 Accuracy                95.25   EAT-S
Audio Classification  ESC-50                  Accuracy (5-fold)             92.15   EAT-S (scratch)
Audio Classification  ESC-50                  Top-1 Accuracy                92.15   EAT-S (scratch)
Audio Classification  AudioSet                Test mAP                      0.426   EAT-M
Audio Classification  AudioSet                Test mAP                      0.405   EAT-S
Classification        ESC-50                  Accuracy (5-fold)             96.3    EAT-M
Classification        ESC-50                  Top-1 Accuracy                96.3    EAT-M
Classification        ESC-50                  Accuracy (5-fold)             95.25   EAT-S
Classification        ESC-50                  Top-1 Accuracy                95.25   EAT-S
Classification        ESC-50                  Accuracy (5-fold)             92.15   EAT-S (scratch)
Classification        ESC-50                  Top-1 Accuracy                92.15   EAT-S (scratch)
Classification        AudioSet                Test mAP                      0.426   EAT-M
Classification        AudioSet                Test mAP                      0.405   EAT-S
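
For readers unfamiliar with the two metric families in the results above, the following is a minimal sketch of how a k-fold accuracy and a mean average precision (mAP) are commonly computed. It is not taken from the paper or its repository: the function names are hypothetical, the toy inputs are invented for illustration, and the handling of classes with no positive examples (scored as 0.0 here) is an assumption that may differ from the evaluation protocol the authors used.

```python
from typing import Sequence


def mean_fold_accuracy(fold_results: Sequence[Sequence[bool]]) -> float:
    """Average the per-fold accuracies (in percent) over all folds.

    Each inner sequence holds one flag per test clip: True if the
    prediction for that clip was correct in that fold.
    """
    per_fold = [sum(fold) / len(fold) for fold in fold_results]
    return 100.0 * sum(per_fold) / len(per_fold)


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision for a single class.

    Clips are ranked by descending score; the precision is recorded at
    every rank where a positive clip appears, and those values are averaged.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    # Assumption: a class with no positives contributes 0.0.
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(
    label_rows: Sequence[Sequence[int]],
    score_rows: Sequence[Sequence[float]],
) -> float:
    """Unweighted mean of per-class average precision.

    Rows are clips and columns are classes (multi-label setting).
    """
    n_classes = len(label_rows[0])
    per_class = [
        average_precision(
            [row[c] for row in label_rows],
            [row[c] for row in score_rows],
        )
        for c in range(n_classes)
    ]
    return sum(per_class) / n_classes


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    folds = [[True, False], [True, True]]
    labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
    scores = [[0.9, 0.2], [0.8, 0.6], [0.7, 0.9], [0.1, 0.3]]
    print(mean_fold_accuracy(folds))
    print(mean_average_precision(labels, scores))
```

The accuracy helper averages per-fold accuracies rather than pooling all clips, which gives the same number when folds are equally sized, as is the usual arrangement for ESC-50.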

Related Papers

Task-Specific Audio Coding for Machines: Machine-Learned Latent Features Are Codes for That Machine (2025-07-17)
MUPAX: Multidimensional Problem Agnostic eXplainable AI (2025-07-17)
Adversarial attacks to image classification systems using evolutionary algorithms (2025-07-17)
Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation (2025-07-16)
Safeguarding Federated Learning-based Road Condition Classification (2025-07-16)
AI-Enhanced Pediatric Pneumonia Detection: A CNN-Based Approach Using Data Augmentation and Generative Adversarial Networks (GANs) (2025-07-13)
Fuzzy Classification Aggregation for a Continuum of Agents (2025-07-06)
Hybrid-View Attention for csPCa Classification in TRUS (2025-07-04)