Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Slow-Fast Auditory Streams For Audio Recognition

Evangelos Kazakos, Arsha Nagrani, Andrew Zisserman, Dima Damen

2021-03-05 | Audio Classification | Human Interaction Recognition

Abstract

We propose a two-stream convolutional network for audio recognition that operates on time-frequency spectrogram inputs. Following similar success in visual recognition, we learn Slow-Fast auditory streams with separable convolutions and multi-level lateral connections. The Slow pathway has high channel capacity, while the Fast pathway operates at a fine-grained temporal resolution. We showcase the importance of our two-stream proposal on two diverse datasets, VGG-Sound and EPIC-KITCHENS-100, and achieve state-of-the-art results on both.
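The two-stream layout described in the abstract can be illustrated at the level of tensor shapes. The following is a minimal sketch, not the authors' implementation: the temporal stride `alpha`, the channel ratio `beta`, the channel count, and all function names are hypothetical, borrowed from the usual SlowFast convention in visual recognition, and the lateral connection is simplified to a time-strided channel concatenation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamShape:
    """Feature-map shape of one pathway: channels x time frames x frequency bins."""
    channels: int
    frames: int
    freq_bins: int


def split_streams(spectrogram, alpha=4):
    """Build both pathway inputs from a [time][frequency] spectrogram.

    Assumption: the Fast pathway sees every frame, while the Slow pathway
    sees every `alpha`-th frame (hypothetical value, not from the abstract).
    """
    fast = [row[:] for row in spectrogram]
    slow = [row[:] for row in spectrogram[::alpha]]
    return slow, fast


def pathway_shapes(n_frames, n_freq, slow_channels=64, alpha=4, beta=8):
    """Shapes after the first stage of each pathway.

    Slow: many channels, coarse time axis.
    Fast: `beta` times fewer channels, full time resolution.
    """
    slow_frames = len(range(0, n_frames, alpha))
    slow = StreamShape(slow_channels, slow_frames, n_freq)
    fast = StreamShape(slow_channels // beta, n_frames, n_freq)
    return slow, fast


def lateral_fuse(slow, fast, alpha=4):
    """Shape-level Fast-to-Slow lateral connection.

    Simplification: stride the Fast features in time by `alpha` so the time
    axes match, then concatenate along the channel axis.
    """
    strided_frames = len(range(0, fast.frames, alpha))
    if strided_frames != slow.frames or fast.freq_bins != slow.freq_bins:
        raise ValueError("pathway shapes are not compatible for fusion")
    return StreamShape(slow.channels + fast.channels, slow.frames, slow.freq_bins)


# Example: a 400-frame, 128-bin spectrogram.
slow_shape, fast_shape = pathway_shapes(n_frames=400, n_freq=128)
fused_shape = lateral_fuse(slow_shape, fast_shape)
print(slow_shape, fast_shape, fused_shape)
```

With these assumed settings the Slow pathway holds 64 channels over 100 frames, the Fast pathway holds 8 channels over all 400 frames, and the fused Slow features carry 72 channels, which shows the capacity-versus-resolution trade-off between the two streams.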

Results

Task | Dataset | Metric | Value | Model
Human Interaction Recognition | EPIC-SOUNDS | Top-1 accuracy (%) | 55.11 | Slow-Fast (Finetune by Fivewin team)

Related Papers

Task-Specific Audio Coding for Machines: Machine-Learned Latent Features Are Codes for That Machine (2025-07-17)
MUPAX: Multidimensional Problem Agnostic eXplainable AI (2025-07-17)
Neuromorphic Wireless Split Computing with Resonate-and-Fire Neurons (2025-06-24)
Fully Few-shot Class-incremental Audio Classification Using Multi-level Embedding Extractor and Ridge Regression Classifier (2025-06-23)
Adaptive Differential Denoising for Respiratory Sounds Classification (2025-06-03)
Spectrotemporal Modulation: Efficient and Interpretable Feature Representation for Classifying Speech, Music, and Environmental Sounds (2025-05-29)
Patient-Aware Feature Alignment for Robust Lung Sound Classification: Cohesion-Separation and Global Alignment Losses (2025-05-28)
4,500 Seconds: Small Data Training Approaches for Deep UAV Audio Classification (2025-05-21)