Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

LHGNN: Local-Higher Order Graph Neural Networks For Audio Classification and Tagging

Shubhr Singh, Emmanouil Benetos, Huy Phan, Dan Stowell

2025-01-07 | Audio Classification

Abstract

Transformers have set new benchmarks in audio processing tasks, leveraging self-attention mechanisms to capture complex patterns and dependencies within audio data. However, their focus on pairwise interactions limits their ability to process the higher-order relations essential for identifying distinct audio objects. To address this limitation, this work introduces the Local-Higher Order Graph Neural Network (LHGNN), a graph-based model that enhances feature understanding by integrating local neighbourhood information with higher-order data from Fuzzy C-Means clusters, thereby capturing a broader spectrum of audio relationships. Evaluation of the model on three publicly available audio datasets shows that it outperforms Transformer-based models across all benchmarks while operating with substantially fewer parameters. Moreover, LHGNN demonstrates a distinct advantage in scenarios lacking ImageNet pretraining, establishing its effectiveness and efficiency in environments where extensive pretraining data is unavailable.
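
The abstract mentions Fuzzy C-Means clusters as the source of higher-order information. As a rough illustration of what such clustering produces, here is a minimal sketch of the standard Fuzzy C-Means algorithm in pure Python. It is not the authors' implementation: the function names, the fuzzifier m = 2.0, the fixed iteration count, and the toy 2-D points are all assumptions made for this example, and the paper may apply the clustering to learned audio features in a different way.

```python
import math


def fcm_memberships(points, centres, m=2.0):
    """Soft membership of each point in each cluster; every row sums to 1."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centres]
        if any(d == 0.0 for d in dists):
            # Point sits exactly on a centre: share membership among those centres.
            zeros = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(zeros)
            memberships.append([z / total for z in zeros])
            continue
        # Standard FCM rule: u_j = 1 / sum_k (d_j / d_k)^(2 / (m - 1)).
        row = [1.0 / sum((d_j / d_k) ** exponent for d_k in dists) for d_j in dists]
        memberships.append(row)
    return memberships


def fcm_update_centres(points, memberships, m=2.0):
    """Recompute each centre as the mean of all points weighted by u^m."""
    n_clusters = len(memberships[0])
    dim = len(points[0])
    centres = []
    for j in range(n_clusters):
        weights = [u[j] ** m for u in memberships]
        total = sum(weights)
        centres.append(
            [sum(w * p[d] for w, p in zip(weights, points)) / total for d in range(dim)]
        )
    return centres


def fuzzy_c_means(points, init_centres, m=2.0, n_iter=50):
    """Alternate membership and centre updates for a fixed number of iterations."""
    centres = [list(c) for c in init_centres]
    for _ in range(n_iter):
        u = fcm_memberships(points, centres, m)
        centres = fcm_update_centres(points, u, m)
    return centres, fcm_memberships(points, centres, m)


# Toy data (hypothetical): two well-separated groups of 2-D points.
points = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
centres, memberships = fuzzy_c_means(points, init_centres=[[0.0, 0.0], [5.0, 5.0]])
```

Unlike hard clustering, each point receives a graded membership in every cluster. That soft assignment is one plausible way for a single feature to take part in several higher-order groupings at once.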

Results

Task                  Dataset    Metric          Value  Model
Audio Classification  ESC-50     Top-1 Accuracy  96.2   LHGNN
Audio Classification  Audio Set  Mean AP         46.6   LHGNN
Audio Classification  FSD50K     Mean AP         59     LHGNN
Classification        ESC-50     Top-1 Accuracy  96.2   LHGNN
Classification        Audio Set  Mean AP         46.6   LHGNN
Classification        FSD50K     Mean AP         59     LHGNN
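
For readers unfamiliar with the Mean AP metric reported above, the following is a minimal sketch of one common way to compute it for multi-label tagging: non-interpolated average precision per class, averaged over classes. The function names and toy inputs are hypothetical, ties between scores are not treated specially, and the exact evaluation protocol behind the listed values may differ. The listed values appear to be percentages, whereas this sketch returns a fraction between 0 and 1.

```python
def average_precision(scores, labels):
    """Non-interpolated AP: mean of precision@rank taken at each positive item."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank
    # A class with no positive examples contributes 0.0 in this sketch.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(score_matrix, label_matrix):
    """Average the per-class AP over all classes (columns of the matrices)."""
    n_classes = len(score_matrix[0])
    aps = [
        average_precision(
            [row[c] for row in score_matrix], [row[c] for row in label_matrix]
        )
        for c in range(n_classes)
    ]
    return sum(aps) / n_classes


# Toy example (hypothetical): 4 clips, 2 classes.
score_matrix = [[0.9, 0.2], [0.8, 0.7], [0.7, 0.1], [0.6, 0.9]]
label_matrix = [[1, 0], [0, 1], [1, 0], [0, 1]]
ap_class0 = average_precision([r[0] for r in score_matrix], [r[0] for r in label_matrix])
map_value = mean_average_precision(score_matrix, label_matrix)
```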

Related Papers

Task-Specific Audio Coding for Machines: Machine-Learned Latent Features Are Codes for That Machine (2025-07-17)
MUPAX: Multidimensional Problem Agnostic eXplainable AI (2025-07-17)
Neuromorphic Wireless Split Computing with Resonate-and-Fire Neurons (2025-06-24)
Fully Few-shot Class-incremental Audio Classification Using Multi-level Embedding Extractor and Ridge Regression Classifier (2025-06-23)
Adaptive Differential Denoising for Respiratory Sounds Classification (2025-06-03)
Spectrotemporal Modulation: Efficient and Interpretable Feature Representation for Classifying Speech, Music, and Environmental Sounds (2025-05-29)
Patient-Aware Feature Alignment for Robust Lung Sound Classification: Cohesion-Separation and Global Alignment Losses (2025-05-28)
4,500 Seconds: Small Data Training Approaches for Deep UAV Audio Classification (2025-05-21)