Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Audio Embeddings as Teachers for Music Classification

Yiwei Ding, Alexander Lerch

2023-06-30
Transfer Learning, Music Auto-Tagging, Information Retrieval, Retrieval, Classification, Knowledge Distillation, Music Classification, Music Information Retrieval, Instrument Recognition

Abstract

Music classification has been one of the most popular tasks in the field of music information retrieval. With the development of deep learning models, the last decade has seen impressive improvements in a wide range of classification tasks. However, the increasing model complexity makes both training and inference computationally expensive. In this paper, we integrate the ideas of transfer learning and feature-based knowledge distillation and systematically investigate using pre-trained audio embeddings as teachers to guide the training of low-complexity student networks. By regularizing the feature space of the student networks with the pre-trained embeddings, the knowledge in the teacher embeddings can be transferred to the students. We use various pre-trained audio embeddings and test the effectiveness of the method on the tasks of musical instrument classification and music auto-tagging. Results show that our method significantly improves performance compared with the identical model trained without the teacher's knowledge. This technique can also be combined with classical knowledge distillation approaches to further improve the model's performance.
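The abstract describes regularizing the student's feature space with pre-trained teacher embeddings but does not state the distance measure or the loss weighting. The following is a minimal sketch under explicit assumptions, not the authors' implementation: it assumes a multi-label binary cross-entropy task loss, a regularizer that compares pairwise cosine distances within a batch (so student and teacher dimensions may differ), and a hypothetical weight `lam` that balances the two terms. All function names are illustrative.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v + 1e-12)

def pairwise_distances(batch):
    """Cosine distance between every pair of vectors in a batch."""
    return [[cosine_distance(x, y) for y in batch] for x in batch]

def feature_regularizer(student_feats, teacher_embs):
    """Mean absolute gap between the two pairwise-distance matrices.

    Assumption: distances are compared within each space rather than
    vector to vector, so the student and teacher dimensions can differ.
    """
    d_s = pairwise_distances(student_feats)
    d_t = pairwise_distances(teacher_embs)
    n = len(student_feats)
    gap = sum(abs(d_s[i][j] - d_t[i][j]) for i in range(n) for j in range(n))
    return gap / (n * n)

def binary_cross_entropy(probs, labels):
    """Mean binary cross-entropy over all items and all labels."""
    eps = 1e-7
    total, count = 0.0, 0
    for p_row, y_row in zip(probs, labels):
        for p, y in zip(p_row, y_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            count += 1
    return total / count

def total_loss(probs, labels, student_feats, teacher_embs, lam=0.5):
    """Weighted sum of task loss and feature regularizer (lam is hypothetical)."""
    task = binary_cross_entropy(probs, labels)
    reg = feature_regularizer(student_feats, teacher_embs)
    return (1.0 - lam) * task + lam * reg

# Toy batch: 2-D student features, 3-D (frozen) teacher embeddings.
student = [[1.0, 0.0], [0.0, 1.0]]
teacher = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
probs = [[0.9, 0.2], [0.3, 0.8]]
labels = [[1, 0], [0, 1]]
loss = total_loss(probs, labels, student, teacher, lam=0.5)
```

In the toy batch the student and teacher spaces have the same pairwise geometry, so the regularizer is close to zero and the loss is dominated by the task term. In an actual training setup the same computation would be written in an automatic-differentiation framework, with the teacher embeddings precomputed and kept fixed.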

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Music Auto-Tagging | MagnaTagATune (clean) | PR-AUC | 46.1 | EAsT-KD + PaSST |
| Music Auto-Tagging | MagnaTagATune (clean) | ROC-AUC | 91.5 | EAsT-KD + PaSST |
| Music Auto-Tagging | MagnaTagATune (clean) | PR-AUC | 45.9 | EAsT-Final + PaSST |
| Music Auto-Tagging | MagnaTagATune (clean) | ROC-AUC | 91.2 | EAsT-Final + PaSST |
| Instrument Recognition | OpenMIC-2018 | mean average precision | 0.852 | EAsT-KD + PaSST |
| Instrument Recognition | OpenMIC-2018 | mean average precision | 0.847 | EAsT-Final + PaSST |
