
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Learning Features of Music from Scratch

John Thickstun, Zaid Harchaoui, Sham Kakade

2016-11-29 · Music Transcription · BIG-bench Machine Learning · Multi-Label Classification

Abstract

This paper introduces a new large-scale music dataset, MusicNet, to serve as a source of supervision and evaluation of machine learning methods for music research. MusicNet consists of hundreds of freely-licensed classical music recordings by 10 composers, written for 11 instruments, together with instrument/note annotations resulting in over 1 million temporal labels on 34 hours of chamber music performances under various studio and microphone conditions. The paper defines a multi-label classification task to predict notes in musical recordings, along with an evaluation protocol, and benchmarks several machine learning architectures for this task: i) learning from spectrogram features; ii) end-to-end learning with a neural net; iii) end-to-end learning with a convolutional neural net. These experiments show that end-to-end models trained for note prediction learn frequency selective filters as a low-level representation of audio.

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Music Transcription | MusicNet | APS | 67.8 | CNN (64 stride) |
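The result above scores a multi-label note-prediction task. As a minimal sketch, assuming "APS" denotes an average precision score and that predictions are pooled over every (frame, note) pair before ranking (the page does not spell out the exact evaluation protocol), the metric could be computed as follows. The function names and the toy label and score matrices are hypothetical and chosen only for illustration.

```python
def average_precision(labels, scores):
    """Average precision of a ranking: the mean of precision@k taken
    at each rank k where a positive label is found.

    Assumes scores are distinct; tied scores are ordered arbitrarily.
    """
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("need at least one positive label")
    # Rank predictions from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / total_positives


def pooled_average_precision(label_matrix, score_matrix):
    """Flatten frame-by-note matrices and score all pairs together
    (an assumed pooling choice, not confirmed by the source)."""
    flat_labels = [y for row in label_matrix for y in row]
    flat_scores = [s for row in score_matrix for s in row]
    return average_precision(flat_labels, flat_scores)


# Toy data: 3 audio frames, 4 candidate notes per frame.
# A 1 means the note is sounding in that frame.
toy_labels = [
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
]
toy_scores = [
    [0.90, 0.20, 0.10, 0.30],
    [0.40, 0.80, 0.05, 0.15],
    [0.25, 0.35, 0.70, 0.12],
]

toy_ap = pooled_average_precision(toy_labels, toy_scores)
print(f"{toy_ap:.4f}")  # 0.9167
```

In the toy data, three of the four active notes are ranked first, and the fourth falls to rank six behind two false alarms, which lowers the score from 1.0 to 11/12. A value reported as 67.8 would correspond to 0.678 on this 0 to 1 scale, under the same assumption about the metric.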
