Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Multi-label Music Genre Classification from Audio, Text, and Images Using Deep Features

Sergio Oramas, Oriol Nieto, Francesco Barbieri, Xavier Serra

2017-07-16 · Music Genre Classification · General Classification · Genre classification

Abstract

Music genres make it possible to categorize musical items that share common characteristics. Although these categories are not mutually exclusive, most related research has traditionally focused on classifying tracks into a single class. Furthermore, these categories (e.g., Pop, Rock) tend to be too broad for certain applications. In this work, we aim to expand this task by categorizing musical items into multiple fine-grained labels, using three different data modalities: audio, text, and images. To this end, we present MuMu, a new dataset of more than 31k albums classified into 250 genre classes. For every album, we have collected the cover image, text reviews, and audio tracks. Additionally, we propose an approach for multi-label genre classification based on the combination of feature embeddings learned with state-of-the-art deep learning methodologies. Experiments show major differences between modalities, which not only introduce new baselines for multi-label genre classification, but also suggest that combining them yields improved results.
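One way to picture the combination of per-modality embeddings mentioned in the abstract is concatenation followed by an independent sigmoid output per genre, so that an album can receive several labels at once. The sketch below is illustrative only and is not the authors' implementation: the function names, the toy vectors, and the hand-set weights are all assumptions made for this example, and in practice the embeddings and weights would be learned.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so no modality dominates by magnitude."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def fuse(audio, text, image):
    """Concatenate the normalized audio, text, and image embeddings."""
    return l2_normalize(audio) + l2_normalize(text) + l2_normalize(image)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_genres(fused, weights, biases, threshold=0.5):
    """Score each genre independently (multi-label, not softmax).

    weights holds one row per genre; returns (selected indices, probabilities).
    """
    probs = [
        sigmoid(sum(w * x for w, x in zip(row, fused)) + b)
        for row, b in zip(weights, biases)
    ]
    selected = [i for i, p in enumerate(probs) if p >= threshold]
    return selected, probs


# Toy embeddings for one album (hypothetical values, not from MuMu).
audio = [3.0, 4.0]
text = [1.0, 0.0]
image = [0.0, 2.0]
fused = fuse(audio, text, image)

# Hand-set weights for three hypothetical genres; real weights would be trained.
weights = [
    [1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [-1.0, -1.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
]
biases = [0.0, 0.0, -0.5]

labels, probs = predict_genres(fused, weights, biases)
print(labels)  # indices of every genre whose probability reaches the threshold
```

Because each genre has its own sigmoid, the probabilities do not have to sum to one, which is what allows an album to carry more than one genre label.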

Results

Task | Dataset | Metric | Value | Model
Image Classification | FMA | CNN | 855 | cnn

Related Papers

CMI-Bench: A Comprehensive Benchmark for Evaluating Music Instruction Following (2025-06-14)
Recognizing Ornaments in Vocal Indian Art Music with Active Annotation (2025-05-07)
An Adaptive Data-Resilient Multi-Modal Framework for Hierarchical Multi-Label Book Genre Identification (2025-05-05)
Are you SURE? Enhancing Multimodal Pretraining with Missing Modalities through Uncertainty Estimation (2025-04-18)
Progressive Rock Music Classification (2025-04-15)
Specialized text classification: an approach to classifying Open Banking transactions (2025-04-10)
Predicting Movie Production Years through Facial Recognition of Actors with Machine Learning (2025-04-01)
M2D2: Exploring General-purpose Audio-Language Representations Beyond CLAP (2025-03-28)