Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Contrastive Learning of General-Purpose Audio Representations

Aaqib Saeed, David Grangier, Neil Zeghidour

2020-10-21 | Speaker Identification | Contrastive Learning

Abstract

We introduce COLA, a self-supervised pre-training approach for learning a general-purpose representation of audio. Our approach is based on contrastive learning: it learns a representation which assigns high similarity to audio segments extracted from the same recording while assigning lower similarity to segments from different recordings. We build on top of recent advances in contrastive learning for computer vision and reinforcement learning to design a lightweight, easy-to-implement self-supervised model of audio. We pre-train embeddings on the large-scale Audioset database and transfer these representations to 9 diverse classification tasks, including speech, music, animal sounds, and acoustic scenes. We show that despite its simplicity, our method significantly outperforms previous self-supervised systems. We furthermore conduct ablation studies to identify key design choices and release a library to pre-train and fine-tune COLA models.
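The contrastive idea in the abstract (high similarity for segments from the same recording, lower similarity for segments from different recordings) can be illustrated with a minimal sketch. This is not the authors' released library: the function name `contrastive_loss`, the cosine similarity, the `temperature` value, and the random stand-in embeddings are all assumptions made for illustration, and the paper's actual similarity measure and encoder may differ.

```python
import numpy as np

def contrastive_loss(anchors, positives, temperature=0.1):
    """Softmax cross-entropy over a batch of paired embeddings.

    Row i of `anchors` and row i of `positives` are assumed to come from
    the same recording; every other row in the batch acts as a negative.
    Cosine similarity and the temperature are illustrative assumptions.
    """
    # L2-normalise so the dot product equals cosine similarity.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)

    # logits[i, j] = similarity between anchor i and candidate j.
    logits = a @ p.T / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The correct "class" for anchor i is candidate i (the diagonal).
    return float(-np.mean(np.diag(log_probs)))

# Hypothetical embeddings standing in for an audio encoder's output.
rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 16))
same_recording = anchors + 0.01 * rng.normal(size=(8, 16))
other_recording = np.roll(same_recording, 1, axis=0)  # misaligned pairs

matched_loss = contrastive_loss(anchors, same_recording)
mismatched_loss = contrastive_loss(anchors, other_recording)
print(matched_loss, mismatched_loss)
```

In this toy setup the loss is lower when each anchor is paired with a segment embedding from its own recording than when the pairs are misaligned, which is the behaviour a contrastive objective rewards during pre-training.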

Results

Task | Dataset | Metric | Value | Model
Speaker Identification | VoxCeleb1 | Accuracy | 37.7 | COLA
Speaker Identification | VoxCeleb1 | Top-1 (%) | 37.7 | COLA
Spoken Command Recognition | Speech Command v2 | Accuracy | 95.5 | COLA

Related Papers

SemCSE: Semantic Contrastive Sentence Embeddings Using LLM-Generated Summaries For Scientific Abstracts (2025-07-17)
HapticCap: A Multimodal Dataset and Task for Understanding User Experience of Vibration Haptic Signals (2025-07-17)
Overview of the TalentCLEF 2025: Skill and Job Title Intelligence for Human Capital Management (2025-07-17)
SGCL: Unifying Self-Supervised and Supervised Learning for Graph Recommendation (2025-07-17)
Similarity-Guided Diffusion for Contrastive Sequential Recommendation (2025-07-16)
LLM-Driven Dual-Level Multi-Interest Modeling for Recommendation (2025-07-15)
Latent Space Consistency for Sparse-View CT Reconstruction (2025-07-15)
Self-supervised pretraining of vision transformers for animal behavioral analysis and neural encoding (2025-07-13)