Audio Classification on ESC-50
Metric: Top-1 Accuracy (higher is better)
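As a minimal sketch of how a top-1 accuracy figure like the ones reported here can be computed: a prediction counts as correct when the highest-scoring class matches the ground-truth label. The function name `top1_accuracy`, the toy scores, and the choice to report a percentage are illustrative assumptions, not the evaluation code used by any listed paper.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the percentage of examples whose highest-scoring class equals the label.

    Hypothetical helper for illustration; reporting a percentage is an assumption.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("need at least one example")

    correct = 0
    for row, label in zip(scores, labels):
        # argmax over class scores; on ties, max() keeps the first index
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += int(predicted == label)
    return 100.0 * correct / len(labels)


# Toy data (made up): 4 clips, 3 classes
scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
labels = [1, 0, 2, 2]
print(top1_accuracy(scores, labels))  # 3 of 4 correct -> 75.0
```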
Results
| # | Model | Top-1 Accuracy | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | OmniVec2 | 99.1 | Yes | - | - | - |
| 2 | InternVideo2 | 98.6 | Yes | InternVideo2: Scaling Foundation Models for Mult... | 2024-03-22 | Code |
| 3 | M2D2 AS+ | 98.5 | Yes | M2D2: Exploring General-purpose Audio-Language R... | 2025-03-28 | Code |
| 4 | OmniVec | 98.4 | Yes | OmniVec: Learning robust representations with cr... | 2023-11-07 | - |
| 5 | BEATs | 98.1 | Yes | BEATs: Audio Pre-Training with Acoustic Tokenizers | 2022-12-18 | Code |
| 6 | mn40_as | 97.45 | Yes | Efficient Large-scale Audio Tagging via Transfor... | 2022-11-09 | Code |
| 7 | DyMN-L | 97.4 | Yes | Dynamic Convolutional Neural Networks as Efficie... | 2023-10-24 | Code |
| 8 | M2D-CLAP/0.7 | 97.4 | Yes | M2D-CLAP: Masked Modeling Duo Meets CLAP for Lea... | 2024-06-04 | Code |
| 9 | M2D-AS/0.7 | 97.2 | Yes | Masked Modeling Duo: Towards a Universal Audio P... | 2024-04-09 | Code |
| 10 | HTS-AT | 97 | Yes | HTS-AT: A Hierarchical Token-Semantic Audio Tran... | 2022-02-02 | Code |
| 11 | EAT-M | 96.3 | Yes | End-to-End Audio Strikes Back: Boosting Augmenta... | 2022-04-25 | Code |
| 12 | LHGNN | 96.2 | No | LHGNN: Local-Higher Order Graph Neural Networks ... | 2025-01-07 | - |
| 13 | ERANN-2-5 | 96.1 | No | - | - | - |
| 14 | M2D/0.7 | 96 | Yes | Masked Modeling Duo: Towards a Universal Audio P... | 2024-04-09 | Code |
| 15 | EAT | 96 | Yes | EAT: Self-Supervised Pre-Training with Efficient... | 2024-01-07 | Code |
| 16 | Audio Spectrogram Transformer | 95.7 | Yes | AST: Audio Spectrogram Transformer | 2021-04-05 | Code |
| 17 | EAT-S | 95.25 | Yes | End-to-End Audio Strikes Back: Boosting Augmenta... | 2022-04-25 | Code |
| 18 | MATPAC (SSL model, linear eval) | 93.5 | No | Masked Latent Prediction and Classification for ... | 2025-02-17 | Code |
| 19 | EAT-S (scratch) | 92.15 | No | End-to-End Audio Strikes Back: Boosting Augmenta... | 2022-04-25 | Code |
| 20 | SepTr + LeRaC | 91.58 | No | Learning Rate Curriculum | 2022-05-18 | Code |
| 21 | SepTr | 91.13 | No | SepTr: Separable Transformer for Audio Spectrogr... | 2022-03-17 | Code |
| 22 | Multi-Format Contrastive | 90.5 | Yes | Multi-Format Contrastive Learning of Audio Repre... | 2021-03-11 | - |
| 23 | Multi-Channel Audio Feature with CNN | 89.5 | No | - | - | - |
| 24 | AVID | 89.2 | No | Audio-Visual Instance Discrimination with Cross-... | 2020-04-27 | Code |
| 25 | ACDNet | 87.1 | No | Environmental Sound Classification on the Edge: ... | 2021-03-05 | Code |
| 26 | XDC | 85.4 | No | Self-Supervised Learning by Cross-Modal Audio-Vi... | 2019-11-28 | Code |
| 27 | XDC | 84.8 | No | Self-Supervised Learning by Cross-Modal Audio-Vi... | 2019-11-28 | Code |
| 28 | AVTS | 82.3 | No | Cooperative Learning of Audio and Video Models f... | 2018-06-30 | - |
| 29 | L3 | 79.3 | No | Look, Listen and Learn | 2017-05-23 | Code |