Multi-modal Classification on VGG-Sound
Metric: Top-1 Accuracy (higher is better)
Results

| # | Model | Top-1 Accuracy | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | MMT | 66.2 | No | - | - | - |
| 2 | CAV-MAE (Audio-Visual) | 65.9 | Yes | Contrastive Audio-Visual Masked Autoencoder | 2022-10-02 | Code |
| 3 | UAVM | 65.8 | Yes | UAVM: Towards Unifying Audio and Visual Models | 2022-07-29 | Code |
| 4 | AVT | 63.9 | No | - | - | - |