Image Retrieval with Multi-Modal Query on SoundingEarth
Metric: Image-to-sound R@100 (higher is better)
Results
| # | Model | Image-to-sound R@100 | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | GeoCLAP | 0.434 | Yes | Learning Tri-modal Embeddings for Zero-Shot Soun... | 2023-09-19 | Code |
| 2 | ResNet-18 | 0.291 | No | Self-supervised Audiovisual Representation Learn... | 2021-08-02 | Code |
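
For readers unfamiliar with the metric, the sketch below shows one common way to compute image-to-sound recall at K (R@K with K = 100 here). It is not taken from either listed paper. It assumes that each image has exactly one ground-truth sound, that paired items share the same row index, and that similarity is cosine similarity between embeddings. The function name `recall_at_k` and its parameters are hypothetical.

```python
import numpy as np


def recall_at_k(image_emb, sound_emb, k=100):
    """Fraction of image queries whose paired sound ranks in the top k.

    Assumption: row i of image_emb and row i of sound_emb are the
    ground-truth pair (one sound per image).
    """
    image_emb = np.asarray(image_emb, dtype=float)
    sound_emb = np.asarray(sound_emb, dtype=float)

    # L2-normalise so the dot product equals cosine similarity.
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    sound_emb = sound_emb / np.linalg.norm(sound_emb, axis=1, keepdims=True)

    # sim[i, j] = similarity between image i and sound j.
    sim = image_emb @ sound_emb.T

    # Zero-based rank of the true match: the number of sounds that
    # score strictly higher than the paired sound for each image.
    true_scores = np.diag(sim)[:, None]
    ranks = (sim > true_scores).sum(axis=1)

    # A query counts as a hit when its true match is among the top k.
    return float((ranks < k).mean())
```

Under these assumptions, a score of 0.434 would mean that for about 43% of query images the paired sound appears among the 100 highest-ranked sounds.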