Metric: recall@5 (higher is better)
| # | Model | recall@5 | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | VisualSparta | 91.8 | No | VisualSparta: An Embarrassingly Simple Approach ... | 2021-01-01 | Code |
| 2 | BLIP-2 ViT-G (fine-tuned) | 87.7 | No | BLIP-2: Bootstrapping Language-Image Pre-trainin... | 2023-01-30 | Code |
| 3 | BLIP-2 ViT-L (fine-tuned) | 86.5 | No | BLIP-2: Bootstrapping Language-Image Pre-trainin... | 2023-01-30 | Code |
| 4 | FLAVA (zero-shot) | 67.47 | No | FLAVA: A Foundational Language And Vision Alignm... | 2021-12-08 | Code |
| 5 | CLIP (zero-shot) | 62.47 | No | FLAVA: A Foundational Language And Vision Alignm... | 2021-12-08 | Code |
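
For readers unfamiliar with the metric, the following is a minimal sketch of how recall@5 is commonly computed for retrieval tasks. The leaderboard does not state its evaluation protocol, so the convention used here (a query counts as a hit if at least one relevant item appears among its top 5 results, reported as a percentage) is an assumption. The function name `recall_at_k` and its parameters are hypothetical and are not taken from any of the listed papers.

```python
from typing import Sequence, Set


def recall_at_k(
    ranked_ids: Sequence[Sequence[int]],
    relevant_ids: Sequence[Set[int]],
    k: int = 5,
) -> float:
    """Return the percentage of queries with a relevant item in the top k.

    ranked_ids[i]   -- candidate ids for query i, best match first
    relevant_ids[i] -- set of ids considered correct for query i
    Assumed convention: one relevant item in the top k is enough for a hit.
    """
    if len(ranked_ids) != len(relevant_ids):
        raise ValueError("ranked_ids and relevant_ids must have the same length")
    if not ranked_ids:
        raise ValueError("at least one query is required")

    hits = 0
    for ranking, relevant in zip(ranked_ids, relevant_ids):
        # Only the first k candidates count toward recall@k.
        if any(item in relevant for item in ranking[:k]):
            hits += 1
    return 100.0 * hits / len(ranked_ids)


if __name__ == "__main__":
    # Toy data: query 0 has its match at rank 6 (miss), query 1 at rank 1 (hit).
    rankings = [[1, 2, 3, 4, 5, 6], [9, 8, 7, 6, 5, 4]]
    relevant = [{6}, {9}]
    print(recall_at_k(rankings, relevant, k=5))
```

Under this convention a score such as 91.8 would mean that 91.8% of queries had a correct item among their top 5 retrieved results.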