Image-to-Text Retrieval on WHOOPS!
Metric: Specificity (higher is better)
Results
| # | Model | Specificity | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | BLIP2 FlanT5-XXL (Text-only FT) | 94 | Yes | Breaking Common Sense: WHOOPS! A Vision-and-Lang... | 2023-03-13 | - |
| 2 | BLIP2 FlanT5-XXL (Fine-tuned) | 84 | Yes | Breaking Common Sense: WHOOPS! A Vision-and-Lang... | 2023-03-13 | - |
| 3 | BLIP2 FlanT5-XL (Fine-tuned) | 81 | Yes | Breaking Common Sense: WHOOPS! A Vision-and-Lang... | 2023-03-13 | - |
| 4 | BLIP Large | 77 | No | Breaking Common Sense: WHOOPS! A Vision-and-Lang... | 2023-03-13 | - |
| 5 | CoCa ViT-L-14 MSCOCO | 72 | No | Breaking Common Sense: WHOOPS! A Vision-and-Lang... | 2023-03-13 | - |
| 6 | BLIP2 FlanT5-XXL (Zero-shot) | 71 | No | Breaking Common Sense: WHOOPS! A Vision-and-Lang... | 2023-03-13 | - |
| 7 | CLIP ViT-L/14 | 70 | No | Breaking Common Sense: WHOOPS! A Vision-and-Lang... | 2023-03-13 | - |