Image Captioning on Conceptual Captions

Metric: CIDEr (higher is better)

#  Model                        CIDEr  Extra Data  Paper                                      Date        Code
1  ClipCap (MLP + GPT2 tuning)  87.26  No          ClipCap: CLIP Prefix for Image Captioning  2021-11-18  Code
2  ClipCap (Transformer)        71.82  No          ClipCap: CLIP Prefix for Image Captioning  2021-11-18  Code