Image Captioning on nocaps val

Metric: CIDEr (higher is better)

#  Model    CIDEr  Extra Data  Paper                                                Date        Code
1  Prismer  107.9  No          Prismer: A Vision-Language Model with Multi-Task...  2023-03-04  Code
2  MetaLM   58.7   No          Language Models are General-Purpose Interfaces       2022-06-13  Code
3  VL-T5    4.4    No          Unifying Vision-and-Language Tasks via Text Gene...  2021-02-04  Code