Video Captioning on TVC

Metric: CIDEr (higher is better)

#  Model  CIDEr  Extra Data  Paper                                                 Date        Code
1  VAST   74.1   Yes         VAST: A Vision-Audio-Subtitle-Text Omni-Modality...   2023-05-29  Code
2  COSA   70.7   Yes         COSA: Concatenated Sample Pretrained Vision-Lang...   2023-06-15  Code