Zero-Shot Video Retrieval on MSR-VTT

Metric: video-to-text Median Rank (lower is better)
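
As a rough illustration of how this metric is commonly computed, the sketch below takes a video-by-caption similarity matrix and returns the median, over all video queries, of the 1-based rank at which the correct caption is retrieved. The function name `video_to_text_median_rank`, the toy similarity values, and the assumption that caption i is the single ground-truth caption for video i are all hypothetical simplifications made for this example; they are not taken from the leaderboard or from any listed paper's evaluation code.

```python
from statistics import median


def video_to_text_median_rank(sim):
    """Median rank of the ground-truth caption over all video queries.

    sim[i][j] is the similarity between video i and caption j.
    Assumption for this sketch: caption i is the only correct caption
    for video i (a one-to-one pairing).
    """
    ranks = []
    for i, row in enumerate(sim):
        # 1-based rank: one plus the number of captions scored strictly higher
        # than the ground-truth caption for this video.
        rank = 1 + sum(1 for score in row if score > row[i])
        ranks.append(rank)
    return median(ranks)


# Toy similarity matrix (made-up values): 3 videos x 3 captions.
sim = [
    [0.9, 0.1, 0.3],  # correct caption ranked 1st
    [0.8, 0.2, 0.5],  # correct caption ranked 3rd
    [0.1, 0.4, 0.7],  # correct caption ranked 1st
]
print(video_to_text_median_rank(sim))
```

With the toy matrix, the per-video ranks are 1, 3, and 1. Benchmarks that pair several captions with each video need a rule for choosing which caption's rank counts, which this sketch does not cover.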

# | Model | Video-to-text Median Rank | Extra Data | Paper | Date | Code
1 | LaT | 12 | No | LaT: Latent Translation with Cycle-Consistency f... | 2022-07-11 | -
2 | LanguageBind(ViT-L/14) | 3 | Yes | LanguageBind: Extending Video-Language Pretraini... | 2023-10-03 | Code
3 | LanguageBind(ViT-H/14) | 2 | Yes | LanguageBind: Extending Video-Language Pretraini... | 2023-10-03 | Code