Zero-Shot Video Retrieval on MSR-VTT

Metric: text-to-video Mean Rank (lower is better; the entries below appear in the leaderboard's listed order, which is descending by this metric)
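
For readers unfamiliar with the metric, here is a minimal sketch of how a text-to-video Mean Rank is commonly computed from a text-by-video similarity matrix. The function name `text_to_video_mean_rank`, the toy scores, and the assumption that query i's ground-truth video is video i are illustrative only and are not taken from any of the listed papers.

```python
def text_to_video_mean_rank(similarity):
    """Average rank of the correct video across all text queries.

    similarity[i][j] is the score of text query i against video j.
    Assumption (illustrative): the correct video for query i is video i.
    """
    ranks = []
    for i, row in enumerate(similarity):
        correct_score = row[i]
        # Rank is 1 plus the number of videos scored strictly higher
        # than the correct one.
        ranks.append(1 + sum(1 for score in row if score > correct_score))
    return sum(ranks) / len(ranks)


# Toy 3x3 example with made-up scores.
sim = [
    [0.9, 0.1, 0.2],  # query 0: correct video is ranked 1st
    [0.8, 0.5, 0.6],  # query 1: correct video (0.5) is ranked 3rd
    [0.3, 0.7, 0.4],  # query 2: correct video (0.4) is ranked 2nd
]
print(text_to_video_mean_rank(sim))  # (1 + 3 + 2) / 3 = 2.0
```

A perfect retriever places every correct video first and scores 1.0, which is why smaller values indicate better retrieval.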

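| # | Model | text-to-video Mean Rank | Extra Data | Paper | Date | Code |
|---|-------|-------------------------|------------|-------|------|------|
| 1 | MMT | 148.1 | Yes | Multi-modal Transformer for Video Retrieval | 2020-07-21 | Code |
| 2 | CLIP4Clip | 34 | No | CLIP4Clip: An Empirical Study of CLIP for End to... | 2021-04-18 | Code |
| 3 | MIL-NCE | 29.5 | No | End-to-End Learning of Visual Representations fr... | 2019-12-13 | Code |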