Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Video Retrieval on MSR-VTT-1kA

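Metric: text-to-video Mean Rank (lower is better; the results table is sorted by value in descending order, so the # column is a row index, not a quality ranking)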
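
For readers unfamiliar with the metric: text-to-video Mean Rank is conventionally the average position at which the ground-truth video appears when all candidate videos are sorted by similarity to each text query. The following is a minimal sketch, not the evaluation code of any listed paper. It assumes a square similarity matrix in which `sim[i][j]` scores query `i` against video `j`, and that video `i` is the correct match for query `i`. The function name `mean_rank` and the toy scores are hypothetical.

```python
def mean_rank(sim):
    """Average 1-based rank of the ground-truth video over all text queries.

    Assumes sim[i][j] is the similarity of text query i to video j, and that
    video i is the correct match for query i (an illustrative convention).
    """
    ranks = []
    for i, row in enumerate(sim):
        target = row[i]
        # Rank = 1 + number of videos scored strictly higher than the match.
        rank = 1 + sum(1 for score in row if score > target)
        ranks.append(rank)
    return sum(ranks) / len(ranks)


# Toy example with made-up scores: 3 queries, 3 videos.
toy_sim = [
    [0.9, 0.1, 0.2],  # correct video is ranked 1st
    [0.8, 0.5, 0.3],  # correct video is ranked 2nd
    [0.4, 0.6, 0.1],  # correct video is ranked 3rd
]
result = mean_rank(toy_sim)
```

A perfect retriever places every correct video first and scores 1.0, which is the lowest possible value.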

Results

| # | Model | text-to-video Mean Rank | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | Collaborative Experts | 28.2 | Yes | Use What You Have: Video Retrieval Using Represe... | 2019-07-31 | Code |
| 2 | MMT | 26.7 | No | Multi-modal Transformer for Video Retrieval | 2020-07-21 | Code |
| 3 | MMT-Pretrained | 24 | Yes | Multi-modal Transformer for Video Retrieval | 2020-07-21 | Code |
| 4 | MDMMT | 16.5 | Yes | MDMMT: Multidomain Multimodal Transformer for Vi... | 2021-03-19 | Code |
| 5 | CLIP4Clip | 15.3 | Yes | CLIP4Clip: An Empirical Study of CLIP for End to... | 2021-04-18 | Code |
| 6 | CLIP2Video | 14.6 | Yes | CLIP2Video: Mastering Video-Text Retrieval via I... | 2021-06-21 | Code |
| 7 | X-Pool | 14.3 | Yes | X-Pool: Cross-Modal Language-Video Attention for... | 2022-03-28 | Code |
| 8 | PAU | 14 | No | Prototype-based Aleatoric Uncertainty Quantifica... | 2023-09-29 | Code |
| 9 | CenterCLIP (ViT-B/16) | 13.8 | Yes | CenterCLIP: Token Clustering for Efficient Text-... | 2022-05-02 | Code |
| 10 | CLIP2TV | 12.8 | Yes | CLIP2TV: Align, Match and Distill for Video-Text... | 2021-11-10 | - |
| 11 | Side4Video | 12.8 | No | Side4Video: Spatial-Temporal Side Network for Me... | 2023-11-27 | Code |
| 12 | Cap4Video | 12.4 | No | Cap4Video: What Can Auxiliary Captions Do for Te... | 2022-12-31 | Code |
| 13 | CAMoE | 12.4 | Yes | Improving Video-Text Retrieval by Multi-Stream C... | 2021-09-09 | Code |
| 14 | X-CLIP | 12.2 | No | X-CLIP: End-to-End Multi-grained Contrastive Lea... | 2022-07-15 | Code |
| 15 | DiffusionRet | 12.1 | No | DiffusionRet: Generative Text-Video Retrieval wi... | 2023-03-17 | Code |
| 16 | DiffusionRet+QB-Norm | 12.1 | No | DiffusionRet: Generative Text-Video Retrieval wi... | 2023-03-17 | Code |
| 17 | HBI | 12 | No | Video-Text as Game Players: Hierarchical Banzhaf... | 2023-03-25 | Code |
| 18 | DRL | 11.4 | Yes | Disentangled Representation Learning for Text-Vi... | 2022-03-14 | Code |
| 19 | PIDRo | 10.7 | No | - | - | - |
| 20 | DMAE (ViT-B/16) | 10 | No | Dual-Modal Attention-Enhanced Text-Video Retriev... | 2023-09-20 | Code |
| 21 | HunYuan_tvr (huge) | 9.3 | Yes | Tencent Text-Video Retrieval: Hierarchical Cross... | 2022-04-07 | - |
| 22 | EMCL-Net | 2 | No | Expectation-Maximization Contrastive Learning fo... | 2022-11-21 | Code |
| 23 | EMCL-Net++ | 1 | No | Expectation-Maximization Contrastive Learning fo... | 2022-11-21 | Code |