Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Video Retrieval on MSR-VTT-1kA

Metric: video-to-text Mean Rank (lower is better; the results below are listed in descending order of the metric)
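For readers unfamiliar with the metric, here is a minimal sketch of how a video-to-text Mean Rank can be computed. It is an illustration under stated assumptions, not the evaluation code behind any entry on this leaderboard: it assumes a square similarity matrix in which entry `sim[i][j]` scores video `i` against caption `j`, that the correct caption for video `i` sits at index `i`, and that ties are resolved in favor of the correct caption. The function name `video_to_text_mean_rank` is hypothetical.

```python
def video_to_text_mean_rank(sim):
    """Average 1-based rank of the correct caption for each query video.

    Assumes sim[i][j] is the similarity of video i and caption j, and that
    caption i is the ground truth for video i (an assumption for this sketch).
    """
    ranks = []
    for i, row in enumerate(sim):
        ground_truth_score = row[i]
        # Rank is 1 plus the number of captions scored strictly higher
        # than the correct one (ties count in favor of the correct caption).
        rank = 1 + sum(1 for score in row if score > ground_truth_score)
        ranks.append(rank)
    return sum(ranks) / len(ranks)


# Toy example with three videos and three captions.
toy_sim = [
    [0.9, 0.1, 0.2],  # correct caption ranked 1st
    [0.8, 0.3, 0.5],  # correct caption ranked 3rd
    [0.1, 0.2, 0.7],  # correct caption ranked 1st
]
mean_rank = video_to_text_mean_rank(toy_sim)
```

In the toy example the per-video ranks are 1, 3, and 1, so the mean rank is 5/3. A rank of 1 means the correct caption was retrieved first.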


Results

| # | Model | video-to-text Mean Rank | Extra Data | Paper | Date | Code |
|---|-------|-------------------------|------------|-------|------|------|
| 1 | CenterCLIP (ViT-B/16) | 10.2 | Yes | CenterCLIP: Token Clustering for Efficient Text-... | 2022-05-02 | Code |
| 2 | CLIP2Video | 10.2 | Yes | CLIP2Video: Mastering Video-Text Retrieval via I... | 2021-06-21 | Code |
| 3 | CAMoE | 9.9 | Yes | Improving Video-Text Retrieval by Multi-Stream C... | 2021-09-09 | Code |
| 4 | PAU | 9.7 | No | Prototype-based Aleatoric Uncertainty Quantifica... | 2023-09-29 | Code |
| 5 | CLIP2TV | 9 | Yes | CLIP2TV: Align, Match and Distill for Video-Text... | 2021-11-10 | - |
| 6 | X-Pool | 9 | Yes | X-Pool: Cross-Modal Language-Video Attention for... | 2022-03-28 | Code |
| 7 | HBI | 8.9 | No | Video-Text as Game Players: Hierarchical Banzhaf... | 2023-03-25 | Code |
| 8 | DiffusionRet | 8.8 | No | DiffusionRet: Generative Text-Video Retrieval wi... | 2023-03-17 | Code |
| 9 | DiffusionRet+QB-Norm | 8.5 | No | DiffusionRet: Generative Text-Video Retrieval wi... | 2023-03-17 | Code |
| 10 | X-CLIP | 8.1 | No | X-CLIP: End-to-End Multi-grained Contrastive Lea... | 2022-07-15 | Code |
| 11 | Cap4Video | 8 | No | Cap4Video: What Can Auxiliary Captions Do for Te... | 2022-12-31 | Code |
| 12 | HunYuan_tvr | 7.7 | Yes | Tencent Text-Video Retrieval: Hierarchical Cross... | 2022-04-07 | - |
| 13 | DRL | 7.6 | Yes | Disentangled Representation Learning for Text-Vi... | 2022-03-14 | Code |
| 14 | PIDRo | 7.5 | No | - | - | - |
| 15 | DMAE (ViT-B/16) | 7.3 | No | Dual-Modal Attention-Enhanced Text-Video Retriev... | 2023-09-20 | Code |
| 16 | HunYuan_tvr (huge) | 5.5 | Yes | Tencent Text-Video Retrieval: Hierarchical Cross... | 2022-04-07 | - |
| 17 | EMCL-Net | 2 | No | Expectation-Maximization Contrastive Learning fo... | 2022-11-21 | Code |
| 18 | EMCL-Net++ | 1 | No | Expectation-Maximization Contrastive Learning fo... | 2022-11-21 | Code |