Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Video on MSVD

Metric: video-to-text R@1 (higher is better)
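
For readers unfamiliar with the metric, here is a minimal sketch of how a video-to-text Recall@1 score is commonly computed. It assumes a precomputed similarity matrix between videos and candidate captions. The function name `recall_at_k`, the toy scores, and the rule that a query counts as a hit when any of its ground-truth captions ranks in the top k are illustrative assumptions, not the evaluation protocol of any listed paper.

```python
def recall_at_k(similarity, relevant, k=1):
    """Percentage of query videos with a relevant caption in the top-k.

    similarity[i][j]: score between query video i and candidate caption j.
    relevant[i]: set of caption indices that describe video i (assumed format).
    """
    if not similarity:
        raise ValueError("similarity must contain at least one query")
    hits = 0
    for scores, gold in zip(similarity, relevant):
        # Rank caption indices from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        # Count a hit if any ground-truth caption appears in the top-k.
        if gold & set(ranked[:k]):
            hits += 1
    return 100.0 * hits / len(similarity)


# Toy data (hypothetical): 3 videos, 4 candidate captions.
sim = [
    [0.9, 0.1, 0.3, 0.2],
    [0.2, 0.4, 0.8, 0.1],
    [0.1, 0.2, 0.3, 0.7],
]
gold = [{0}, {1}, {2, 3}]

r1 = recall_at_k(sim, gold, k=1)
r2 = recall_at_k(sim, gold, k=2)
```

In this toy case the first and third videos retrieve a correct caption at rank 1 and the second does not, so `r1` is two thirds expressed as a percentage. Published numbers may differ in details such as how multiple captions per video are handled.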

Results

| # | Model | video-to-text R@1 | Extra Data | Paper | Date | Code |
|---|-------|-------------------|------------|-------|------|------|
| 1 | InternVideo2-6B | 85.2 | Yes | InternVideo2: Scaling Foundation Models for Mult... | 2024-03-22 | Code |
| 2 | vid-TLDR (UMT-L) | 82.7 | Yes | vid-TLDR: Training Free Token merging for Light-... | 2024-03-20 | Code |
| 3 | InternVideo | 76.3 | Yes | InternVideo: General Video Foundation Models via... | 2022-12-06 | Code |
| 4 | HunYuan_tvr (huge) | 73 | Yes | Tencent Text-Video Retrieval: Hierarchical Cross... | 2022-04-07 | - |
| 5 | Cap4Video | 70 | No | Cap4Video: What Can Auxiliary Captions Do for Te... | 2022-12-31 | Code |
| 6 | CAMoE | 69.3 | Yes | Improving Video-Text Retrieval by Multi-Stream C... | 2021-09-09 | Code |
| 7 | HunYuan_tvr | 69.1 | Yes | Tencent Text-Video Retrieval: Hierarchical Cross... | 2022-04-07 | - |
| 8 | PAU | 68.9 | No | Prototype-based Aleatoric Uncertainty Quantifica... | 2023-09-29 | Code |
| 9 | CenterCLIP (ViT-B/16) | 68.4 | Yes | CenterCLIP: Token Clustering for Efficient Text-... | 2022-05-02 | Code |
| 10 | X-CLIP | 66.8 | No | X-CLIP: End-to-End Multi-grained Contrastive Lea... | 2022-07-15 | Code |
| 11 | X-Pool | 66.4 | Yes | X-Pool: Cross-Modal Language-Video Attention for... | 2022-03-28 | Code |
| 12 | CLIP4Clip | 62 | Yes | CLIP4Clip: An Empirical Study of CLIP for End to... | 2021-04-18 | Code |
| 13 | DiffusionRet | 61.9 | No | DiffusionRet: Generative Text-Video Retrieval wi... | 2023-03-17 | Code |
| 14 | DiffusionRet+QB-Norm | 60.3 | No | DiffusionRet: Generative Text-Video Retrieval wi... | 2023-03-17 | Code |
| 15 | CLIP | 59.9 | No | A Straightforward Framework For Video Retrieval ... | 2021-02-24 | Code |