Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Video on ActivityNet

Metric: text-to-video R@10 (higher is better)
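
For readers unfamiliar with the metric, here is a minimal sketch of how text-to-video R@10 is commonly computed. It is not taken from any of the listed papers. It assumes a square similarity matrix in which `similarity[i][j]` scores text query i against video j, that the correct video for query i sits at index i, and that ties are resolved in favor of the correct video. The function name `recall_at_k` and the toy scores are hypothetical.

```python
from typing import Sequence


def recall_at_k(similarity: Sequence[Sequence[float]], k: int = 10) -> float:
    """Percentage of text queries whose matching video ranks in the top k.

    Assumption: similarity[i][j] scores text query i against video j, and
    the ground-truth video for query i is at index i (one match per query).
    """
    if not similarity:
        raise ValueError("similarity matrix is empty")
    hits = 0
    for i, row in enumerate(similarity):
        target = row[i]
        # Zero-based rank: how many videos score strictly higher than the match.
        # Assumption: ties count in favor of the correct video.
        rank = sum(1 for score in row if score > target)
        if rank < k:
            hits += 1
    return 100.0 * hits / len(similarity)


# Toy example with three queries and three videos (made-up scores).
scores = [
    [0.9, 0.1, 0.3],  # query 0: correct video ranked first
    [0.8, 0.6, 0.5],  # query 1: correct video ranked second
    [0.1, 0.2, 0.7],  # query 2: correct video ranked first
]
print(recall_at_k(scores, k=1))
print(recall_at_k(scores, k=2))
```

With k=1 two of the three toy queries count as hits; with k=2 all three do. Evaluation protocols can differ between papers (for example in how captions are grouped into queries), so leaderboard values are not guaranteed to follow this exact recipe.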

Results

| # | Model | text-to-video R@10 | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | GRAM | 96.1 | Yes | Gramian Multimodal Representation Learning and A... | 2024-12-16 | Code |
| 2 | VAST | 95.5 | Yes | VAST: A Vision-Audio-Subtitle-Text Omni-Modality... | 2023-05-29 | Code |
| 3 | VALOR | 95.3 | Yes | VALOR: Vision-Audio-Language Omni-Perception Pre... | 2023-04-17 | Code |
| 4 | UMT-L (ViT-L/16) | 94.9 | Yes | Unmasked Teacher: Towards Training-Efficient Vid... | 2023-03-28 | Code |
| 5 | vid-TLDR (UMT-L) | 94.4 | Yes | vid-TLDR: Training Free Token merging for Light-... | 2024-03-20 | Code |
| 6 | HunYuan_tvr | 93.1 | Yes | Tencent Text-Video Retrieval: Hierarchical Cross... | 2022-04-07 | - |
| 7 | CLIP-ViP | 92.6 | Yes | CLIP-ViP: Adapting Pre-trained Image-Text Model ... | 2022-09-14 | Code |
| 8 | RTQ | 91.9 | No | RTQ: Rethinking Video-language Understanding Bas... | 2023-12-01 | Code |
| 9 | VindLU | 89.7 | Yes | VindLU: A Recipe for Effective Video-and-Languag... | 2022-12-09 | Code |
| 10 | TESTA (ViT-B/16) | 89.6 | Yes | TESTA: Temporal-Spatial Token Aggregation for Lo... | 2023-10-29 | Code |
| 11 | DMAE (ViT-B/32) | 89.2 | No | Dual-Modal Attention-Enhanced Text-Video Retriev... | 2023-09-20 | Code |
| 12 | CAMoE | 87.6 | Yes | Improving Video-Text Retrieval by Multi-Stream C... | 2021-09-09 | Code |
| 13 | CenterCLIP (ViT-B/16) | 87.6 | Yes | CenterCLIP: Token Clustering for Efficient Text-... | 2022-05-02 | Code |
| 14 | HiTeA | 86.7 | Yes | HiTeA: Hierarchical Temporal-Aware Video-Languag... | 2022-12-30 | - |
| 15 | DiffusionRet | 86.3 | No | DiffusionRet: Generative Text-Video Retrieval wi... | 2023-03-17 | Code |
| 16 | DiffusionRet+QB-Norm | 85.7 | No | DiffusionRet: Generative Text-Video Retrieval wi... | 2023-03-17 | Code |
| 17 | Singularity | 85.5 | Yes | Revealing Single Frame Bias for Video-and-Langua... | 2022-06-07 | Code |
| 18 | HBI | 84.6 | No | Video-Text as Game Players: Hierarchical Banzhaf... | 2023-03-25 | Code |
| 19 | Collaborative Experts | 63.9 | No | Use What You Have: Video Retrieval Using Represe... | 2019-07-31 | Code |