Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Zero-Shot Video Retrieval

57 benchmarks, 40 papers

Zero-shot video retrieval is the task of retrieving relevant videos for a query (usually text) without training or fine-tuning on examples from the target video collection. Unlike traditional retrieval methods, which rely on supervised learning with annotated datasets, zero-shot retrieval uses pre-trained models, typically built on large-scale vision-language learning, to capture semantic relationships between textual descriptions and video content.

This approach enables retrieval of unseen video concepts by generalizing knowledge from diverse training data, making it highly useful for domains with limited labeled data, such as broadcast media, surveillance, and historical archives.
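The retrieval step described above can be sketched as ranking videos by the similarity between a text embedding and a video embedding. The following is a minimal illustration, not the method of any particular paper: it assumes embeddings have already been produced by some pre-trained vision-language model, that a video embedding is the mean of its frame embeddings, and that cosine similarity is the scoring function. The function names (`l2_normalize`, `mean_pool`, `rank_videos`) and the toy three-dimensional vectors are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def mean_pool(frame_embeddings):
    """Average per-frame embeddings into one video embedding (an assumed pooling choice)."""
    n = len(frame_embeddings)
    dim = len(frame_embeddings[0])
    return [sum(frame[d] for frame in frame_embeddings) / n for d in range(dim)]


def rank_videos(text_embedding, videos):
    """Return video ids sorted by cosine similarity to the text query, best first.

    videos: dict mapping a video id to a list of frame embeddings.
    """
    query = l2_normalize(text_embedding)
    scores = {}
    for video_id, frames in videos.items():
        video = l2_normalize(mean_pool(frames))
        # Dot product of unit vectors equals cosine similarity.
        scores[video_id] = sum(q * v for q, v in zip(query, video))
    return sorted(scores, key=scores.get, reverse=True)


# Toy embeddings standing in for the output of a pre-trained model (hypothetical values).
videos = {
    "cooking": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.0]],
    "soccer": [[0.0, 0.9, 0.1], [0.1, 0.8, 0.1]],
    "news": [[0.1, 0.1, 0.9], [0.0, 0.2, 0.8]],
}
query = [1.0, 0.1, 0.0]  # embedding of a text query close to the "cooking" direction

ranking = rank_videos(query, videos)
print(ranking)
```

No parameters are updated on the target videos here, which is what makes the setting zero-shot: all knowledge comes from whatever model produced the embeddings.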

Benchmarks

Zero-Shot Video Retrieval on MSR-VTT

text-to-video R@1, text-to-video R@5, text-to-video R@10, text-to-video Median Rank, video-to-text R@1, video-to-text R@10, video-to-text R@5, text-to-video Mean Rank, video-to-text Median Rank

Zero-Shot Video Retrieval on DiDeMo

text-to-video R@1, text-to-video R@5, text-to-video R@10, video-to-text R@1, video-to-text R@10, text-to-video Median Rank, video-to-text R@5, video-to-text Median Rank

Zero-Shot Video Retrieval on LSMDC

text-to-video R@1, text-to-video R@5, text-to-video R@10, text-to-video Median Rank, video-to-text R@1, video-to-text R@5, video-to-text R@10, text-to-video Mean Rank

Zero-Shot Video Retrieval on MSVD

text-to-video R@1, text-to-video R@5, text-to-video R@10, text-to-video Median Rank, video-to-text R@1, video-to-text R@5, video-to-text R@10, video-to-text Median Rank, text-to-video Mean Rank

Zero-Shot Video Retrieval on ActivityNet

text-to-video R@1, text-to-video R@10, text-to-video R@5, video-to-text R@1, video-to-text R@10, video-to-text R@5

Zero-Shot Video Retrieval on YouCook2

text-to-video R@10, text-to-video R@1, text-to-video R@5, text-to-video Mean Rank, text-to-video Median Rank

Zero-Shot Video Retrieval on VATEX

text-to-video R@1, video-to-text R@1, text-to-video R@10, video-to-text R@10, text-to-video R@5, video-to-text R@5

Zero-Shot Video Retrieval on MSR-VTT-full

text-to-video R@1, text-to-video R@5, text-to-video R@10, video-to-text R@1, video-to-text R@5, video-to-text R@10
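
The metrics listed for the benchmarks above (R@K, Median Rank, Mean Rank, in both the text-to-video and video-to-text directions) are commonly derived from a query-by-candidate similarity matrix. The sketch below shows one way to compute them, under two assumptions that are not stated on this page: query i has exactly one correct candidate, candidate i, and ties are broken in favor of the correct candidate. Individual benchmarks may use different protocols (for example, several captions per video), so treat this as an illustration only. The names `ground_truth_ranks` and `retrieval_metrics` and the similarity values are hypothetical.

```python
import statistics


def ground_truth_ranks(sim):
    """Return the 1-based rank of the correct candidate for each query.

    sim[i][j] is the similarity between query i and candidate j.
    Assumption: the correct candidate for query i is candidate i.
    """
    ranks = []
    for i, row in enumerate(sim):
        target = row[i]
        # Rank is 1 plus the number of candidates scoring strictly higher,
        # so ties are resolved optimistically.
        ranks.append(1 + sum(1 for score in row if score > target))
    return ranks


def retrieval_metrics(sim, ks=(1, 5, 10)):
    """Compute R@K (as percentages), median rank, and mean rank."""
    ranks = ground_truth_ranks(sim)
    n = len(ranks)
    metrics = {f"R@{k}": 100.0 * sum(r <= k for r in ranks) / n for k in ks}
    metrics["MedianRank"] = statistics.median(ranks)
    metrics["MeanRank"] = sum(ranks) / n
    return metrics


# Rows are text queries, columns are videos (hypothetical similarity scores).
sim = [
    [0.9, 0.1, 0.2, 0.35],
    [0.4, 0.3, 0.8, 0.1],
    [0.2, 0.1, 0.7, 0.25],
    [0.6, 0.5, 0.4, 0.3],
]

# Text-to-video uses the matrix as is; video-to-text uses its transpose.
t2v = retrieval_metrics(sim, ks=(1, 2))
v2t = retrieval_metrics([list(col) for col in zip(*sim)], ks=(1, 2))
print("text-to-video:", t2v)
print("video-to-text:", v2t)
```

Higher is better for R@K, while lower is better for Median Rank and Mean Rank, which is why leaderboards for the two kinds of metric sort in opposite directions.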