Video Retrieval on VATEX
Metric: text-to-video R@1 (higher is better)
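The page does not say how R@1 is computed, so the following is only a minimal sketch of the usual definition of Recall@K for text-to-video retrieval: the percentage of text queries whose matching video appears among the top K ranked videos. The function name `recall_at_k`, the layout of the similarity matrix, and the assumption that query i matches video i are all hypothetical choices made for illustration. They are not taken from any of the listed papers, and individual papers may follow different evaluation protocols.

```python
from typing import Sequence


def recall_at_k(similarity: Sequence[Sequence[float]], k: int = 1) -> float:
    """Percentage of text queries whose matching video is ranked in the top k.

    Assumption (hypothetical layout): similarity[i][j] scores text query i
    against video j, and the correct video for query i is at index i.
    """
    hits = 0
    for i, row in enumerate(similarity):
        # Zero-based rank of the correct video: the number of videos that
        # score strictly higher than it. Ties are resolved in its favour.
        rank = sum(1 for score in row if score > row[i])
        if rank < k:
            hits += 1
    return 100.0 * hits / len(similarity)


# Toy example with three text queries and three videos.
scores = [
    [0.9, 0.1, 0.3],  # query 0: correct video ranked first
    [0.2, 0.4, 0.8],  # query 1: correct video ranked second
    [0.1, 0.2, 0.7],  # query 2: correct video ranked first
]
r1 = recall_at_k(scores, k=1)
r2 = recall_at_k(scores, k=2)
```

In this toy example two of the three queries retrieve the correct video at rank 1, so `r1` is about 66.7, on the same percentage scale as the scores reported in the results.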
Results
| # | Model | text-to-video R@1 | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | GRAM | 87.7 | Yes | Gramian Multimodal Representation Learning and A... | 2024-12-16 | Code |
| 2 | VAST | 83 | Yes | VAST: A Vision-Audio-Subtitle-Text Omni-Modality... | 2023-05-29 | Code |
| 3 | VALOR | 78.5 | Yes | VALOR: Vision-Audio-Language Omni-Perception Pre... | 2023-04-17 | Code |
| 4 | InternVideo2-6B | 75.5 | Yes | InternVideo2: Scaling Foundation Models for Mult... | 2024-03-22 | Code |
| 5 | Unmasked Teacher | 72 | No | Unmasked Teacher: Towards Training-Efficient Vid... | 2023-03-28 | Code |
| 6 | InternVideo | 71.1 | No | InternVideo: General Video Foundation Models via... | 2022-12-06 | Code |
| 7 | Side4Video | 68.8 | No | Side4Video: Spatial-Temporal Side Network for Me... | 2023-11-27 | Code |
| 8 | Cap4Video | 66.6 | No | Cap4Video: What Can Auxiliary Captions Do for Te... | 2022-12-31 | Code |
| 9 | TeachCLIP | 63.6 | No | - | - | Code |
| 10 | TS2-Net | 59.1 | No | TS2-Net: Token Shift and Selection Transformer f... | 2022-07-16 | Code |
| 11 | LAFF | 59.1 | No | Lightweight Attentional Feature Fusion: A New Ba... | 2021-12-03 | Code |
| 12 | QB-Norm+CLIP2Video | 58.8 | Yes | Cross Modal Retrieval with Querybank Normalisation | 2021-12-23 | Code |
| 13 | CLIP2Video | 57.3 | Yes | CLIP2Video: Mastering Video-Text Retrieval via I... | 2021-06-21 | Code |