A. Nagrani et al.
Reported on 3 benchmarks across 1 task · 1 paper
Note: results are matched by exact model name. Different papers may use the same name for different model variants.
Computer Vision · 3 results
- text-to-video R@1 · uses extra data · 2022-04-01 · 19.4 · best: 55.9 (InternVideo2-6B)
- text-to-video R@10 · uses extra data · 2022-04-01 · 50.3 · best: 85.1 (InternVideo2-6B)
- text-to-video R@5 · uses extra data · 2022-04-01 · 39.5 · best: 78.3 (InternVideo2-6B)