Metric: text-to-video Mean Rank (lower is better; rows are sorted by descending value)
| # | Model | text-to-video Mean Rank | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | JEMC | 213.8 | No | - | - | Code |
| 2 | Collaborative Experts | 86.8 | No | Use What You Have: Video Retrieval Using Represe... | 2019-07-31 | Code |
| 3 | MDMMT | 52.8 | Yes | MDMMT: Multidomain Multimodal Transformer for Vi... | 2021-03-19 | Code |
| 4 | CLIP2Video | 45.4 | Yes | CLIP2Video: Mastering Video-Text Retrieval via I... | 2021-06-21 | Code |
| 5 | CLIP2TV | 44.7 | Yes | CLIP2TV: Align, Match and Distill for Video-Text... | 2021-11-10 | - |
| 6 | CAMoE | 42.6 | Yes | Improving Video-Text Retrieval by Multi-Stream C... | 2021-09-09 | Code |
| 7 | MDMMT-2 | 37.8 | Yes | MDMMT-2: Multidomain Multimodal Transformer for ... | 2022-03-14 | - |
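
For readers unfamiliar with the metric, the following is a minimal sketch of how a text-to-video Mean Rank is commonly computed. It is not taken from any of the listed papers or codebases. The function name `mean_rank`, the toy similarity matrix, and the assumption that the matching video for query `i` sits at column `i` are all illustrative.

```python
from typing import Sequence


def mean_rank(similarity: Sequence[Sequence[float]]) -> float:
    """Average rank of the ground-truth video over all text queries.

    similarity[i][j] is the score between text query i and video j.
    Assumption (illustrative): the correct video for query i is video i.
    """
    ranks = []
    for i, row in enumerate(similarity):
        correct = row[i]
        # Rank 1 means the correct video scored highest for this query;
        # each video scored strictly higher pushes the rank down by one.
        ranks.append(1 + sum(score > correct for score in row))
    return sum(ranks) / len(ranks)


# Toy example: 3 text queries scored against 3 videos (made-up numbers).
toy = [
    [0.9, 0.1, 0.3],  # correct video ranked 1st
    [0.8, 0.2, 0.5],  # correct video ranked 3rd
    [0.1, 0.7, 0.6],  # correct video ranked 2nd
]
print(mean_rank(toy))  # 2.0
```

Ties are broken in favor of the correct video here. Real evaluation scripts may handle ties differently, so reported values can vary slightly between implementations.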