Metric: Accuracy (higher is better)
| Rank | Model | Accuracy | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | VIOLETv2 | 97.6 | No | An Empirical Study of End-to-End Video-Language ... | 2022-09-04 | Code |
| 2 | HiTeA | 97.4 | No | HiTeA: Hierarchical Temporal-Aware Video-Languag... | 2022-12-30 | - |
| 3 | VindLU | 95.5 | No | VindLU: A Recipe for Effective Video-and-Languag... | 2022-12-09 | Code |
| 4 | Clover | 95.2 | No | Clover: Towards A Unified Video-Language Alignme... | 2022-07-16 | Code |
| 5 | Singularity-temporal | 93.7 | No | Revealing Single Frame Bias for Video-and-Langua... | 2022-06-07 | Code |
| 6 | Norton | 92.7 | No | Multi-granularity Correspondence Learning from L... | 2024-01-30 | Code |
| 7 | Singularity | 92.1 | No | Revealing Single Frame Bias for Video-and-Langua... | 2022-06-07 | Code |