Zero-shot Video Question Answering on LongVideoBench
Metric: Accuracy (%) (higher is better)
Results
| # | Model | Accuracy (%) | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | Gemini 1.5 Pro | 66.7 | Yes | Gemini 1.5: Unlocking multimodal understanding a... | 2024-03-08 | Code |
| 2 | Video-RAG (based on LLaVA-Video) | 65.4 | Yes | Video-RAG: Visually-aligned Retrieval-Augmented ... | 2024-11-20 | Code |
| 3 | GPT-4o | 64 | Yes | GPT-4o: Visual perception performance of multimo... | 2024-06-14 | - |
| 4 | LLaVA-Video | 61.9 | Yes | Video Instruction Tuning With Synthetic Data | 2024-10-03 | - |