Zero-shot Video Question Answering on LongVideoBench

Metric: Accuracy (%) (higher is better)

#  Model                             Accuracy (%)  Extra Data  Paper                                                Date        Code
1  Gemini 1.5 Pro                    66.7          Yes         Gemini 1.5: Unlocking multimodal understanding a...  2024-03-08  Code
2  Video-RAG (based on LLaVA-Video)  65.4          Yes         Video-RAG: Visually-aligned Retrieval-Augmented ...  2024-11-20  Code
3  GPT-4o                            64            Yes         GPT-4o: Visual perception performance of multimo...  2024-06-14  -
4  LLaVA-Video                       61.9          Yes         Video Instruction Tuning With Synthetic Data         2024-10-03  -