Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Video Question Answering on TGIF-QA

Metric: Accuracy (higher is better)

Results

| # | Model | Accuracy | Extra Data | Paper | Date | Code |
|---|-------|----------|------------|-------|------|------|
| 1 | Tarsier (34B) | 82.5 | No | Tarsier: Recipes for Training and Evaluating Lar... | 2024-06-30 | Code |
| 2 | LinVT-Qwen2-VL (7B) | 81.3 | No | LinVT: Empower Your Image-level Large Language M... | 2024-12-06 | Code |
| 3 | TS-LLaVA-34B | 81 | No | TS-LLaVA: Constructing Visual Tokens through Thu... | 2024-11-17 | Code |
| 4 | PLLaVA | 80.6 | No | PLLaVA : Parameter-free LLaVA Extension from Ima... | 2024-04-25 | Code |
| 5 | SlowFast-LLaVA-34B | 80.6 | No | SlowFast-LLaVA: A Strong Training-Free Baseline ... | 2024-07-22 | Code |
| 6 | IG-VLM | 79.1 | No | An Image Grid Can Be Worth a Video: Zero-shot Vi... | 2024-03-27 | Code |
| 7 | VideoGPT+ | 74.6 | No | VideoGPT+: Integrating Image and Video Encoders ... | 2024-06-13 | Code |
| 8 | MiniGPT4-video-7B | 72.22 | No | MiniGPT4-Video: Advancing Multimodal LLMs for Vi... | 2024-04-04 | Code |
| 9 | Video-LLaVA-7B | 70 | No | Video-LLaVA: Learning United Visual Representati... | 2023-11-16 | Code |
| 10 | Chat-UniVi-7B | 69 | No | Chat-UniVi: Unified Visual Representation Empowe... | 2023-11-14 | Code |
| 11 | Elysium | 66.6 | No | Elysium: Exploring Object-level Perception in Vi... | 2024-03-25 | Code |
| 12 | LocVLM-Vid-B | 51.8 | No | Learning to Localize Objects Improves Spatial Re... | 2024-04-11 | Code |
| 13 | Video-ChatGPT-7B | 51.4 | No | Video-ChatGPT: Towards Detailed Video Understand... | 2023-06-08 | Code |
| 14 | FrozenBiLM | 41.9 | No | Zero-Shot Video Question Answering via Frozen Bi... | 2022-06-16 | Code |
| 15 | Video Chat-7B | 34.4 | No | VideoChat: Chat-Centric Video Understanding | 2023-05-10 | Code |