Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Question Answering on IntentQA

Metric: Accuracy (higher is better)
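Accuracy here is simply the fraction of questions answered correctly. A minimal sketch of how such a score is computed (the prediction and answer lists below are illustrative placeholders, not IntentQA data):

```python
def accuracy(predictions, answers):
    """Fraction of items where the prediction matches the gold answer (higher is better)."""
    assert len(predictions) == len(answers)
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)

# Hypothetical multiple-choice predictions vs. gold labels:
preds = ["A", "C", "B", "D"]
gold = ["A", "B", "B", "D"]
print(accuracy(preds, gold))  # 0.75
```

A uniform random guesser over five answer options would score about 20, which matches the "Random" baseline at the bottom of the leaderboard.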


Results

| # | Model | Accuracy | Extra Data | Paper | Date | Code |
|---|-------|----------|------------|-------|------|------|
| 1 | ENTER | 71.5 | No | ENTER: Event Based Interpretable Reasoning for V... | 2025-01-24 | - |
| 2 | LVNet | 71.1 | No | Too Many Frames, Not All Useful: Efficient Strat... | 2024-06-13 | Code |
| 3 | TS-LLaVA-34B | 67.9 | No | TS-LLaVA: Constructing Visual Tokens through Thu... | 2024-11-17 | Code |
| 4 | VidCtx (7B) | 67.1 | No | VidCtx: Context-aware Video Question Answering w... | 2024-12-23 | Code |
| 5 | VideoTree (GPT4) | 66.9 | No | VideoTree: Adaptive Tree-based Video Representat... | 2024-05-29 | Code |
| 6 | IG-VLM | 65.3 | No | An Image Grid Can Be Worth a Video: Zero-shot Vi... | 2024-03-27 | Code |
| 7 | LLoVi (GPT-4) | 64 | No | A Simple LLM Framework for Long-Range Video Ques... | 2023-12-28 | Code |
| 8 | SeViLA (4B) | 60.9 | Yes | Self-Chained Image-Language Model for Video Loca... | 2023-05-11 | Code |
| 9 | SlowFast-LLaVA-34B | 60.1 | No | SlowFast-LLaVA: A Strong Training-Free Baseline ... | 2024-07-22 | Code |
| 10 | LangRepo (12B) | 59.1 | No | Language Repository for Long Video Understanding | 2024-03-21 | Code |
| 11 | LLoVi (7B) | 53.6 | No | A Simple LLM Framework for Long-Range Video Ques... | 2023-12-28 | Code |
| 12 | Mistral (7B) | 50.4 | No | Mistral 7B | 2023-10-10 | Code |
| 13 | Random | 20 | No | CREPE: Can Vision-Language Foundation Models Rea... | 2022-12-13 | Code |