Visual Question Answering (VQA) on A-OKVQA
Metric: MC Accuracy (higher is better)
Results
| # | Model | MC Accuracy | Extra Data | Paper | Date | Code |
|---|-------|-------------|------------|-------|------|------|
| 1 | SMoLA-PaLI-X Specialist Model | 83.75 | Yes | Omni-SMoLA: Boosting Generalist Multimodal Model... | 2023-12-01 | - |
| 2 | PaLI-X-VPD | 80.4 | No | Visual Program Distillation: Distilling Tools an... | 2023-12-05 | - |
| 3 | Prophet | 75.1 | No | Prophet: Prompting Large Language Models with Co... | 2023-03-03 | Code |
| 4 | PromptCap | 73.2 | No | PromptCap: Prompt-Guided Task-Aware Image Captio... | 2022-11-15 | Code |
| 5 | MC-CoT | 71 | No | Boosting the Power of Small Multimodal Reasoning... | 2023-11-23 | Code |
| 6 | HYDRA | 56.35 | No | HYDRA: A Hyper Agent for Dynamic Compositional V... | 2024-03-19 | Code |
| 7 | GPV-2 | 53.7 | No | Webly Supervised Concept Expansion for General P... | 2022-02-04 | - |
| 8 | KRISP | 42.2 | No | KRISP: Integrating Implicit and Symbolic Knowled... | 2020-12-20 | - |
| 9 | ViLBERT - VQA | 42.1 | No | ViLBERT: Pretraining Task-Agnostic Visiolinguist... | 2019-08-06 | Code |
| 10 | LXMERT | 41.6 | No | LXMERT: Learning Cross-Modality Encoder Represen... | 2019-08-20 | Code |
| 11 | ViLBERT | 41.5 | No | ViLBERT: Pretraining Task-Agnostic Visiolinguist... | 2019-08-06 | Code |
| 12 | Pythia | 40.1 | No | Pythia v0.1: the Winning Entry to the VQA Challe... | 2018-07-26 | Code |
| 13 | ViLBERT - OK-VQA | 34.1 | No | ViLBERT: Pretraining Task-Agnostic Visiolinguist... | 2019-08-06 | Code |