Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Visual Question Answering (VQA) on A-OKVQA

Metric: DA VQA Score (higher is better)

Results

| # | Model | DA VQA Score | Extra Data | Paper | Date | Code |
|---|-------|--------------|------------|-------|------|------|
| 1 | SMoLA-PaLI-X Specialist Model | 70.55 | Yes | Omni-SMoLA: Boosting Generalist Multimodal Model... | 2023-12-01 | - |
| 2 | PaLI-X-VPD | 68.2 | No | Visual Program Distillation: Distilling Tools an... | 2023-12-05 | - |
| 3 | PromptCap | 59.6 | No | PromptCap: Prompt-Guided Task-Aware Image Captio... | 2022-11-15 | Code |
| 4 | Prophet | 58.5 | No | Prophet: Prompting Large Language Models with Co... | 2023-03-03 | Code |
| 5 | A Simple Baseline for KB-VQA | 57.5 | No | A Simple Baseline for Knowledge-Based Visual Que... | 2023-10-20 | - |
| 6 | KRISP | 42.2 | No | KRISP: Integrating Implicit and Symbolic Knowled... | 2020-12-20 | - |
| 7 | GPV-2 | 40.7 | No | Webly Supervised Concept Expansion for General P... | 2022-02-04 | - |
| 8 | VLC-BERT | 38.05 | No | VLC-BERT: Visual Question Answering with Context... | 2022-10-24 | Code |
| 9 | LXMERT | 25.9 | No | LXMERT: Learning Cross-Modality Encoder Represen... | 2019-08-20 | Code |
| 10 | ViLBERT | 25.9 | No | ViLBERT: Pretraining Task-Agnostic Visiolinguist... | 2019-08-06 | Code |
| 11 | Pythia | 21.9 | No | Pythia v0.1: the Winning Entry to the VQA Challe... | 2018-07-26 | Code |
| 12 | ViLBERT - VQA | 12 | No | ViLBERT: Pretraining Task-Agnostic Visiolinguist... | 2019-08-06 | Code |
| 13 | ViLBERT - OK-VQA | 9.2 | No | ViLBERT: Pretraining Task-Agnostic Visiolinguist... | 2019-08-06 | Code |