Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Visual Question Answering (VQA) on BenchLMM

Metric: GPT-3.5 score (higher is better)


Results

| # | Model | GPT-3.5 score | Extra Data | Paper | Date | Code |
|---|-------|---------------|------------|-------|------|------|
| 1 | GPT-4V | 58.37 | Yes | GPT-4 Technical Report | 2023-03-15 | Code |
| 2 | Sphinx-V2-1K | 57.43 | Yes | SPHINX: The Joint Mixing of Weights, Tasks, and ... | 2023-11-13 | Code |
| 3 | LLaVA-1.5-13B | 55.53 | No | Improved Baselines with Visual Instruction Tuning | 2023-10-05 | Code |
| 4 | LLaVA-1.5-7B | 46.83 | No | Visual Instruction Tuning | 2023-04-17 | Code |
| 5 | InstructBLIP-13B | 45.03 | No | InstructBLIP: Towards General-purpose Vision-Lan... | 2023-05-11 | Code |
| 6 | InstructBLIP-7B | 44.63 | No | InstructBLIP: Towards General-purpose Vision-Lan... | 2023-05-11 | Code |
| 7 | LLaVA-1-13B | 43.5 | No | Visual Instruction Tuning | 2023-04-17 | Code |
| 8 | Otter-7B | 39.13 | No | Otter: A Multi-Modal Model with In-Context Instr... | 2023-05-05 | Code |
| 9 | MiniGPT4-13B | 34.93 | No | MiniGPT-4: Enhancing Vision-Language Understandi... | 2023-04-20 | Code |
| 10 | MiniGPTv2-7B | 30.1 | No | MiniGPT-v2: large language model as a unified in... | 2023-10-14 | Code |