Visual Question Answering (VQA) on BenchLMM

Metric: GPT-3.5 score (higher is better)

Results

#	Model	GPT-3.5 score	Extra Data	Paper	Date	Code
1	GPT-4V	58.37	Yes	GPT-4 Technical Report	2023-03-15	Code
2	Sphinx-V2-1K	57.43	Yes	SPHINX: The Joint Mixing of Weights, Tasks, and ...	2023-11-13	Code
3	LLaVA-1.5-13B	55.53	No	Improved Baselines with Visual Instruction Tuning	2023-10-05	Code
4	LLaVA-1.5-7B	46.83	No	Visual Instruction Tuning	2023-04-17	Code
5	InstructBLIP-13B	45.03	No	InstructBLIP: Towards General-purpose Vision-Lan...	2023-05-11	Code
6	InstructBLIP-7B	44.63	No	InstructBLIP: Towards General-purpose Vision-Lan...	2023-05-11	Code
7	LLaVA-1-13B	43.5	No	Visual Instruction Tuning	2023-04-17	Code
8	Otter-7B	39.13	No	Otter: A Multi-Modal Model with In-Context Instr...	2023-05-05	Code
9	MiniGPT4-13B	34.93	No	MiniGPT-4: Enhancing Vision-Language Understandi...	2023-04-20	Code
10	MiniGPTv2-7B	30.1	No	MiniGPT-v2: large language model as a unified in...	2023-10-14	Code