Metric: Accuracy (higher is better)
| Rank | Model | Accuracy (%) | Extra Data | Paper | Date |
|---|---|---|---|---|---|
| 1 | GPT-4V | 24.0 | No | REBUS: A Robust Evaluation Benchmark of Understa... | 2024-01-11 |
| 2 | Gemini Pro | 13.2 | No | REBUS: A Robust Evaluation Benchmark of Understa... | 2024-01-11 |
| 3 | LLaVa-1.5-13B | 1.8 | No | REBUS: A Robust Evaluation Benchmark of Understa... | 2024-01-11 |
| 4 | LLaVa-1.5-7B | 1.5 | No | REBUS: A Robust Evaluation Benchmark of Understa... | 2024-01-11 |
| 5 | BLIP2-FLAN-T5-XXL | 0.9 | No | REBUS: A Robust Evaluation Benchmark of Understa... | 2024-01-11 |
| 6 | CogVLM | 0.9 | No | REBUS: A Robust Evaluation Benchmark of Understa... | 2024-01-11 |
| 7 | QWEN | 0.9 | No | REBUS: A Robust Evaluation Benchmark of Understa... | 2024-01-11 |
| 8 | InstructBLIP | 0.6 | No | REBUS: A Robust Evaluation Benchmark of Understa... | 2024-01-11 |
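The accuracy metric above is the fraction of puzzles answered correctly, expressed as a percentage. A minimal sketch of that computation, assuming simple exact-match scoring (the benchmark itself may apply answer normalization, e.g. lowercasing or whitespace stripping; the puzzle answers below are hypothetical, not actual REBUS items):

```python
def accuracy(predictions, answers):
    """Percentage of predictions that exactly match the reference answers."""
    if not answers:
        raise ValueError("no reference answers provided")
    correct = sum(pred == gold for pred, gold in zip(predictions, answers))
    return 100.0 * correct / len(answers)

# Hypothetical toy examples for illustration only.
preds = ["keyboard", "sunflower", "rainbow", "football"]
golds = ["keyboard", "sunflower", "moonlight", "basketball"]
print(accuracy(preds, golds))  # 50.0
```

Under this scoring, GPT-4V's 24.0 means roughly one in four rebus puzzles was solved correctly, while the open-source models on the lower rows solved fewer than one in fifty.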