Metric: Accuracy (higher is better)
| Rank | Model | Accuracy (%) | Extra Data | Paper | Date |
|---|---|---|---|---|---|
| 1 | GPT-4V | 24.0 | No | REBUS: A Robust Evaluation Benchmark of Understa... | 2024-01-11 |
| 2 | Gemini Pro | 13.2 | No | REBUS: A Robust Evaluation Benchmark of Understa... | 2024-01-11 |
| 3 | LLaVa-1.5-13B | 1.8 | No | REBUS: A Robust Evaluation Benchmark of Understa... | 2024-01-11 |
| 4 | LLaVa-1.5-7B | 1.5 | No | REBUS: A Robust Evaluation Benchmark of Understa... | 2024-01-11 |
| 5 | BLIP2-FLAN-T5-XXL | 0.9 | No | REBUS: A Robust Evaluation Benchmark of Understa... | 2024-01-11 |
| 6 | CogVLM | 0.9 | No | REBUS: A Robust Evaluation Benchmark of Understa... | 2024-01-11 |
| 7 | QWEN | 0.9 | No | REBUS: A Robust Evaluation Benchmark of Understa... | 2024-01-11 |
| 8 | InstructBLIP | 0.6 | No | REBUS: A Robust Evaluation Benchmark of Understa... | 2024-01-11 |
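The accuracy metric above is the fraction of puzzles answered correctly, expressed as a percentage. A minimal sketch of that computation, assuming simple exact-match scoring (the benchmark itself may apply answer normalization, e.g. lowercasing or whitespace stripping; the puzzle answers below are hypothetical, not actual REBUS items):

```python
def accuracy(predictions, answers):
    """Percentage of predictions that exactly match the reference answers."""
    if not answers:
        raise ValueError("no reference answers provided")
    correct = sum(pred == gold for pred, gold in zip(predictions, answers))
    return 100.0 * correct / len(answers)

# Hypothetical toy examples for illustration only.
preds = ["keyboard", "sunflower", "rainbow", "football"]
golds = ["keyboard", "sunflower", "moonlight", "basketball"]
print(accuracy(preds, golds))  # 50.0
```

Under this scoring, GPT-4V's 24.0 means roughly one in four rebus puzzles was solved correctly, while the open-source models on the lower rows solved fewer than one in fifty.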