Visual Question Answering (VQA) on AI2D
Metric: Exact Match (EM), higher is better
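As a rough illustration of how an exact-match score like the one reported here could be computed, the sketch below counts a prediction as correct only when it equals the reference answer after light normalization, and reports the result as a percentage. The function names `normalize` and `exact_match` are hypothetical, and the normalization (lowercasing, whitespace collapsing) is an assumption; the evaluation scripts behind the listed results may differ.

```python
from typing import Sequence


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace (an assumed normalization, not the official one)."""
    return " ".join(answer.lower().split())


def exact_match(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Return the percentage of predictions that equal their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return 100.0 * hits / len(predictions)


if __name__ == "__main__":
    # Hypothetical toy answers, not AI2D data.
    preds = ["A", "b ", "c"]
    refs = ["a", "b", "d"]
    print(f"EM: {exact_match(preds, refs):.2f}")
```

Because AI2D questions are multiple choice, a score computed this way is effectively answer accuracy over the evaluated questions.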
Results
| # | Model | EM | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | SMoLA-PaLI-X Specialist Model | 82.5 | Yes | Omni-SMoLA: Boosting Generalist Multimodal Model... | 2023-12-01 | - |
| 2 | SMoLA-PaLI-X Generalist Model | 81.4 | Yes | Omni-SMoLA: Boosting Generalist Multimodal Model... | 2023-12-01 | - |
| 3 | Gemini Ultra | 79.5 | No | Gemini: A Family of Highly Capable Multimodal Mo... | 2023-12-19 | Code |
| 4 | DUBLIN | 51.11 | No | DUBLIN -- Document Understanding By Language-Ima... | 2023-05-23 | - |