GPT-4V (CoT, pick b/w two options)
Reported on 3 benchmarks across 1 task · 1 paper · 3 SOTA
Note: results are matched by exact model name. Different papers may use the same name for different model variants.
Reasoning (3 results)
- Group Score · 2023-11-15 · SOTA · 58.75
- Image Score · 2023-11-15 · SOTA · 68.75
- Text Score · 2023-11-15 · SOTA · 75.25 · best: 75.5 (GPT-4o + CA)