GPT-4o + CA

Reported on 6 benchmarks across 3 tasks · 1 paper · 1 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Reasoning · 4 results

Visual Reasoning on Winoground
Text Score · 2025-01-23
75.5
SOTA
A Cognitive Paradigm Approach to Probe the Perception-Reasoning Interface in VLMs arXiv:2501.13620
Visual Reasoning on Winoground
Group Score · 2025-01-23
52
best: 58.75 (GPT-4V (CoT, pick b/w two options))
A Cognitive Paradigm Approach to Probe the Perception-Reasoning Interface in VLMs arXiv:2501.13620
Visual Reasoning on Winoground
Image Score · 2025-01-23
58.5
best: 68.75 (GPT-4V (CoT, pick b/w two options))
A Cognitive Paradigm Approach to Probe the Perception-Reasoning Interface in VLMs arXiv:2501.13620
Visual Reasoning on Bongard-OpenWorld
2-Class Accuracy · 2025-01-23
92.8
best: 93.6 (Gemini-2.0 + CA)
A Cognitive Paradigm Approach to Probe the Perception-Reasoning Interface in VLMs arXiv:2501.13620

Computer Vision · 2 results

Image Classification on Bongard-HOI
Avg. Accuracy · 2025-01-23
77.3
best: 91.42 (Human (Amateur))
A Cognitive Paradigm Approach to Probe the Perception-Reasoning Interface in VLMs arXiv:2501.13620
Few-Shot Image Classification on Bongard-HOI
Avg. Accuracy · 2025-01-23
77.3
best: 91.42 (Human (Amateur))
A Cognitive Paradigm Approach to Probe the Perception-Reasoning Interface in VLMs arXiv:2501.13620