Visual Question Answering (VQA) on ChartQA
Metric: 1:1 Accuracy (higher is better)
Results
| # | Model | 1:1 Accuracy | Extra Data | SOTA | Paper | Date | Code |
|---|---|---|---|---|---|---|---|
| 1 | ChartPaLI-5B + PaLM 2-S | 81.3 | Yes | SOTA | Chart-based Reasoning: Transferring Capabilities from LLMs to VLMs | 2024-03-19 | - |
| 2 | Gemini Ultra | 80.8 | No | SOTA | Gemini: A Family of Highly Capable Multimodal Models | 2023-12-19 | Yes |
| 3 | DePlot+FlanPaLM+Codex (PoT Self-Consistency) | 79.3 | No | SOTA | DePlot: One-shot visual language reasoning by plot-to-table translation | 2022-12-20 | Yes |
| 4 | ChartPaLI-5B | 77.3 | Yes | - | Chart-based Reasoning: Transferring Capabilities from LLMs to VLMs | 2024-03-19 | - |
| 5 | DePlot+Codex (PoT Self-Consistency) | 76.7 | No | - | DePlot: One-shot visual language reasoning by plot-to-table translation | 2022-12-20 | Yes |
| 6 | ScreenAI 5B (4.62 B params, w/ OCR) | 76.7 | Yes | - | ScreenAI: A Vision-Language Model for UI and Infographics Understanding | 2024-02-07 | Yes |
| 7 | SMoLA-PaLI-X Specialist Model | 74.6 | Yes | - | Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts | 2023-12-01 | - |
| 8 | SMoLA-PaLI-X Generalist Model | 73.8 | Yes | - | Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts | 2023-12-01 | - |
| 9 | MatCha4096 + LaMenDa | 72.64 | Yes | - | - | - | - |
| 10 | PaLI-X (Single-task FT w/ OCR) | 72.3 | Yes | - | PaLI-X: On Scaling up a Multilingual Vision and Language Model | 2023-05-29 | Yes |
| 11 | PaLI-X (Single-task FT) | 70.9 | Yes | - | PaLI-X: On Scaling up a Multilingual Vision and Language Model | 2023-05-29 | Yes |
| 12 | PaLI-X (Multi-task FT) | 70.6 | Yes | - | PaLI-X: On Scaling up a Multilingual Vision and Language Model | 2023-05-29 | Yes |
| 13 | DePlot+FlanPaLM (Self-Consistency) | 70.5 | No | - | DePlot: One-shot visual language reasoning by plot-to-table translation | 2022-12-20 | Yes |
| 14 | PaLI-3 | 70 | No | - | PaLI-3 Vision Language Models: Smaller, Faster, Stronger | 2023-10-13 | Yes |
| 15 | PaLI-3 (w/ OCR) | 69.5 | No | - | PaLI-3 Vision Language Models: Smaller, Faster, Stronger | 2023-10-13 | Yes |
| 16 | DePlot+FlanPaLM (CoT) | 67.3 | No | - | DePlot: One-shot visual language reasoning by plot-to-table translation | 2022-12-20 | Yes |
| 17 | Qwen-VL-Chat | 66.3 | Yes | - | Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond | 2023-08-24 | Yes |
| 18 | UniChart | 66.24 | Yes | - | UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning | 2023-05-24 | Yes |
| 19 | Qwen-VL | 65.7 | Yes | - | Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond | 2023-08-24 | Yes |
| 20 | StructChart+GPT3.5 (STR ChartQA+SimChart9K) | 65.3 | Yes | - | StructChart: On the Schema, Metric, and Augmentation for Visual Chart Understanding | 2023-09-20 | Yes |
| 21 | MatCha | 64.2 | No | SOTA | MatCha: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering | 2022-12-19 | Yes |
| 22 | StructChart+GPT3.5 (STR) | 60.7 | No | - | StructChart: On the Schema, Metric, and Augmentation for Visual Chart Understanding | 2023-09-20 | Yes |
| 23 | Pix2Struct-large | 58.6 | No | SOTA | Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding | 2022-10-07 | Yes |
| 24 | Pix2Struct-base | 56 | No | - | Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding | 2022-10-07 | Yes |
| 25 | VisionTapas-OCR | 45.5 | No | SOTA | ChartQA: A Benchmark for Question Answering about Charts with Visual and Logical Reasoning | 2022-03-19 | Yes |
| 26 | DePlot+GPT3 (Self-Consistency) | 42.3 | No | - | DePlot: One-shot visual language reasoning by plot-to-table translation | 2022-12-20 | Yes |
| 27 | DePlot+GPT3 (CoT) | 36.9 | No | - | DePlot: One-shot visual language reasoning by plot-to-table translation | 2022-12-20 | Yes |
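The SOTA flags in the listing are not explained on the page. One reading that appears consistent with the scores and dates shown is a running-best rule: an entry is flagged when it is the top score for its date and beats every earlier-dated entry. The Python sketch below checks that reading against the listed numbers. The rule itself and the function name `sota_progression` are assumptions made for illustration, not something the page states. The undated entry (MatCha4096 + LaMenDa) is left out because it cannot be placed on a timeline.

```python
from datetime import date

# (model, 1:1 accuracy, date) copied from the listing above.
ENTRIES = [
    ("ChartPaLI-5B + PaLM 2-S", 81.3, "2024-03-19"),
    ("Gemini Ultra", 80.8, "2023-12-19"),
    ("DePlot+FlanPaLM+Codex (PoT Self-Consistency)", 79.3, "2022-12-20"),
    ("ChartPaLI-5B", 77.3, "2024-03-19"),
    ("DePlot+Codex (PoT Self-Consistency)", 76.7, "2022-12-20"),
    ("ScreenAI 5B (4.62 B params, w/ OCR)", 76.7, "2024-02-07"),
    ("SMoLA-PaLI-X Specialist Model", 74.6, "2023-12-01"),
    ("SMoLA-PaLI-X Generalist Model", 73.8, "2023-12-01"),
    ("PaLI-X (Single-task FT w/ OCR)", 72.3, "2023-05-29"),
    ("PaLI-X (Single-task FT)", 70.9, "2023-05-29"),
    ("PaLI-X (Multi-task FT)", 70.6, "2023-05-29"),
    ("DePlot+FlanPaLM (Self-Consistency)", 70.5, "2022-12-20"),
    ("PaLI-3", 70.0, "2023-10-13"),
    ("PaLI-3 (w/ OCR)", 69.5, "2023-10-13"),
    ("DePlot+FlanPaLM (CoT)", 67.3, "2022-12-20"),
    ("Qwen-VL-Chat", 66.3, "2023-08-24"),
    ("UniChart", 66.24, "2023-05-24"),
    ("Qwen-VL", 65.7, "2023-08-24"),
    ("StructChart+GPT3.5 (STR ChartQA+SimChart9K)", 65.3, "2023-09-20"),
    ("MatCha", 64.2, "2022-12-19"),
    ("StructChart+GPT3.5 (STR)", 60.7, "2023-09-20"),
    ("Pix2Struct-large", 58.6, "2022-10-07"),
    ("Pix2Struct-base", 56.0, "2022-10-07"),
    ("VisionTapas-OCR", 45.5, "2022-03-19"),
    ("DePlot+GPT3 (Self-Consistency)", 42.3, "2022-12-20"),
    ("DePlot+GPT3 (CoT)", 36.9, "2022-12-20"),
]


def sota_progression(entries):
    """Return models that beat all earlier-dated scores (assumed rule)."""
    # Keep only the best-scoring entry for each date.
    best_per_date = {}
    for model, score, day in entries:
        d = date.fromisoformat(day)
        if d not in best_per_date or score > best_per_date[d][1]:
            best_per_date[d] = (model, score)

    # Walk the dates in order and record each new running maximum.
    flagged = []
    running_best = float("-inf")
    for d in sorted(best_per_date):
        model, score = best_per_date[d]
        if score > running_best:
            flagged.append(model)
            running_best = score
    return flagged


if __name__ == "__main__":
    for name in sota_progression(ENTRIES):
        print(name)
```

Under this assumed rule the function returns six models, oldest first: VisionTapas-OCR, Pix2Struct-large, MatCha, DePlot+FlanPaLM+Codex (PoT Self-Consistency), Gemini Ultra, and ChartPaLI-5B + PaLM 2-S. These are the same six entries that carry a SOTA flag in the listing.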