Question Answering on NewsQA
Metric: EM (exact match; higher is better)
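The page does not say how EM is computed for these entries. The following is a minimal sketch, assuming the common SQuAD-style convention: answers are lowercased, stripped of punctuation and English articles, whitespace-collapsed, and a prediction scores 1 if it then equals any reference answer. The function names `normalize_answer`, `exact_match`, and `corpus_em` are hypothetical and not taken from this leaderboard.

```python
import re
import string
from typing import Sequence


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; the leaderboard may use another rule.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, references: Sequence[str]) -> bool:
    """True if the normalized prediction equals any normalized reference."""
    norm = normalize_answer(prediction)
    return any(norm == normalize_answer(ref) for ref in references)


def corpus_em(predictions: Sequence[str],
              references: Sequence[Sequence[str]]) -> float:
    """Percentage (0-100) of questions whose prediction is an exact match."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return 100.0 * hits / len(predictions)
```

Under these assumptions, `corpus_em` returns a value on a 0 to 100 scale, which is the scale the scores listed here appear to use.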
Results
| # | Model | EM | Extra Data | Paper | Date | Code |
|---|-------|----|------------|-------|------|------|
| 1 | OpenAI/o3-2025-01-31-high [SOTA] | 92.52 | Yes | o3-mini vs DeepSeek-R1: Which One is Safer? | 2025-01-30 | Yes |
| 2 | Riple/Saanvi-v0.5-DeepAnalysis [SOTA] | 92.14 | Yes | DeepSense: A Unified Deep Learning Framework for Time-Series Mobile Sensing Data Processing | 2016-11-07 | Yes |
| 3 | OpenAI/o4-mini-2025-05-01-high | 88.24 | Yes | Thinking Like Transformers | 2021-06-13 | Yes |
| 4 | OpenAI/o1-2024-12-17-high | 81.44 | Yes | 0/1 Deep Neural Networks via Block Coordinate Descent | 2022-06-19 | - |
| 5 | deepseek-r1 | 80.57 | Yes | DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning | 2025-01-22 | Yes |
| 6 | Anthropic/claude-3-7-sonnet | 74.23 | No | - | - | - |
| 7 | Riple/Saanvi-v0.1 | 72.61 | No | Time-series Transformer Generative Adversarial Networks | 2022-05-23 | Yes |
| 8 | xAI/grok-3-1212 | 70.57 | Yes | XAI for Transformers: Better Explanations through Conservative Propagation | 2022-02-15 | Yes |
| 9 | OpenAI/GPT-4o | 70.21 | Yes | GPT-4o as the Gold Standard: A Scalable and General Purpose Approach to Filter Language Model Pretraining Data | 2024-10-03 | - |
| 10 | Google/Gemini 2.5 Pro | 68.75 | Yes | Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context | 2024-03-08 | Yes |
| 11 | BERT+ASGen | 54.7 | No | - | - | - |
| 12 | DecaProp | 53.1 | No | Densely Connected Attention Propagation for Reading Comprehension | 2018-11-10 | Yes |
| 13 | MINIMAL(Dyn) | 50.1 | Yes | Efficient and Robust Question Answering from Minimal Context over Documents | 2018-05-21 | Yes |
| 14 | AMANDA | 48.4 | No | A Question-Focused Multi-Factor Attention Network for Question Answering | 2018-01-25 | Yes |
| 15 | FastQAExt | 43.7 | Yes | Making Neural QA as Simple as Possible but not Simpler | 2017-03-14 | Yes |