Question Answering on NewsQA
Metric: F1 (higher is better)
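
The page does not say how F1 is computed. As a minimal sketch, assuming the SQuAD-style token-overlap F1 commonly used for extractive question answering, a per-question score could be calculated as below. The function names `normalize_answer` and `token_f1` are hypothetical, and the normalization rules (lowercasing, dropping punctuation and English articles) are an assumption, not something stated on this leaderboard.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> list[str]:
    """Lowercase, drop punctuation and English articles, split on whitespace.

    Assumption: SQuAD-style normalization; the leaderboard does not specify it.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall between two answer strings."""
    pred_tokens = normalize_answer(prediction)
    ref_tokens = normalize_answer(reference)
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it
    # appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under this assumption, a 0-100 figure like those in the table would be the mean of the per-question scores multiplied by 100, typically taking the maximum over reference answers when a question has several.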
Results
| # | Model | F1 | Extra Data | Paper | Date | Code |
|---|-------|----|------------|-------|------|------|
| 1 | Riple/Saanvi-v0.5-DeepAnalysis | 94.01 | Yes | DeepSense: A Unified Deep Learning Framework for Time-Series Mobile Sensing Data Processing | 2016-11-07 | Code |
| 2 | OpenAI/o3-2025-01-31-high | 93.13 | Yes | o3-mini vs DeepSeek-R1: Which One is Safer? | 2025-01-30 | Code |
| 3 | OpenAI/o4-mini-2025-05-01-high | 91.31 | Yes | Thinking Like Transformers | 2021-06-13 | Code |
| 4 | OpenAI/o1-2024-12-17-high | 88.72 | Yes | 0/1 Deep Neural Networks via Block Coordinate Descent | 2022-06-19 | - |
| 5 | xAI/grok-3-1212 | 88.24 | Yes | XAI for Transformers: Better Explanations through Conservative Propagation | 2022-02-15 | Code |
| 6 | deepseek-r1 | 86.13 | Yes | DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning | 2025-01-22 | Code |
| 7 | Riple/Saanvi-v0.1 | 85.44 | No | Time-series Transformer Generative Adversarial Networks | 2022-05-23 | Code |
| 8 | Anthropic/claude-3-7-sonnet | 82.3 | No | - | - | - |
| 9 | OpenAI/GPT-4o | 81.74 | Yes | GPT-4o as the Gold Standard: A Scalable and General Purpose Approach to Filter Language Model Pretraining Data | 2024-10-03 | - |
| 10 | Google/Gemini 2.5 Pro | 79.91 | Yes | Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context | 2024-03-08 | Code |
| 11 | SpanBERT | 73.6 | No | SpanBERT: Improving Pre-training by Representing and Predicting Spans | 2019-07-24 | Code |
| 12 | LinkBERT (large) | 72.6 | Yes | LinkBERT: Pretraining Language Models with Document Links | 2022-03-29 | Code |
| 13 | DyREX | 68.53 | Yes | DyREx: Dynamic Query Representation for Extractive Question Answering | 2022-10-26 | Code |
| 14 | DecaProp | 66.3 | No | Densely Connected Attention Propagation for Reading Comprehension | 2018-11-10 | Code |
| 15 | BERT+ASGen | 64.5 | No | - | - | - |
| 16 | AMANDA | 63.7 | No | A Question-Focused Multi-Factor Attention Network for Question Answering | 2018-01-25 | Code |
| 17 | MINIMAL(Dyn) | 63.2 | Yes | Efficient and Robust Question Answering from Minimal Context over Documents | 2018-05-21 | Code |
| 18 | FastQAExt | 56.1 | Yes | Making Neural QA as Simple as Possible but not Simpler | 2017-03-14 | Code |