Question Answering on SQuAD1.1 dev
Metric: EM (higher is better)
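For readers unfamiliar with the metric, the following is a minimal sketch of how exact match (EM) is commonly computed for SQuAD-style answers. It assumes the normalization steps of the official SQuAD v1.1 evaluation script (lowercasing, stripping punctuation, dropping the articles a/an/the, collapsing whitespace); the function names are illustrative, and the scores listed on this page come from the cited papers, not from this code.

```python
import re
import string

_PUNCT = set(string.punctuation)


def normalize_answer(text: str) -> str:
    """Lowercase, strip punctuation, drop articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in _PUNCT)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, references: list[str]) -> float:
    """Return 1.0 if the prediction equals any reference after normalization."""
    pred = normalize_answer(prediction)
    return float(any(pred == normalize_answer(ref) for ref in references))


def corpus_em(predictions: list[str], references: list[list[str]]) -> float:
    """Average per-question EM over the corpus, reported as a percentage."""
    if not predictions or len(predictions) != len(references):
        raise ValueError("need equally sized, non-empty prediction and reference lists")
    scores = [exact_match(p, refs) for p, refs in zip(predictions, references)]
    return 100.0 * sum(scores) / len(scores)


# Hypothetical toy data: the first prediction matches, the second does not.
preds = ["The Eiffel Tower.", "1887"]
refs = [["Eiffel Tower"], ["1889", "in 1889"]]
print(corpus_em(preds, refs))  # 50.0
```

Because a prediction is credited when it matches any one of the reference answers, EM is a strict all-or-nothing score per question, which is why it is usually reported alongside the softer token-level F1.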
Results
| # | Model | EM | SOTA | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|---|
| 1 | T5-11B | 90.06 | SOTA | Yes | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | 2019-10-23 | Code |
| 2 | LUKE | 89.8 | - | Yes | LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention | 2020-10-02 | Code |
| 3 | XLNet+DSC | 89.79 | - | Yes | Dice Loss for Data-imbalanced NLP Tasks | 2019-11-07 | Code |
| 4 | XLNet (single model) | 89.7 | SOTA | Yes | XLNet: Generalized Autoregressive Pretraining for Language Understanding | 2019-06-19 | Code |
| 5 | T5-3B | 88.53 | - | Yes | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | 2019-10-23 | Code |
| 6 | T5-Large 770M | 86.66 | - | No | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | 2019-10-23 | Code |
| 7 | BERT-LARGE (Ensemble+TriviaQA) | 86.2 | SOTA | No | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | 2018-10-11 | Code |
| 8 | T5-Base | 85.44 | - | Yes | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | 2019-10-23 | Code |
| 9 | BERT-LARGE (Single+TriviaQA) | 84.2 | - | No | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | 2018-10-11 | Code |
| 10 | BERT-Large-uncased-PruneOFA (90% unstruct sparse) | 83.35 | - | No | Prune Once for All: Sparse Pre-Trained Language Models | 2021-11-10 | Code |
| 11 | BERT-Large-uncased-PruneOFA (90% unstruct sparse, QAT Int8) | 83.22 | - | No | Prune Once for All: Sparse Pre-Trained Language Models | 2021-11-10 | Code |
| 12 | BERT-Base-uncased-PruneOFA (85% unstruct sparse) | 81.1 | - | No | Prune Once for All: Sparse Pre-Trained Language Models | 2021-11-10 | Code |
| 13 | BERT-Base-uncased-PruneOFA (85% unstruct sparse, QAT Int8) | 80.84 | - | No | Prune Once for All: Sparse Pre-Trained Language Models | 2021-11-10 | Code |
| 14 | BERT-Base-uncased-PruneOFA (90% unstruct sparse) | 79.83 | - | No | Prune Once for All: Sparse Pre-Trained Language Models | 2021-11-10 | Code |
| 15 | TinyBERT-6 67M | 79.7 | - | No | TinyBERT: Distilling BERT for Natural Language Understanding | 2019-09-23 | Code |
| 16 | T5-Small | 79.1 | - | Yes | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | 2019-10-23 | Code |
| 17 | R.M-Reader (single) | 78.9 | SOTA | No | Reinforced Mnemonic Reader for Machine Reading Comprehension | 2017-05-08 | Code |
| 18 | DensePhrases | 78.3 | - | No | Learning Dense Representations of Phrases at Scale | 2020-12-23 | Code |
| 19 | DistilBERT-uncased-PruneOFA (85% unstruct sparse) | 78.1 | - | No | Prune Once for All: Sparse Pre-Trained Language Models | 2021-11-10 | Code |
| 20 | DistilBERT | 77.7 | - | No | DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter | 2019-10-02 | Code |
| 21 | DistilBERT-uncased-PruneOFA (85% unstruct sparse, QAT Int8) | 77.03 | - | No | Prune Once for All: Sparse Pre-Trained Language Models | 2021-11-10 | Code |
| 22 | DistilBERT-uncased-PruneOFA (90% unstruct sparse) | 76.91 | - | No | Prune Once for All: Sparse Pre-Trained Language Models | 2021-11-10 | Code |
| 23 | KAR | 76.7 | - | No | Explicit Utilization of General Knowledge in Machine Reading Comprehension | 2018-09-10 | - |
| 24 | SAN (single) | 76.235 | - | No | Stochastic Answer Networks for Machine Reading Comprehension | 2017-12-10 | Code |
| 25 | DistilBERT-uncased-PruneOFA (90% unstruct sparse, QAT Int8) | 75.62 | - | No | Prune Once for All: Sparse Pre-Trained Language Models | 2021-11-10 | Code |
| 26 | FusionNet | 75.3 | - | No | FusionNet: Fusing via Fully-Aware Attention with Application to Machine Comprehension | 2017-11-16 | Code |
| 27 | QANet (data aug x3) | 75.1 | - | No | QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension | 2018-04-23 | Code |
| 28 | QANet (data aug x2) | 74.5 | - | No | QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension | 2018-04-23 | Code |
| 29 | DCN+ (single) | 74.5 | - | No | DCN+: Mixed Objective and Deep Residual Coattention for Question Answering | 2017-10-31 | Code |
| 30 | QANet | 73.6 | - | No | QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension | 2018-04-23 | Code |
| 31 | PhaseCond (single) | 72.1 | - | No | Phase Conductor on Multi-layered Attentions for Machine Comprehension | 2017-10-28 | - |
| 32 | SRU | 71.4 | - | No | Simple Recurrent Units for Highly Parallelizable Recurrence | 2017-09-08 | Code |
| 33 | Smarnet | 71.362 | - | No | Smarnet: Teaching Machines to Read and Comprehend Like Human | 2017-10-08 | - |
| 34 | DCN (Char + CoVe) | 71.3 | - | No | Learned in Translation: Contextualized Word Vectors | 2017-08-01 | Code |
| 35 | R-NET (single) | 71.1 | - | No | No paper | - | - |
| 36 | Ruminating Reader | 70.6 | SOTA | No | Ruminating Reader: Reasoning with Gated Multi-Hop Attention | 2017-04-24 | - |
| 37 | FastQAExt (beam-size 5) | 70.3 | SOTA | No | Making Neural QA as Simple as Possible but not Simpler | 2017-03-14 | Code |
| 38 | DrQA (Document Reader only) | 69.5 | - | No | Reading Wikipedia to Answer Open-Domain Questions | 2017-03-31 | Code |
| 39 | jNet (TreeLSTM adaptation, QTLa, K=100) | 69.1 | - | No | Exploring Question Understanding and Adaptation in Neural-Network-Based Question Answering | 2017-03-14 | - |
| 40 | SEDT-LSTM | 67.89 | SOTA | No | Structural Embedding of Syntactic Trees for Machine Comprehension | 2017-03-02 | - |
| 41 | BIDAF (single) | 67.7 | SOTA | No | Bidirectional Attention Flow for Machine Comprehension | 2016-11-05 | Code |
| 42 | SECT-LSTM | 67.65 | - | No | Structural Embedding of Syntactic Trees for Machine Comprehension | 2017-03-02 | - |
| 43 | RASOR | 66.4 | SOTA | No | Learning Recurrent Span Representations for Extractive Question Answering | 2016-11-04 | Code |
| 44 | MPCM | 66.1 | - | No | Multi-Perspective Context Matching for Machine Comprehension | 2016-12-13 | Code |
| 45 | DCN | 65.4 | - | No | Dynamic Coattention Networks For Question Answering | 2016-11-05 | Code |
| 46 | FABIR | 65.1 | - | No | A Fully Attention-Based Information Retriever | 2018-10-22 | Code |
| 47 | Match-LSTM with Bi-Ans-Ptr (Boundary+Search+b) | 64.1 | SOTA | No | Machine Comprehension Using Match-LSTM and Answer Pointer | 2016-08-29 | Code |
| 48 | OTF dict+spelling (single) | 63.06 | - | No | Learning to Compute Word Embeddings On the Fly | 2017-06-01 | - |
| 49 | DCR | 62.5 | - | No | End-to-End Answer Chunk Extraction and Ranking for Reading Comprehension | 2016-10-31 | - |
| 50 | FG fine-grained gate | 59.95 | - | No | Words or Characters? Fine-grained Gating for Reading Comprehension | 2016-11-06 | Code |
| 51 | SPARTA | 59.3 | - | No | SPARTA: Efficient Open-Domain Question Answering via Sparse Transformer Matching Retrieval | 2020-09-28 | Code |
| 52 | Blended RAG | 57.63 | - | No | Blended RAG: Improving RAG (Retriever-Augmented Generation) Accuracy with Semantic Search and Hybrid Query-Based Retrievers | 2024-03-22 | Code |
| 53 | BERTserini | 50.2 | - | No | Data Augmentation for BERT Fine-Tuning in Open-Domain Question Answering | 2019-04-14 | - |
| 54 | BERTserini | 38.6 | - | No | End-to-End Open-Domain Question Answering with BERTserini | 2019-02-05 | Code |