Question Answering on HotpotQA
Metric: ANS-EM (higher is better)
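The page does not define ANS-EM. Assuming it denotes answer exact match with SQuAD-style normalization (lowercasing, stripping punctuation and English articles, collapsing whitespace), a minimal sketch of how such a score could be computed is shown below. The function names and the dictionary-based input layout are illustrative assumptions, not the benchmark's official evaluation script.

```python
import re
import string
from typing import Mapping


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def answer_exact_match(prediction: str, gold: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(gold))


def ans_em(predictions: Mapping[str, str], golds: Mapping[str, str]) -> float:
    """Mean exact match over all gold questions (hypothetical helper).

    Questions with no prediction are scored as 0. The result lies in
    [0, 1], the same range as the scores reported on this page.
    """
    if not golds:
        return 0.0
    total = sum(
        answer_exact_match(predictions.get(qid, ""), gold)
        for qid, gold in golds.items()
    )
    return total / len(golds)
```

Under these assumptions, a prediction of "the Eiffel Tower" matches a gold answer of "Eiffel Tower." because both normalize to the same string.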
Results
| # | Model | ANS-EM | Extra Data | Paper | Date | Code |
|---|-------|--------|------------|-------|------|------|
| 1 | Beam Retrieval | 0.727 | No | End-to-End Beam Retrieval for Multi-Hop Question Answering | 2023-08-17 | Code |
| 2 | AISO | 0.675 | No | Adaptive Information Seeking for Open-Domain Question Answering | 2021-09-14 | Code |
| 3 | Chain-of-Skills | 0.674 | No | Chain-of-Skills: A Configurable Model for Open-domain Question Answering | 2023-05-04 | Code |
| 4 | HopRetriever + Sp-search | 0.671 | No | HopRetriever: Retrieve Hops over Wikipedia to Answer Complex Questions | 2020-12-31 | - |
| 5 | HopRetriever | 0.671 | No | - | - | - |
| 6 | TPRR | 0.67 | No | - | - | - |
| 7 | IRRR+ | 0.663 | No | Answering Open-Domain Questions of Varying Reasoning Steps from Text | 2020-10-23 | Code |
| 8 | EBS-Large | 0.662 | No | - | - | - |
| 9 | IRRR | 0.657 | No | Answering Open-Domain Questions of Varying Reasoning Steps from Text | 2020-10-23 | Code |
| 10 | EBS-SH | 0.655 | No | - | - | - |
| 11 | HopRetriever-V2 | 0.648 | No | - | - | - |
| 12 | AFSGraph-retriever | 0.646 | No | - | - | - |
| 13 | Step-by-Step Retriever | 0.63 | No | - | - | - |
| 14 | DDRQA | 0.625 | No | Answering Any-hop Open-domain Questions with Iterative Document Reranking | 2020-09-16 | - |
| 15 | Recursive Dense Retriever | 0.623 | No | Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval | 2020-09-27 | Code |
| 16 | DR model large | 0.62 | No | - | - | - |
| 17 | Model name | 0.617 | No | - | - | - |
| 18 | HopAns | 0.617 | No | - | - | - |
| 19 | Multi-dimensional-AFSGraph | 0.615 | No | - | - | - |
| 20 | HopRetriever-V1 | 0.608 | No | - | - | - |
| 21 | Anonymous | 0.604 | No | - | - | - |
| 22 | Tree-shaped-cluster | 0.603 | No | - | - | - |
| 23 | AFSgraph | 0.601 | No | - | - | - |
| 24 | AFSgraph model | 0.601 | No | - | - | - |
| 25 | Robustly Fine-tuned Graph-based Recurrent Retriever | 0.6 | No | Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering | 2019-11-24 | Code |
| 26 | RoBERTa-DenseRetriever-Fast | 0.598 | No | - | - | - |
| 27 | DPR-recurrent | 0.598 | No | - | - | - |
| 28 | HGN-albert + SemanticRetrievalMRS IR | 0.597 | No | - | - | - |
| 29 | RoBERTa-DenseRetriever | 0.596 | No | - | - | - |
| 30 | SAFSR model | 0.589 | No | HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering | 2018-09-25 | Code |
| 31 | DR model | 0.588 | No | - | - | - |
| 32 | GraphRR-Fast | 0.582 | No | - | - | - |
| 33 | PromptRank-fewshot-2-demo | 0.581 | No | - | - | - |
| 34 | graph-recurrent-retriever+roberta-base w. S/R-pretraining | 0.581 | No | - | - | - |
| 35 | HGN-large + SemanticRetrievalMRS IR | 0.579 | No | - | - | - |
| 36 | HGN + SemanticRetrievalMRS IR | 0.567 | No | Hierarchical Graph Network for Multi-hop Question Answering | 2019-11-09 | Code |
| 37 | Graph-based Recurrent Retriever | 0.56 | No | - | - | - |
| 38 | Quark + SemanticRetrievalMRS IR | 0.555 | No | A Simple Yet Strong Pipeline for HotpotQA | 2020-04-14 | - |
| 39 | MIR+EPS+BERT | 0.529 | No | - | - | - |
| 40 | GAR-BERT | 0.523 | No | - | - | - |
| 41 | Transformer-XH-final | 0.516 | No | - | - | Code |
| 42 | Transformer-XH | 0.49 | No | - | - | - |
| 43 | GAR | 0.482 | No | - | - | - |
| 44 | GAR-NOSF | 0.475 | No | - | - | - |
| 45 | SemanticRetrievalMRS | 0.453 | No | Revealing the Importance of Semantic Retrieval for Machine Reading at Scale | 2019-09-17 | Code |
| 46 | PR-Bert | 0.433 | No | - | - | - |
| 47 | DrKIT | 0.421 | No | - | - | - |
| 48 | Entity-centric BERT Pipeline | 0.418 | No | - | - | - |
| 49 | SAFSr-Bert | 0.394 | No | - | - | - |
| 50 | GoldEn Retriever | 0.379 | No | Answering Complex Open-domain Questions Through Iterative Query Generation | 2019-10-15 | Code |
| 51 | Cognitive Graph QA | 0.371 | No | Cognitive Graph for Multi-Hop Reading Comprehension at Scale | 2019-05-14 | Code |
| 52 | AnonymousQ | 0.369 | No | - | - | - |
| 53 | TPReasoner w/o BERT | 0.36 | No | - | - | - |
| 54 | IKFGraph | 0.358 | No | - | - | - |
| 55 | Entity-centric IR | 0.354 | No | - | - | - |
| 56 | HGN Model-reproduce | 0.335 | No | - | - | - |
| 57 | MultiQA | 0.307 | No | - | - | - |
| 58 | MUPPET | 0.306 | No | Multi-Hop Paragraph Retrieval for Open-Domain Question Answering | 2019-06-15 | Code |
| 59 | DecompRC | 0.3 | No | Multi-hop Reading Comprehension through Question Decomposition and Rescoring | 2019-06-07 | Code |
| 60 |  | 0.3 | No | - | - | - |
| 61 | GRN + BERT | 0.299 | No | - | - | - |
| 62 | SAFSr_model | 0.289 | No | - | - | - |
| 63 | QFE | 0.287 | No | Answering while Summarizing: Multi-task Learning for Multi-hop QA with Evidence Extraction | 2019-05-21 | - |
| 64 | SAQA | 0.284 | No | - | - | - |
| 65 | KGNN | 0.277 | No | Multi-Paragraph Reasoning with Knowledge-enhanced Graph Neural Network | 2019-11-06 | - |
| 66 | GRN | 0.273 | No | - | - | - |
| 67 | Baseline Model | 0.24 | No | HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering | 2018-09-25 | Code |
| 68 | SuppBERT | 0.236 | No | - | - | - |
| 69 | Mistral multi hop with very large sources | 0.08 | No | - | - | - |
| 70 | tes | 0.074 | No | - | - | - |