Question Answering on HotpotQA
Metric: ANS-F1 (higher is better)
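The page does not define ANS-F1. A minimal sketch, assuming it denotes the token-level F1 between a predicted and a gold answer string as used in SQuAD-style evaluation, averaged over questions and reported here on a 0 to 1 scale. The function names below are illustrative, and an official evaluation script may treat special cases (for example yes/no answers) differently.

```python
import re
import string
from collections import Counter

_PUNCTUATION = set(string.punctuation)


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in _PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def answer_f1(prediction: str, gold: str) -> float:
    """Token-level F1 between one predicted and one gold answer (assumed metric)."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def mean_answer_f1(pairs) -> float:
    """Average answer F1 over an iterable of (prediction, gold) pairs."""
    scores = [answer_f1(pred, gold) for pred, gold in pairs]
    return sum(scores) / len(scores)
```

Under this assumption, a prediction of "Paris France" against the gold answer "Paris" has precision 0.5 and recall 1.0, so it scores 2/3 rather than 0 or 1.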
Results
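| # | Model | ANS-F1 | Extra Data | Paper | Date | Code | SOTA |
|---|---|---|---|---|---|---|---|
| 1 | Beam Retrieval | 0.85 | No | End-to-End Beam Retrieval for Multi-Hop Question Answering | 2023-08-17 | Code | SOTA |
| 2 | AISO | 0.805 | No | Adaptive Information Seeking for Open-Domain Question Answering | 2021-09-14 | Code | SOTA |
| 3 | Chain-of-Skills | 0.801 | No | Chain-of-Skills: A Configurable Model for Open-domain Question Answering | 2023-05-04 | Code | - |
| 4 | HopRetriever + Sp-search | 0.799 | No | HopRetriever: Retrieve Hops over Wikipedia to Answer Complex Questions | 2020-12-31 | - | SOTA |
| 5 | HopRetriever | 0.799 | No | - | - | - | - |
| 6 | TPRR | 0.795 | No | - | - | - | - |
| 7 | EBS-Large | 0.793 | No | - | - | - | - |
| 8 | IRRR+ | 0.791 | No | Answering Open-Domain Questions of Varying Reasoning Steps from Text | 2020-10-23 | Code | SOTA |
| 9 | EBS-SH | 0.786 | No | - | - | - | - |
| 10 | IRRR | 0.782 | No | Answering Open-Domain Questions of Varying Reasoning Steps from Text | 2020-10-23 | Code | - |
| 11 | HopRetriever-V2 | 0.778 | No | - | - | - | - |
| 12 | AFSGraph-retriever | 0.778 | No | - | - | - | - |
| 13 | DDRQA | 0.759 | No | Answering Any-hop Open-domain Questions with Iterative Document Reranking | 2020-09-16 | - | SOTA |
| 14 | BigBird-etc | 0.755 | No | Big Bird: Transformers for Longer Sequences | 2020-07-28 | Code | SOTA |
| 15 | Step-by-Step Retriever | 0.754 | No | - | - | - | - |
| 16 | Recursive Dense Retriever | 0.753 | No | Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval | 2020-09-27 | Code | - |
| 17 | DR model large | 0.753 | No | - | - | - | - |
| 18 | Model name | 0.746 | No | - | - | - | - |
| 19 | HopAns | 0.746 | No | - | - | - | - |
| 20 | Multi-dimensional-AFSGraph | 0.746 | No | - | - | - | - |
| 21 | HopRetriever-V1 | 0.739 | No | - | - | - | - |
| 22 | Anonymous | 0.732 | No | - | - | - | - |
| 23 | Tree-shaped-cluster | 0.731 | No | - | - | - | - |
| 24 | AFSgraph | 0.73 | No | - | - | - | - |
| 25 | Robustly Fine-tuned Graph-based Recurrent Retriever | 0.73 | No | Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering | 2019-11-24 | Code | SOTA |
| 26 | AFSgraph model | 0.73 | No | - | - | - | - |
| 27 | RoBERTa-DenseRetriever-Fast | 0.727 | No | - | - | - | - |
| 28 | DPR-recurrent | 0.727 | No | - | - | - | - |
| 29 | RoBERTa-DenseRetriever | 0.724 | No | - | - | - | - |
| 30 | DR model | 0.717 | No | - | - | - | - |
| 31 | SAFSR model | 0.716 | No | HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering | 2018-09-25 | Code | SOTA |
| 32 | HGN-albert + SemanticRetrievalMRS IR | 0.714 | No | - | - | - | - |
| 33 | PromptRank-fewshot-2-demo | 0.711 | No | - | - | - | - |
| 34 | graph-recurrent-retriever+roberta-base w. S/R-pretraining | 0.71 | No | - | - | - | - |
| 35 | GraphRR-Fast | 0.709 | No | - | - | - | - |
| 36 | HGN-large + SemanticRetrievalMRS IR | 0.699 | No | - | - | - | - |
| 37 | HGN + SemanticRetrievalMRS IR | 0.692 | No | Hierarchical Graph Network for Multi-hop Question Answering | 2019-11-09 | Code | - |
| 38 | Graph-based Recurrent Retriever | 0.689 | No | - | - | - | - |
| 39 | Quark + SemanticRetrievalMRS IR | 0.675 | No | A Simple Yet Strong Pipeline for HotpotQA | 2020-04-14 | - | - |
| 40 | GAR-BERT | 0.648 | No | - | - | - | - |
| 41 | MIR+EPS+BERT | 0.648 | No | - | - | - | - |
| 42 | Transformer-XH-final | 0.641 | No | - | - | Code | - |
| 43 | GAR | 0.613 | No | - | - | - | - |
| 44 | Transformer-XH | 0.608 | No | - | - | - | - |
| 45 | GAR-NOSF | 0.606 | No | - | - | - | - |
| 46 | SemanticRetrievalMRS | 0.573 | No | Revealing the Importance of Semantic Retrieval for Machine Reading at Scale | 2019-09-17 | Code | - |
| 47 | PR-Bert | 0.538 | No | - | - | - | - |
| 48 | Entity-centric BERT Pipeline | 0.531 | No | - | - | - | - |
| 49 | DrKIT | 0.517 | No | - | - | - | - |
| 50 | SAFSr-Bert | 0.514 | No | - | - | - | - |
| 51 | Cognitive Graph QA | 0.489 | No | Cognitive Graph for Multi-Hop Reading Comprehension at Scale | 2019-05-14 | Code | - |
| 52 | GoldEn Retriever | 0.486 | No | Answering Complex Open-domain Questions Through Iterative Query Generation | 2019-10-15 | Code | - |
| 53 | TPReasoner w/o BERT | 0.474 | No | - | - | - | - |
| 54 | Entity-centric IR | 0.463 | No | - | - | - | - |
| 55 | AnonymousQ | 0.46 | No | - | - | - | - |
| 56 | IKFGraph | 0.453 | No | - | - | - | - |
| 57 | HGN Model-reproduce | 0.427 | No | - | - | - | - |
| 58 | DecompRC | 0.407 | No | Multi-hop Reading Comprehension through Question Decomposition and Rescoring | 2019-06-07 | Code | - |
| 59 |  | 0.407 | No | - | - | - | - |
| 60 | MUPPET | 0.403 | No | Multi-Hop Paragraph Retrieval for Open-Domain Question Answering | 2019-06-15 | Code | - |
| 61 | MultiQA | 0.402 | No | - | - | - | - |
| 62 | GRN + BERT | 0.391 | No | - | - | - | - |
| 63 | SAFSr_model | 0.391 | No | - | - | - | - |
| 64 | SAQA | 0.386 | No | - | - | - | - |
| 65 | QFE | 0.381 | No | Answering while Summarizing: Multi-task Learning for Multi-hop QA with Evidence Extraction | 2019-05-21 | - | - |
| 66 | KGNN | 0.372 | No | Multi-Paragraph Reasoning with Knowledge-enhanced Graph Neural Network | 2019-11-06 | - | - |
| 67 | GRN | 0.365 | No | - | - | - | - |
| 68 | Baseline Model | 0.329 | No | HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering | 2018-09-25 | Code | - |
| 69 | SuppBERT | 0.32 | No | - | - | - | - |
| 70 | Mistral multi hop with very large sources | 0.221 | No | - | - | - | - |
| 71 | tes | 0.121 | No | - | - | - | - |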