Question Answering on HotpotQA

Metric: ANS-EM (answer exact match; higher is better)
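As a rough illustration of what an answer exact-match score measures, the sketch below computes the fraction of questions whose predicted answer string equals the gold answer after light normalization. The normalization steps (lowercasing, dropping punctuation and English articles, collapsing whitespace) follow the common SQuAD-style convention and are an assumption here, not something this page specifies; the function names and the sample data are hypothetical.

```python
import re
import string
from typing import Mapping


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; the leaderboard's exact rules may differ.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> bool:
    """True when the normalized prediction equals the normalized gold answer."""
    return normalize_answer(prediction) == normalize_answer(gold)


def ans_em(predictions: Mapping[str, str], golds: Mapping[str, str]) -> float:
    """Fraction of gold questions answered with an exact match (0.0 to 1.0)."""
    if not golds:
        return 0.0
    # A question with no prediction counts as a miss.
    hits = sum(exact_match(predictions.get(qid, ""), gold) for qid, gold in golds.items())
    return hits / len(golds)


# Hypothetical example data, keyed by question id.
golds = {"q1": "The Eiffel Tower", "q2": "yes", "q3": "1999"}
preds = {"q1": "eiffel tower.", "q2": "no", "q3": "1999"}
score = ans_em(preds, golds)
```

Under these assumptions a score lies between 0 and 1, which is consistent with the scale of the ANS-EM values listed in the results.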

Results

#	Model	ANS-EM	Extra Data	Paper	Date	Code
1	Beam Retrieval	0.727	No	End-to-End Beam Retrieval for Multi-Hop Question Answering	2023-08-17	Code
2	AISO	0.675	No	Adaptive Information Seeking for Open-Domain Question Answering	2021-09-14	Code
3	Chain-of-Skills	0.674	No	Chain-of-Skills: A Configurable Model for Open-domain Question Answering	2023-05-04	Code
4	HopRetriever + Sp-search	0.671	No	HopRetriever: Retrieve Hops over Wikipedia to Answer Complex Questions	2020-12-31	-
5	HopRetriever	0.671	No	-	-	-
6	TPRR	0.67	No	-	-	-
7	IRRR+	0.663	No	Answering Open-Domain Questions of Varying Reasoning Steps from Text	2020-10-23	Code
8	EBS-Large	0.662	No	-	-	-
9	IRRR	0.657	No	Answering Open-Domain Questions of Varying Reasoning Steps from Text	2020-10-23	Code
10	EBS-SH	0.655	No	-	-	-
11	HopRetriever-V2	0.648	No	-	-	-
12	AFSGraph-retriever	0.646	No	-	-	-
13	Step-by-Step Retriever	0.63	No	-	-	-
14	DDRQA	0.625	No	Answering Any-hop Open-domain Questions with Iterative Document Reranking	2020-09-16	-
15	Recursive Dense Retriever	0.623	No	Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval	2020-09-27	Code
16	DR model large	0.62	No	-	-	-
17	Model name	0.617	No	-	-	-
18	HopAns	0.617	No	-	-	-
19	Multi-dimensional-AFSGraph	0.615	No	-	-	-
20	HopRetriever-V1	0.608	No	-	-	-
21	Anonymous	0.604	No	-	-	-
22	Tree-shaped-cluster	0.603	No	-	-	-
23	AFSgraph	0.601	No	-	-	-
24	AFSgraph model	0.601	No	-	-	-
25	Robustly Fine-tuned Graph-based Recurrent Retriever	0.6	No	Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering	2019-11-24	Code
26	RoBERTa-DenseRetriever-Fast	0.598	No	-	-	-
27	DPR-recurrent	0.598	No	-	-	-
28	HGN-albert + SemanticRetrievalMRS IR	0.597	No	-	-	-
29	RoBERTa-DenseRetriever	0.596	No	-	-	-
30	SAFSR model	0.589	No	HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering	2018-09-25	Code
31	DR model	0.588	No	-	-	-
32	GraphRR-Fast	0.582	No	-	-	-
33	PromptRank-fewshot-2-demo	0.581	No	-	-	-
34	graph-recurrent-retriever+roberta-base w. S/R-pretraining	0.581	No	-	-	-
35	HGN-large + SemanticRetrievalMRS IR	0.579	No	-	-	-
36	HGN + SemanticRetrievalMRS IR	0.567	No	Hierarchical Graph Network for Multi-hop Question Answering	2019-11-09	Code
37	Graph-based Recurrent Retriever	0.56	No	-	-	-
38	Quark + SemanticRetrievalMRS IR	0.555	No	A Simple Yet Strong Pipeline for HotpotQA	2020-04-14	-
39	MIR+EPS+BERT	0.529	No	-	-	-
40	GAR-BERT	0.523	No	-	-	-
41	Transformer-XH-final	0.516	No	-	-	Code
42	Transformer-XH	0.49	No	-	-	-
43	GAR	0.482	No	-	-	-
44	GAR-NOSF	0.475	No	-	-	-
45	SemanticRetrievalMRS	0.453	No	Revealing the Importance of Semantic Retrieval for Machine Reading at Scale	2019-09-17	Code
46	PR-Bert	0.433	No	-	-	-
47	DrKIT	0.421	No	-	-	-
48	Entity-centric BERT Pipeline	0.418	No	-	-	-
49	SAFSr-Bert	0.394	No	-	-	-
50	GoldEn Retriever	0.379	No	Answering Complex Open-domain Questions Through Iterative Query Generation	2019-10-15	Code
51	Cognitive Graph QA	0.371	No	Cognitive Graph for Multi-Hop Reading Comprehension at Scale	2019-05-14	Code
52	AnonymousQ	0.369	No	-	-	-
53	TPReasoner w/o BERT	0.36	No	-	-	-
54	IKFGraph	0.358	No	-	-	-
55	Entity-centric IR	0.354	No	-	-	-
56	HGN Model-reproduce	0.335	No	-	-	-
57	MultiQA	0.307	No	-	-	-
58	MUPPET	0.306	No	Multi-Hop Paragraph Retrieval for Open-Domain Question Answering	2019-06-15	Code
59	DecompRC	0.3	No	Multi-hop Reading Comprehension through Question Decomposition and Rescoring	2019-06-07	Code
60		0.3	No	-	-	-
61	GRN + BERT	0.299	No	-	-	-
62	SAFSr_model	0.289	No	-	-	-
63	QFE	0.287	No	Answering while Summarizing: Multi-task Learning for Multi-hop QA with Evidence Extraction	2019-05-21	-
64	SAQA	0.284	No	-	-	-
65	KGNN	0.277	No	Multi-Paragraph Reasoning with Knowledge-enhanced Graph Neural Network	2019-11-06	-
66	GRN	0.273	No	-	-	-
67	Baseline Model	0.24	No	HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering	2018-09-25	Code
68	SuppBERT	0.236	No	-	-	-
69	Mistral multi hop with very large sources	0.08	No	-	-	-
70	tes	0.074	No	-	-	-
