Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


End-to-End Beam Retrieval for Multi-Hop Question Answering

Jiahao Zhang, Haiyang Zhang, Dongmei Zhang, Yong Liu, Shen Huang

2023-08-17 · Question Answering · Multi-hop Question Answering · Large Language Model · Retrieval · Language Modelling

Paper · PDF · Code (official)

Abstract

Multi-hop question answering (QA) involves finding multiple relevant passages and step-by-step reasoning to answer complex questions, indicating a retrieve-and-read paradigm. However, previous retrievers were customized for two-hop questions, and most of them were trained separately across different hops, resulting in a lack of supervision over the entire multi-hop retrieval process and leading to poor performance in complicated scenarios beyond two hops. In this work, we introduce Beam Retrieval, an end-to-end beam retrieval framework for multi-hop QA. This approach models the multi-hop retrieval process in an end-to-end manner by jointly optimizing an encoder and two classification heads across all hops. Moreover, Beam Retrieval maintains multiple partial hypotheses of relevant passages at each step, expanding the search space and reducing the risk of missing relevant passages. To establish a complete QA system, we incorporate a supervised reader or a large language model (LLM). Experimental results demonstrate that Beam Retrieval achieves a nearly 50% improvement compared with baselines on challenging MuSiQue-Ans, and it also surpasses all previous retrievers on HotpotQA and achieves 99.9% precision on 2WikiMultiHopQA. Providing high-quality context, Beam Retrieval helps our supervised reader achieve new state-of-the-art performance and substantially improves the few-shot QA performance of LLMs.
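The core idea in the abstract — keeping multiple partial hypotheses of relevant passages at each hop rather than committing greedily — is standard beam search applied to passage chains. A minimal sketch of that idea, assuming a stand-in scoring function (the paper itself uses a jointly trained encoder with two classification heads, which is not reproduced here):

```python
# Minimal sketch of the beam-retrieval idea: at every hop, expand each
# surviving passage chain with every unused passage, score the extended
# chains, and keep only the top-B. The score function below is a toy
# stand-in (an assumption), not the paper's trained model.

def beam_retrieve(question, passages, score_fn, num_hops, beam_size):
    """Return the best-scoring passage chain of length `num_hops`.

    score_fn(question, chain) -> float; higher means more relevant.
    """
    beams = [((), 0.0)]  # (chain of passage ids, score)
    for _ in range(num_hops):
        candidates = []
        for chain, _ in beams:
            for pid in passages:
                if pid in chain:
                    continue  # each passage appears at most once per chain
                new_chain = chain + (pid,)
                candidates.append((new_chain, score_fn(question, new_chain)))
        # keep the top-B hypotheses, widening the search space vs. greedy
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    return beams[0][0]


# Toy corpus and scorer: score a chain by word overlap with the question.
docs = {
    "p1": "beam search keeps multiple hypotheses",
    "p2": "multi hop questions need several passages",
    "p3": "cooking pasta requires boiling water",
}

def overlap_score(question, chain):
    q = set(question.split())
    return sum(len(q & set(docs[p].split())) for p in chain)

best = beam_retrieve("multi hop beam search passages", docs, overlap_score,
                     num_hops=2, beam_size=2)
print(best)
```

With beam_size=1 this degenerates to greedy hop-by-hop retrieval; a wider beam is what lets the retriever recover when the locally best first-hop passage is not part of the gold chain.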

Results

Task                          | Dataset          | Metric   | Value | Model
Question Answering            | HotpotQA         | ANS-EM   | 0.727 | Beam Retrieval
Question Answering            | HotpotQA         | ANS-F1   | 0.85  | Beam Retrieval
Question Answering            | HotpotQA         | JOINT-EM | 0.505 | Beam Retrieval
Question Answering            | HotpotQA         | JOINT-F1 | 0.775 | Beam Retrieval
Question Answering            | HotpotQA         | SUP-EM   | 0.663 | Beam Retrieval
Question Answering            | HotpotQA         | SUP-F1   | 0.901 | Beam Retrieval
Multi-hop Question Answering  | MuSiQue-Ans      | An       | 69.2  | Beam Retrieval
Multi-hop Question Answering  | MuSiQue-Ans      | Sp       | 91.4  | Beam Retrieval

Related Papers

Visual-Language Model Knowledge Distillation Method for Image Quality Assessment (2025-07-21)
DENSE: Longitudinal Progress Note Generation with Temporal Modeling of Heterogeneous Clinical Notes Across Hospital Visits (2025-07-18)
From Roots to Rewards: Dynamic Tree Reasoning with RL (2025-07-17)
Enter the Mind Palace: Reasoning and Planning for Long-term Active Embodied Question Answering (2025-07-17)
Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It (2025-07-17)
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
GeoReg: Weight-Constrained Few-Shot Regression for Socio-Economic Estimation using LLM (2025-07-17)
The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations (2025-07-17)