Revealing the Importance of Semantic Retrieval for Machine Reading at Scale

Yixin Nie, Songhe Wang, Mohit Bansal

Date: 2019-09-17
Venue: IJCNLP 2019 11
Tasks: Reading Comprehension, Representation Learning, Information Retrieval, Semantic Retrieval, Retrieval, Fact Verification

Abstract

Machine Reading at Scale (MRS) is a challenging task in which a system is given an input query and is asked to produce a precise output by "reading" information from a large knowledge base. The task has gained popularity with its natural combination of information retrieval (IR) and machine comprehension (MC). Advancements in representation learning have led to separated progress in both IR and MC; however, very few studies have examined the relationship and combined design of retrieval and comprehension at different levels of granularity, for development of MRS systems. In this work, we give general guidelines on system design for MRS by proposing a simple yet effective pipeline system with special consideration on hierarchical semantic retrieval at both paragraph and sentence level, and their potential effects on the downstream task. The system is evaluated on both fact verification and open-domain multihop QA, achieving state-of-the-art results on the leaderboard test sets of both FEVER and HOTPOTQA. To further demonstrate the importance of semantic retrieval, we present ablation and analysis studies to quantify the contribution of neural retrieval modules at both paragraph-level and sentence-level, and illustrate that intermediate semantic retrieval modules are vital for not only effectively filtering upstream information and thus saving downstream computation, but also for shaping upstream data distribution and providing better data for downstream modeling. Code/data made publicly available at: https://github.com/easonnie/semanticRetrievalMRS
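
The hierarchical retrieval idea in the abstract (filter paragraphs first, then filter sentences within the surviving paragraphs, then pass the remaining evidence downstream) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper uses neural retrieval modules, whereas `overlap_score` below is a toy token-overlap stand-in, and every name, threshold, and the naive sentence splitter are assumptions made for illustration only.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical scorer type: (query, text) -> relevance in [0, 1].
Scorer = Callable[[str, str], float]


def tokens(text: str) -> set:
    """Lowercased word tokens, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))


def overlap_score(query: str, text: str) -> float:
    """Toy stand-in for a neural scorer: fraction of query tokens found in the text."""
    q = tokens(query)
    return len(q & tokens(text)) / len(q) if q else 0.0


def split_sentences(paragraph: str) -> List[str]:
    """Naive sentence splitter on periods (an assumption, for illustration)."""
    return [s.strip() for s in paragraph.split(".") if s.strip()]


def hierarchical_retrieve(
    query: str,
    paragraphs: Dict[str, str],
    para_scorer: Scorer = overlap_score,
    sent_scorer: Scorer = overlap_score,
    para_threshold: float = 0.5,
    sent_threshold: float = 0.5,
) -> List[Tuple[str, str]]:
    """Return (title, sentence) evidence pairs that survive both retrieval levels."""
    # Level 1: paragraph-level filtering shrinks the candidate pool.
    kept = [
        (title, text)
        for title, text in paragraphs.items()
        if para_scorer(query, text) >= para_threshold
    ]
    # Level 2: sentence-level filtering inside the surviving paragraphs only.
    evidence = []
    for title, text in kept:
        for sentence in split_sentences(text):
            if sent_scorer(query, sentence) >= sent_threshold:
                evidence.append((title, sentence))
    return evidence


corpus = {
    "Paris": "Paris is the capital of France. It has many museums.",
    "Banana": "Bananas are yellow. They grow in bunches.",
}
evidence = hierarchical_retrieve("capital of France", corpus)
```

In this toy setup the downstream reader or verifier would receive only the `evidence` pairs, which is the sense in which intermediate retrieval both reduces downstream computation and shapes the data the downstream model sees.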

Results

Task                Dataset   Metric    Value  Model
Question Answering  HotpotQA  ANS-EM    0.453  SemanticRetrievalMRS
Question Answering  HotpotQA  ANS-F1    0.573  SemanticRetrievalMRS
Question Answering  HotpotQA  JOINT-EM  0.251  SemanticRetrievalMRS
Question Answering  HotpotQA  JOINT-F1  0.476  SemanticRetrievalMRS
Question Answering  HotpotQA  SUP-EM    0.387  SemanticRetrievalMRS
Question Answering  HotpotQA  SUP-F1    0.708  SemanticRetrievalMRS
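
The page does not define the metrics in the table above. As a rough guide, the sketch below shows how exact match (EM) and F1 are commonly computed for answer strings, and how joint scores are usually derived from answer (ANS) and supporting-fact (SUP) scores in HotpotQA-style evaluation. The normalization rules and the joint formulas are assumptions based on common practice, not something stated in this listing; all function names are hypothetical.

```python
import re
import string
from collections import Counter
from typing import List, Tuple


def normalize(text: str) -> str:
    """SQuAD-style normalization (assumed): lowercase, drop punctuation and articles."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(pred: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(pred) == normalize(gold))


def prf(pred_items: List[str], gold_items: List[str]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 over two item lists (tokens or supporting facts)."""
    overlap = sum((Counter(pred_items) & Counter(gold_items)).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / len(pred_items)
    recall = overlap / len(gold_items)
    return precision, recall, 2 * precision * recall / (precision + recall)


def joint_scores(
    ans_em: float, ans_p: float, ans_r: float,
    sup_em: float, sup_p: float, sup_r: float,
) -> Tuple[float, float]:
    """Joint EM and joint F1 (assumed definition): multiply answer and
    supporting-fact precision and recall, then take the harmonic mean."""
    joint_p = ans_p * sup_p
    joint_r = ans_r * sup_r
    joint_f1 = 2 * joint_p * joint_r / (joint_p + joint_r) if joint_p + joint_r > 0 else 0.0
    return ans_em * sup_em, joint_f1
```

Under this definition a prediction gets joint credit only when both the answer and its supporting facts are right, which is consistent with the JOINT rows sitting below the corresponding ANS and SUP rows.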
