Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval

Guilherme Moraes Rosa, Luiz Bonifacio, Vitor Jeronymo, Hugo Abonizio, Marzieh Fadaee, Roberto Lotufo, Rodrigo Nogueira

Published: 2022-06-06
Tasks: Question Answering, News Retrieval, Duplicate-Question Retrieval, Argument Retrieval, Fact Checking, Entity Retrieval, Tweet Retrieval, Information Retrieval, Biomedical Information Retrieval, Retrieval, Citation Prediction
Links: Paper · PDF · Code (official)

Abstract

Recent work has shown that small distilled language models are strong competitors to models that are orders of magnitude larger and slower in a wide range of information retrieval tasks. This has made distilled and dense models, due to latency constraints, the go-to choice for deployment in real-world retrieval applications. In this work, we question this practice by showing that the number of parameters and early query-document interaction play a significant role in the generalization ability of retrieval models. Our experiments show that increasing model size results in marginal gains on in-domain test sets, but much larger gains in new domains never seen during fine-tuning. Furthermore, we show that rerankers largely outperform dense models of similar size in several tasks. Our largest reranker reaches the state of the art in 12 of the 18 datasets of the Benchmark-IR (BEIR) and surpasses the previous state of the art by 3 average points. Finally, we confirm that in-domain effectiveness is not a good indicator of zero-shot effectiveness. Code is available at https://github.com/guilhermemr04/scaling-zero-shot-retrieval.git.
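The "early query-document interaction" the abstract credits is what distinguishes a reranker like monoT5 from a dense retriever: the query and document are encoded together in one input, rather than embedded separately. As a minimal sketch (the prompt template follows the monoT5 paper; the scoring comment describes its approach, not code run here), the reranker input looks like this:

```python
def monot5_input(query: str, document: str) -> str:
    """Build the text-to-text reranking prompt used by monoT5.

    The model is asked to generate "true" or "false" after this prompt,
    and the probability it assigns to "true" serves as the relevance
    score for ranking documents against the query.
    """
    return f"Query: {query} Document: {document} Relevant:"

# Example: one prompt per candidate document retrieved for the query.
prompt = monot5_input("what causes tides", "Tides are driven by the Moon's gravity.")
```

Because every query-document pair requires a full forward pass, this design trades latency for the cross-attention between query and document tokens that the paper argues drives zero-shot generalization.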

Results

| Task                             | Dataset              | Metric  | Value | Model     |
|----------------------------------|----------------------|---------|-------|-----------|
| Question Answering               | HotpotQA (BEIR)      | nDCG@10 | 0.759 | monoT5-3B |
| Question Answering               | NQ (BEIR)            | nDCG@10 | 0.633 | monoT5-3B |
| Question Answering               | FiQA-2018 (BEIR)     | nDCG@10 | 0.513 | monoT5-3B |
| Biomedical Information Retrieval | NFCorpus (BEIR)      | nDCG@10 | 0.383 | monoT5-3B |
| Biomedical Information Retrieval | BioASQ (BEIR)        | nDCG@10 | 0.579 | monoT5-3B |
| Biomedical Information Retrieval | TREC-COVID (BEIR)    | nDCG@10 | 0.795 | monoT5-3B |
| Fact Checking                    | CLIMATE-FEVER (BEIR) | nDCG@10 | 0.28  | monoT5-3B |
| Fact Checking                    | FEVER (BEIR)         | nDCG@10 | 0.849 | monoT5-3B |
| Fact Checking                    | SciFact (BEIR)       | nDCG@10 | 0.777 | monoT5-3B |
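All results above report nDCG@10, the standard BEIR metric. A minimal sketch of its computation (the common formulation with a linear gain and log2(rank+1) discount; BEIR's official tooling uses pytrec_eval, which may differ in details such as graded-gain handling):

```python
import math

def dcg_at_k(rels, k=10):
    """Discounted cumulative gain over the top-k ranked results.

    `rels` lists relevance grades in ranked order; position i (0-based)
    is discounted by log2(i + 2), i.e. log2(rank + 1) for 1-based ranks.
    """
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels[:k]))

def ndcg_at_k(rels, k=10):
    """Normalize DCG by the DCG of the ideal (descending-grade) ordering."""
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0
```

For example, a ranking that places its only two relevant documents at positions 1 and 3 scores `ndcg_at_k([1, 0, 1])` ≈ 0.92, while a perfect ordering scores 1.0; the per-dataset numbers above are this value averaged over all test queries.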

Related Papers

- PiMRef: Detecting and Explaining Ever-evolving Spear Phishing Emails with Knowledge Base Invariants (2025-07-21)
- From Roots to Rewards: Dynamic Tree Reasoning with RL (2025-07-17)
- Enter the Mind Palace: Reasoning and Planning for Long-term Active Embodied Question Answering (2025-07-17)
- Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It (2025-07-17)
- City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
- Overview of the TalentCLEF 2025: Skill and Job Title Intelligence for Human Capital Management (2025-07-17)
- HapticCap: A Multimodal Dataset and Task for Understanding User Experience of Vibration Haptic Signals (2025-07-17)
- A Survey of Context Engineering for Large Language Models (2025-07-17)