Retrieval with Learned Similarities

Bailu Ding, Jiaqi Zhai

2024-07-22 | Question Answering, Retrieval, Recommendation Systems

Abstract

Retrieval plays a fundamental role in recommendation systems, search, and natural language processing (NLP) by efficiently finding relevant items from a large corpus given a query. Dot products have been widely used as the similarity function in such tasks, enabled by Maximum Inner Product Search (MIPS) algorithms for efficient retrieval. However, state-of-the-art retrieval algorithms have migrated to learned similarities. These advanced approaches encompass multiple query embeddings, complex neural networks, direct item ID decoding via beam search, and hybrid solutions. Unfortunately, we lack efficient solutions for retrieval in these state-of-the-art setups. Our work addresses this gap by investigating efficient retrieval techniques with expressive learned similarity functions. We establish Mixture-of-Logits (MoL) as a universal approximator of similarity functions, demonstrate that MoL's expressiveness can be realized empirically to achieve superior performance on diverse retrieval scenarios, and propose techniques to retrieve the approximate top-$k$ results using MoL with tight error bounds. Through extensive experimentation, we show that MoL, enhanced by our proposed mutual information-based load balancing loss, sets new state-of-the-art results across heterogeneous scenarios, including sequential retrieval models in recommendation systems and fine-tuning language models for question answering. Our approximate top-$k$ algorithms outperform baselines by up to 66x in latency while achieving a recall rate above 0.99 compared to exact algorithms.
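To make the idea of a Mixture-of-Logits similarity concrete, the following is a minimal sketch, not the authors' implementation. It assumes MoL scores a (query, item) pair as a gated sum of dot products between several query-side and item-side component embeddings, with gating weights that sum to one. The names `mol_score`, `top_k`, and `gate_logits` are hypothetical. In a trained model the gating logits would come from a learned network; here they default to the dot products themselves purely for illustration. The `top_k` helper is an exact brute-force scan, not the approximate algorithm the abstract refers to.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    """Numerically stable softmax over a flat list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def mol_score(query_embs, item_embs, gate_logits=None):
    """Gated mixture of pairwise dot products (illustrative sketch).

    query_embs: list of query-side component vectors.
    item_embs: list of item-side component vectors.
    gate_logits: flat list with one entry per (query, item) component
        pair. ASSUMPTION: a real model would produce these with a
        learned gating network; None falls back to the logits themselves.
    """
    logits = [dot(q, x) for q in query_embs for x in item_embs]
    if gate_logits is None:
        gate_logits = logits  # placeholder gating, not from the paper
    weights = softmax(gate_logits)  # non-negative, sums to 1
    return sum(w * l for w, l in zip(weights, logits))


def top_k(query_embs, corpus, k):
    """Exact top-k by scoring every item (brute-force baseline)."""
    scored = [(mol_score(query_embs, embs), item_id)
              for item_id, embs in corpus.items()]
    scored.sort(key=lambda t: (-t[0], t[1]))  # ties broken by item id
    return [item_id for _, item_id in scored[:k]]
```

With uniform gating weights the score reduces to the mean of the component dot products, and with a single component on each side it reduces to an ordinary dot product, which is the MIPS setting the abstract contrasts against.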

Results

Task	Dataset	Metric	Value	Model
Recommendation Systems	MovieLens 1M	HR@10 (full corpus)	0.3412	HSTU+MoL
Recommendation Systems	MovieLens 1M	NDCG@10 (full corpus)	0.1979	HSTU+MoL
Recommendation Systems	Amazon-Book	HR@10	0.0613	HSTU+MoL
Recommendation Systems	Amazon-Book	HR@50	0.1292	HSTU+MoL
Recommendation Systems	Amazon-Book	NDCG@10	0.035	HSTU+MoL
Recommendation Systems	Amazon-Book	NDCG@50	0.0498	HSTU+MoL
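The HR@K and NDCG@K columns above can be read with the following sketch of how these metrics are commonly computed in sequential recommendation. The page does not state the evaluation protocol, so this assumes a single held-out target item per user and a 1-indexed rank of that item in the model's ordering of the corpus. The function names are hypothetical.

```python
import math


def hit_rate_at_k(rank, k):
    """1.0 if the target item is ranked within the top k, else 0.0."""
    return 1.0 if rank <= k else 0.0


def ndcg_at_k(rank, k):
    """NDCG with one relevant item: 1 / log2(rank + 1) inside the top k.

    With a single relevant item the ideal DCG is 1, so no further
    normalization is needed.
    """
    return 1.0 / math.log2(rank + 1) if rank <= k else 0.0


def evaluate(ranks, k):
    """Average HR@k and NDCG@k over the 1-indexed target ranks of all users."""
    n = len(ranks)
    hr = sum(hit_rate_at_k(r, k) for r in ranks) / n
    ndcg = sum(ndcg_at_k(r, k) for r in ranks) / n
    return hr, ndcg
```

For example, three users whose targets are ranked 1, 3, and 20 give `evaluate([1, 3, 20], 10)` an HR@10 of 2/3 and an NDCG@10 of 0.5, since the third target falls outside the top 10 and contributes nothing to either metric.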
