Gleb Mezentsev, Danil Gusak, Ivan Oseledets, Evgeny Frolov
Scalability plays a crucial role in productionizing modern recommender systems. Even lightweight architectures may suffer from high computational overhead due to intermediate calculations, limiting their practicality in real-world applications. Specifically, applying full Cross-Entropy (CE) loss often yields state-of-the-art performance in terms of recommendation quality, but it suffers from excessive GPU memory utilization when dealing with large item catalogs. This paper introduces a novel Scalable Cross-Entropy (SCE) loss function in the sequential learning setup. It approximates the CE loss for datasets with large catalogs, improving both time efficiency and memory usage without compromising recommendation quality. Unlike traditional negative sampling methods, our approach utilizes a selective GPU-efficient computation strategy, focusing on the most informative elements of the catalog, particularly those most likely to be false positives. This is achieved by approximating the softmax distribution over a subset of the model outputs through maximum inner product search. Experimental results on multiple datasets demonstrate the effectiveness of SCE in reducing peak memory usage by a factor of up to 100 compared to the alternatives, while matching or even exceeding their metric values. The proposed approach also opens new perspectives for large-scale developments in different domains, such as large language models.
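The idea of restricting the softmax to the highest-scoring catalog items can be illustrated with a minimal sketch. This is not the authors' SCE implementation: the function names (`full_ce`, `topk_ce`), the toy data, and the brute-force scoring used as a stand-in for maximum inner product search are all assumptions made for illustration. Because the stand-in scores every item, it saves no memory by itself; it only shows how a loss over the target plus its `k` hardest negatives relates to the full CE loss.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def full_ce(hidden, item_emb, target):
    """Full cross-entropy: the softmax runs over the whole catalog."""
    logits = [dot(hidden, e) for e in item_emb]
    return logsumexp(logits) - logits[target]


def topk_ce(hidden, item_emb, target, k):
    """Approximate CE over the target and the k highest-scoring other items.

    Assumption: exhaustive scoring replaces a real MIPS routine here.
    """
    logits = [dot(hidden, e) for e in item_emb]
    negatives = [i for i in range(len(item_emb)) if i != target]
    hardest = sorted(negatives, key=lambda i: logits[i], reverse=True)[:k]
    kept = [logits[target]] + [logits[i] for i in hardest]
    return logsumexp(kept) - logits[target]


# Toy data (hypothetical): 50 items with 8-dimensional embeddings.
random.seed(0)
item_emb = [[random.gauss(0.0, 1.0) for _ in range(8)] for _ in range(50)]
hidden = [random.gauss(0.0, 1.0) for _ in range(8)]
target = 7

loss_full = full_ce(hidden, item_emb, target)
loss_top10 = topk_ce(hidden, item_emb, target, k=10)
```

Dropping low-scoring items can only shrink the log-sum-exp term, so the truncated loss never exceeds the full loss and approaches it as `k` grows; the items that matter most are the high-scoring negatives, which matches the abstract's focus on likely false positives.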
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Recommendation Systems | Amazon Beauty | HR@10 | 0.0935 | SASRec-SCE |
| Recommendation Systems | Amazon Beauty | NDCG@10 | 0.0544 | SASRec-SCE |
| Recommendation Systems | Gowalla | COV@1 | 0.0304 | SASRec-SCE |
| Recommendation Systems | Gowalla | COV@10 | 0.219 | SASRec-SCE |
| Recommendation Systems | Gowalla | COV@5 | 0.126 | SASRec-SCE |
| Recommendation Systems | Gowalla | HR@10 | 0.0831 | SASRec-SCE |
| Recommendation Systems | Gowalla | HR@5 | 0.0574 | SASRec-SCE |
| Recommendation Systems | Gowalla | NDCG@1 | 0.0207 | SASRec-SCE |
| Recommendation Systems | Gowalla | NDCG@10 | 0.0476 | SASRec-SCE |
| Recommendation Systems | Gowalla | NDCG@5 | 0.0393 | SASRec-SCE |
| Recommendation Systems | Behance | COV@1 | 0.0393 | SASRec-SCE |
| Recommendation Systems | Behance | COV@10 | 0.25 | SASRec-SCE |
| Recommendation Systems | Behance | COV@5 | 0.153 | SASRec-SCE |
| Recommendation Systems | Behance | HR@10 | 0.113 | SASRec-SCE |
| Recommendation Systems | Behance | HR@5 | 0.0853 | SASRec-SCE |
| Recommendation Systems | Behance | NDCG@1 | 0.0277 | SASRec-SCE |
| Recommendation Systems | Behance | NDCG@10 | 0.0663 | SASRec-SCE |
| Recommendation Systems | Behance | NDCG@5 | 0.0572 | SASRec-SCE |
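
For readers unfamiliar with the metrics in the table, the sketch below shows one common way to compute HR@k, NDCG@k, and COV@k when each user has a single held-out target item. These are the standard textbook definitions, assumed here for illustration; the exact evaluation protocol behind the reported values is not stated in this section, and the function names and toy lists are hypothetical.

```python
import math


def hit_rate_at_k(top_lists, targets, k):
    """Share of users whose held-out item appears in their top-k list."""
    hits = sum(1 for recs, t in zip(top_lists, targets) if t in recs[:k])
    return hits / len(targets)


def ndcg_at_k(top_lists, targets, k):
    """NDCG@k with one relevant item per user: 1 / log2(rank + 1) on a hit."""
    total = 0.0
    for recs, t in zip(top_lists, targets):
        if t in recs[:k]:
            rank = recs.index(t) + 1  # 1-based position of the target
            total += 1.0 / math.log2(rank + 1)
    return total / len(targets)


def coverage_at_k(top_lists, catalog_size, k):
    """Share of catalog items recommended to at least one user in the top k."""
    seen = set()
    for recs in top_lists:
        seen.update(recs[:k])
    return len(seen) / catalog_size


# Toy example (hypothetical): 3 users, a 6-item catalog, ranked item ids.
top_lists = [[3, 1, 2], [0, 4, 1], [2, 3, 0]]
targets = [1, 2, 2]

hr = hit_rate_at_k(top_lists, targets, k=2)
ndcg = ndcg_at_k(top_lists, targets, k=2)
cov = coverage_at_k(top_lists, catalog_size=6, k=2)
```

All three quantities lie between 0 and 1 under these definitions, which is the scale used for the values in the table.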