Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Classification is a Strong Baseline for Deep Metric Learning

Andrew Zhai, Hao-Yu Wu

2018-11-30
Binarization, Content-Based Image Retrieval, Face Verification, Metric Learning, Clustering, General Classification, Retrieval, Classification, Image Retrieval

Abstract

Deep metric learning aims to learn a function mapping image pixels to embedding feature vectors that model the similarity between images. Two major applications of metric learning are content-based image retrieval and face verification. For the retrieval tasks, the majority of current state-of-the-art (SOTA) approaches rely on triplet-based non-parametric training. For the face verification tasks, however, recent SOTA approaches have adopted classification-based parametric training. In this paper, we look into the effectiveness of classification-based approaches on image retrieval datasets. We evaluate on several standard retrieval datasets such as CAR-196, CUB-200-2011, Stanford Online Product, and In-Shop datasets for image retrieval and clustering, and establish that our classification-based approach is competitive across different feature dimensions and base feature networks. We further provide insights into the performance effects of subsampling classes for scalable classification-based training, and the effects of binarization, enabling efficient storage and computation for practical applications.
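
To make the abstract's two main ideas concrete, here is a minimal NumPy sketch of classification-based embedding training and of binarization. It is an illustration under assumptions, not the authors' implementation: the abstract does not give the loss formula, so the cosine-similarity softmax, the `temperature` value of 0.05, the threshold-at-zero binarization, and every function name below are assumed for the example.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row of x to unit L2 norm."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def normalized_softmax_loss(embeddings, class_weights, labels, temperature=0.05):
    """Cross-entropy over cosine similarities between embeddings and class weights.

    embeddings:    (N, D) array of image embeddings
    class_weights: (C, D) array, one learned vector per class (assumed setup)
    labels:        (N,) integer class ids
    temperature:   assumed illustrative value; smaller values sharpen the softmax
    """
    e = l2_normalize(embeddings)
    w = l2_normalize(class_weights)
    logits = e @ w.T / temperature
    # Subtract the row maximum for numerical stability before exponentiating.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())


def binarize(embeddings):
    """Assumed binarization: one bit per dimension, 1 where the value is positive."""
    return (embeddings > 0).astype(np.uint8)


def hamming_distance(a, b):
    """Number of positions at which two binary codes differ."""
    return int(np.count_nonzero(a != b))


if __name__ == "__main__":
    weights = np.eye(3)
    embeddings = np.eye(3)
    print(normalized_softmax_loss(embeddings, weights, np.array([0, 1, 2])))
    print(binarize(np.array([[0.5, -0.2, 0.0]])))
```

Binary codes of this kind can be compared with Hamming distance instead of floating-point dot products, which is one way binarization could reduce storage and computation as the abstract suggests.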

Results

Task             Dataset       Metric  Value  Model
Image Retrieval  CARS196       R@1     89.3   NormSoftmax2048 (ResNet-50)
Image Retrieval  SOP           R@1     79.5   NormSoftmax2048 (ResNet-50)
Image Retrieval  In-Shop       R@1     89.4   NormSoftmax2048 (ResNet-50)
Image Retrieval  CUB-200-2011  R@1     65.3   NormSoftmax2048 (ResNet-50)
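
The R@1 metric in the table is Recall@1: the percentage of query images whose nearest neighbor in embedding space belongs to the same class. The sketch below shows one common way to compute Recall@K with NumPy. It assumes cosine similarity and a single set of images that serves as both queries and gallery; the function name `recall_at_k` is hypothetical, and benchmarks with separate query and gallery splits would need a small variation.

```python
import numpy as np


def recall_at_k(embeddings, labels, k=1):
    """Fraction of queries with at least one same-class item among their k nearest neighbors.

    embeddings: (N, D) array; each row is both a query and a gallery item (assumed protocol)
    labels:     (N,) integer class ids
    """
    labels = np.asarray(labels)
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    e = embeddings / np.maximum(norms, 1e-12)
    sims = e @ e.T
    # A query must not retrieve itself, so mask the diagonal.
    np.fill_diagonal(sims, -np.inf)
    topk = np.argsort(-sims, axis=1)[:, :k]
    hits = (labels[topk] == labels[:, None]).any(axis=1)
    return float(hits.mean())


if __name__ == "__main__":
    emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
    print(recall_at_k(emb, np.array([0, 0, 1, 1]), k=1))
```

Multiplying the returned fraction by 100 gives a percentage on the same scale as the values reported in the table.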

Related Papers

ProxyFusion: Face Feature Aggregation Through Sparse Experts (2025-09-24)
Tri-Learn Graph Fusion Network for Attributed Graph Clustering (2025-07-18)
DiffClean: Diffusion-based Makeup Removal for Accurate Age Estimation (2025-07-17)
Unsupervised Ground Metric Learning (2025-07-17)
From Roots to Rewards: Dynamic Tree Reasoning with RL (2025-07-17)
HapticCap: A Multimodal Dataset and Task for Understanding User Experience of Vibration Haptic Signals (2025-07-17)
A Survey of Context Engineering for Large Language Models (2025-07-17)
MCoT-RE: Multi-Faceted Chain-of-Thought and Re-Ranking for Training-Free Zero-Shot Composed Image Retrieval (2025-07-17)