Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

On the Unreasonable Effectiveness of Centroids in Image Retrieval

Mikolaj Wieczorek, Barbara Rychalska, Jacek Dabrowski

2021-04-28 | Person Re-Identification | Retrieval | Image Retrieval

Abstract

The image retrieval task consists of finding images similar to a query image in a set of gallery (database) images. Such systems are used in various applications, e.g., person re-identification (ReID) or visual product search. Despite active development of retrieval models, it remains a challenging task, mainly because intra-class variance is large (caused by changes in view angle, lighting, background clutter, or occlusion) while inter-class variance may be relatively low. A large portion of current research focuses on creating more robust features and modifying objective functions, usually based on Triplet Loss. Some works experiment with a centroid/proxy representation of a class to alleviate the problems of computing speed and hard sample mining that arise with Triplet Loss. However, these approaches are used for training alone and discarded during the retrieval stage. In this paper we propose to use the mean centroid representation during both training and retrieval. Such an aggregated representation is more robust to outliers and ensures more stable features. As each class is represented by a single embedding, the class centroid, both retrieval time and storage requirements are reduced significantly. Aggregating multiple embeddings significantly reduces the search space by lowering the number of candidate target vectors, which makes the method especially suitable for production deployments. Comprehensive experiments conducted on two ReID and Fashion Retrieval datasets demonstrate the effectiveness of our method, which outperforms the current state-of-the-art. We propose centroid training and retrieval as a viable method for both Fashion Retrieval and ReID applications.
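The retrieval-side idea in the abstract (one centroid per class instead of one vector per gallery image) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `class_centroids` and `retrieve`, the toy 2-D embeddings, and the choice of Euclidean distance are all assumptions made for illustration, and the centroid-based training objective is not covered here.

```python
import numpy as np


def class_centroids(embeddings, labels):
    """Average the gallery embeddings of each class into one centroid."""
    classes = np.unique(labels)
    centroids = np.stack([embeddings[labels == c].mean(axis=0) for c in classes])
    return classes, centroids


def retrieve(query, classes, centroids, top_k=1):
    """Rank classes by Euclidean distance from the query to each centroid.

    Euclidean distance is an assumption of this sketch, not a detail
    taken from the abstract.
    """
    distances = np.linalg.norm(centroids - query, axis=1)
    order = np.argsort(distances)[:top_k]
    return classes[order].tolist()


# Hypothetical toy gallery: five 2-D embeddings spread over two classes.
gallery = np.array([[0.0, 0.0], [0.2, 0.0], [0.1, 0.3],
                    [1.0, 1.0], [1.2, 0.8]])
gallery_labels = np.array([0, 0, 0, 1, 1])

classes, centroids = class_centroids(gallery, gallery_labels)
query = np.array([0.9, 1.1])
ranking = retrieve(query, classes, centroids, top_k=2)

print(centroids.shape)  # one row per class rather than one per gallery image
print(ranking)          # class labels, nearest centroid first
```

In this toy setup the query is compared against two centroids rather than five gallery vectors, which is the search-space reduction the abstract describes; on a real gallery the saving grows with the number of images per class.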

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Person Re-Identification | DukeMTMC-reID | Rank-1 | 95.6 | CTL Model (ResNet50, 256x128) |
| Person Re-Identification | DukeMTMC-reID | Rank-5 | 96.2 | CTL Model (ResNet50, 256x128) |
| Person Re-Identification | DukeMTMC-reID | Rank-10 | 97.9 | CTL Model (ResNet50, 256x128) |
| Person Re-Identification | DukeMTMC-reID | mAP | 96.1 | CTL Model (ResNet50, 256x128) |
| Image Retrieval | DeepFashion - Consumer-to-shop | Rank-1 | 37.3 | CTL Model (ResNet50-IBN-A, 320x320) |
| Image Retrieval | DeepFashion - Consumer-to-shop | Rank-10 | 71.2 | CTL Model (ResNet50-IBN-A, 320x320) |
| Image Retrieval | DeepFashion - Consumer-to-shop | Rank-20 | 77.7 | CTL Model (ResNet50-IBN-A, 320x320) |
| Image Retrieval | DeepFashion - Consumer-to-shop | Rank-50 | 85 | CTL Model (ResNet50-IBN-A, 320x320) |
| Image Retrieval | DeepFashion - Consumer-to-shop | mAP | 49.2 | CTL Model (ResNet50-IBN-A, 320x320) |
| Image Retrieval | DeepFashion - Consumer-to-shop | Rank-1 | 29.4 | CTL Model (ResNet50, 256x128) |
| Image Retrieval | DeepFashion - Consumer-to-shop | Rank-10 | 61.3 | CTL Model (ResNet50, 256x128) |
| Image Retrieval | DeepFashion - Consumer-to-shop | Rank-20 | 68.9 | CTL Model (ResNet50, 256x128) |
| Image Retrieval | DeepFashion - Consumer-to-shop | Rank-50 | 77.4 | CTL Model (ResNet50, 256x128) |
| Image Retrieval | DeepFashion - Consumer-to-shop | mAP | 40.4 | CTL Model (ResNet50, 256x128) |
| Image Retrieval | Exact Street2Shop | Rank-1 | 53.7 | CTL Model (ResNet50-IBN-A, 320x320) |
| Image Retrieval | Exact Street2Shop | Rank-10 | 70.9 | CTL Model (ResNet50-IBN-A, 320x320) |
| Image Retrieval | Exact Street2Shop | Rank-20 | 75 | CTL Model (ResNet50-IBN-A, 320x320) |
| Image Retrieval | Exact Street2Shop | Rank-50 | 79.2 | CTL Model (ResNet50-IBN-A, 320x320) |
| Image Retrieval | Exact Street2Shop | mAP | 59.8 | CTL Model (ResNet50-IBN-A, 320x320) |
| Image Retrieval | Exact Street2Shop | Rank-1 | 43.2 | CTL Model (ResNet50, 256x128) |
| Image Retrieval | Exact Street2Shop | Rank-10 | 61.9 | CTL Model (ResNet50, 256x128) |
| Image Retrieval | Exact Street2Shop | Rank-20 | 66 | CTL Model (ResNet50, 256x128) |
| Image Retrieval | Exact Street2Shop | Rank-50 | 72.1 | CTL Model (ResNet50, 256x128) |
| Image Retrieval | Exact Street2Shop | mAP | 49.8 | CTL Model (ResNet50, 256x128) |
