Elias Ramzi, Nicolas Audebert, Nicolas Thome, Clément Rambour, Xavier Bitot
Image retrieval is commonly evaluated with Average Precision (AP) or Recall@k. Yet those metrics are limited to binary labels and do not take the severity of errors into account. This paper introduces a new hierarchical AP training method for pertinent image retrieval (HAPPIER). HAPPIER is based on a new H-AP metric, which leverages a concept hierarchy to refine AP by integrating the importance of errors and to better evaluate rankings. To train deep models with H-AP, we carefully study the problem's structure and design a smooth lower-bound surrogate combined with a clustering loss that ensures consistent ordering. Extensive experiments on 6 datasets show that HAPPIER significantly outperforms state-of-the-art methods for hierarchical retrieval, while being on par with the latest approaches when evaluating fine-grained ranking performance. Finally, we show that HAPPIER leads to a better organization of the embedding space and prevents the most severe failure cases of non-hierarchical methods. Our code is publicly available at: https://github.com/elias-ramzi/HAPPIER.
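To make the limitation of binary metrics concrete, the following minimal sketch computes standard AP and the retrieval-style Recall@k on a ranked list of 0/1 relevance labels. It is not taken from the HAPPIER code base and does not implement H-AP; the function names and the toy labels (`species_x`, `same_genus_other_species`, `unrelated_species`) are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def average_precision(relevance: Sequence[int]) -> float:
    """Standard AP for one query.

    relevance[i] is 1 if the item at rank i + 1 is relevant, else 0.
    Returns 0.0 when the list contains no relevant item.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / hits if hits else 0.0


def recall_at_k(relevance: Sequence[int], k: int) -> float:
    """Retrieval-style Recall@k: 1.0 if any of the top-k items is relevant."""
    return 1.0 if any(relevance[:k]) else 0.0


# Hypothetical labels: both rankings make one mistake at rank 2,
# but the mistake in ranking_b is semantically much further from the query.
query_label = "species_x"
ranking_a = ["species_x", "same_genus_other_species", "species_x"]
ranking_b = ["species_x", "unrelated_species", "species_x"]

# Binary relevance discards how severe each mistake is.
rel_a = [int(label == query_label) for label in ranking_a]
rel_b = [int(label == query_label) for label in ranking_b]

ap_a = average_precision(rel_a)
ap_b = average_precision(rel_b)
print(ap_a, ap_b)
```

Both rankings reduce to the same binary relevance list, so AP and Recall@k cannot tell them apart, even though the error in `ranking_b` is arguably more severe. This is the gap that a hierarchy-aware metric such as H-AP is meant to address.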
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image Retrieval | iNaturalist | R@1 | 71 | HAPPIER_F (ResNet-50) |
| Image Retrieval | iNaturalist | R@1 | 70.7 | HAPPIER (ResNet-50) |
| Metric Learning | DyML-Vehicle | Average-mAP | 37 | HAPPIER |
| Metric Learning | Stanford Online Products | R@1 | 81.8 | HAPPIER_F |
| Metric Learning | Stanford Online Products | R@1 | 81 | HAPPIER |
| Metric Learning | DyML-Animal | Average-mAP | 43.8 | HAPPIER |
| Metric Learning | DyML-Product | Average-mAP | 38 | HAPPIER |