Yash Patel, Giorgos Tolias, Jiri Matas
This work focuses on learning deep visual representation models for retrieval by exploring the interplay between a new loss function, the batch size, and a new regularization approach. An evaluation metric cannot be optimized directly by gradient descent when it is non-differentiable, which is the case for recall in retrieval. This work proposes a differentiable surrogate loss for recall. Using an implementation that sidesteps the hardware constraints of GPU memory, the method trains with a very large batch size, which is essential for metrics computed on the entire retrieval database. Training is assisted by an efficient mixup regularization approach that operates on pairwise scalar similarities and virtually increases the batch size further. When used for deep metric learning, the method achieves state-of-the-art performance on several image retrieval benchmarks. For instance-level recognition, it outperforms similar approaches that train using an approximation of average precision.
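To illustrate why recall needs a surrogate, the following is a minimal sketch rather than the authors' implementation. It assumes that recall@k for one query is the fraction of its positives ranked in the top k, and that each hard step function is relaxed with a sigmoid. The abstract does not spell out either choice. The function names, the temperatures `tau_rank` and `tau_k`, and the `0.5` offset are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x, temperature):
    """Numerically stable sigmoid of x / temperature."""
    z = x / temperature
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def exact_recall_at_k(sims, is_positive, k):
    """Hard recall@k for one query: fraction of positives in the top k.

    Piecewise constant in `sims`, so its gradient is zero almost everywhere.
    """
    n_pos = sum(is_positive)
    if n_pos == 0:
        raise ValueError("query has no positives")
    order = sorted(range(len(sims)), key=lambda i: -sims[i])
    return sum(1 for i in order[:k] if is_positive[i]) / n_pos


def smooth_recall_at_k(sims, is_positive, k, tau_rank=0.01, tau_k=1.0):
    """Sigmoid-relaxed recall@k for one query (illustrative assumption).

    sims:        similarity of the query to each database item
    is_positive: True where the item shares the query's label
    """
    positives = [i for i, p in enumerate(is_positive) if p]
    if not positives:
        raise ValueError("query has no positives")
    total = 0.0
    for i in positives:
        # Soft rank: 1 plus a soft count of items scoring higher than item i.
        rank = 1.0 + sum(
            sigmoid(sims[j] - sims[i], tau_rank)
            for j in range(len(sims))
            if j != i
        )
        # Soft indicator of "rank <= k"; the 0.5 offset (an assumption)
        # centers the transition between rank k and rank k + 1.
        total += sigmoid(k + 0.5 - rank, tau_k)
    return total / len(positives)


if __name__ == "__main__":
    sims = [0.9, 0.2, 0.8, 0.1]
    is_positive = [True, False, False, True]
    print(exact_recall_at_k(sims, is_positive, k=2))
    print(smooth_recall_at_k(sims, is_positive, k=2, tau_rank=0.01, tau_k=0.1))
```

With small temperatures the smooth value approaches the hard one, while larger temperatures give smoother but less faithful gradients. In practice the same computation would be written in an automatic-differentiation framework and evaluated over a large batch.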
| Task | Dataset | Metric | Value (%) | Model |
|---|---|---|---|---|
| Image Retrieval | iNaturalist | R@1 | 83 | Recall@k Surrogate loss (ViT-B/16) |
| Image Retrieval | iNaturalist | R@5 | 92.1 | Recall@k Surrogate loss (ViT-B/16) |
| Image Retrieval | iNaturalist | R@16 | 95.9 | Recall@k Surrogate loss (ViT-B/16) |
| Image Retrieval | iNaturalist | R@32 | 97.2 | Recall@k Surrogate loss (ViT-B/16) |
| Image Retrieval | iNaturalist | R@1 | 71.8 | Recall@k Surrogate loss (ResNet-50) |
| Image Retrieval | iNaturalist | R@5 | 84.7 | Recall@k Surrogate loss (ResNet-50) |
| Image Retrieval | iNaturalist | R@16 | 91.9 | Recall@k Surrogate loss (ResNet-50) |
| Image Retrieval | iNaturalist | R@32 | 94.3 | Recall@k Surrogate loss (ResNet-50) |
| Intelligent Surveillance | VehicleID Large | Rank-1 | 94.7 | Recall@k Surrogate loss (ViT-B/16) |
| Intelligent Surveillance | VehicleID Large | Rank-5 | 97.1 | Recall@k Surrogate loss (ViT-B/16) |
| Intelligent Surveillance | VehicleID Large | Rank-1 | 93.8 | Recall@k Surrogate loss (ResNet-50) |
| Intelligent Surveillance | VehicleID Large | Rank-5 | 96.6 | Recall@k Surrogate loss (ResNet-50) |
| Intelligent Surveillance | VehicleID Medium | Rank-1 | 95.2 | Recall@k Surrogate loss (ViT-B/16) |
| Intelligent Surveillance | VehicleID Medium | Rank-5 | 97.2 | Recall@k Surrogate loss (ViT-B/16) |
| Intelligent Surveillance | VehicleID Medium | Rank-1 | 94.6 | Recall@k Surrogate loss (ResNet-50) |
| Intelligent Surveillance | VehicleID Medium | Rank-5 | 96.9 | Recall@k Surrogate loss (ResNet-50) |
| Intelligent Surveillance | VehicleID Small | Rank-1 | 96.2 | Recall@k Surrogate loss (ViT-B/16) |
| Intelligent Surveillance | VehicleID Small | Rank-5 | 98 | Recall@k Surrogate loss (ViT-B/16) |
| Intelligent Surveillance | VehicleID Small | Rank-1 | 95.7 | Recall@k Surrogate loss (ResNet-50) |
| Intelligent Surveillance | VehicleID Small | Rank-5 | 97.9 | Recall@k Surrogate loss (ResNet-50) |
| Metric Learning | CARS196 | R@1 | 89.5 | Recall@k Surrogate loss (ViT-B/16) |
| Metric Learning | CARS196 | R@1 | 88.3 | Recall@k Surrogate loss (ResNet-50) |
| Metric Learning | Stanford Online Products | R@1 | 88 | Recall@k Surrogate loss (ViT-B/16) |
| Metric Learning | Stanford Online Products | R@1 | 85.1 | Recall@k Surrogate loss (ViT-B/32) |
| Metric Learning | Stanford Online Products | R@1 | 82.7 | Recall@k Surrogate loss (ResNet-50) |
| Vehicle Re-Identification | VehicleID Large | Rank-1 | 94.7 | Recall@k Surrogate loss (ViT-B/16) |
| Vehicle Re-Identification | VehicleID Large | Rank-5 | 97.1 | Recall@k Surrogate loss (ViT-B/16) |
| Vehicle Re-Identification | VehicleID Large | Rank-1 | 93.8 | Recall@k Surrogate loss (ResNet-50) |
| Vehicle Re-Identification | VehicleID Large | Rank-5 | 96.6 | Recall@k Surrogate loss (ResNet-50) |
| Vehicle Re-Identification | VehicleID Medium | Rank-1 | 95.2 | Recall@k Surrogate loss (ViT-B/16) |
| Vehicle Re-Identification | VehicleID Medium | Rank-5 | 97.2 | Recall@k Surrogate loss (ViT-B/16) |
| Vehicle Re-Identification | VehicleID Medium | Rank-1 | 94.6 | Recall@k Surrogate loss (ResNet-50) |
| Vehicle Re-Identification | VehicleID Medium | Rank-5 | 96.9 | Recall@k Surrogate loss (ResNet-50) |
| Vehicle Re-Identification | VehicleID Small | Rank-1 | 96.2 | Recall@k Surrogate loss (ViT-B/16) |
| Vehicle Re-Identification | VehicleID Small | Rank-5 | 98 | Recall@k Surrogate loss (ViT-B/16) |
| Vehicle Re-Identification | VehicleID Small | Rank-1 | 95.7 | Recall@k Surrogate loss (ResNet-50) |
| Vehicle Re-Identification | VehicleID Small | Rank-5 | 97.9 | Recall@k Surrogate loss (ResNet-50) |