Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


EigenPlaces: Training Viewpoint Robust Models for Visual Place Recognition

Gabriele Berton, Gabriele Trivigno, Barbara Caputo, Carlo Masone

2023-08-21 · ICCV 2023 · Visual Place Recognition · Retrieval · Image Retrieval
Paper · PDF · Code (official)

Abstract

Visual Place Recognition is a task that aims to predict the place of an image (called the query) based solely on its visual features. This is typically done through image retrieval, where the query is matched to the most similar images in a large database of geotagged photos, using learned global descriptors. A major challenge in this task is recognizing places seen from different viewpoints. To overcome this limitation, we propose a new method, called EigenPlaces, to train our neural network on images from different points of view, which embeds viewpoint robustness into the learned global descriptors. The underlying idea is to cluster the training data so as to explicitly present the model with different views of the same points of interest. The selection of these points of interest is done without the need for extra supervision. We then present experiments on the most comprehensive set of datasets in the literature, finding that EigenPlaces outperforms the previous state of the art on the majority of datasets, while requiring 60% less GPU memory for training and using 50% smaller descriptors. The code and trained models for EigenPlaces are available at https://github.com/gmberton/EigenPlaces, while results with any other baseline can be computed with the codebase at https://github.com/gmberton/auto_VPR.
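The retrieval step described in the abstract — matching a query's global descriptor against a database of geotagged images — can be sketched as a nearest-neighbor search over L2-normalized descriptors. This is a minimal illustration, not the EigenPlaces training code; the descriptor dimensions and data here are toy placeholders.

```python
import numpy as np

def retrieve(query_desc, db_descs, k=5):
    """Return indices of the k database images most similar to the query.

    Descriptors are L2-normalized first, so a plain dot product
    equals cosine similarity.
    """
    q = query_desc / np.linalg.norm(query_desc)
    db = db_descs / np.linalg.norm(db_descs, axis=1, keepdims=True)
    sims = db @ q                 # cosine similarity to every database image
    return np.argsort(-sims)[:k]  # indices, most similar first

# Toy example: 4 database descriptors, one query close to database image 2.
rng = np.random.default_rng(0)
db = rng.standard_normal((4, 8))
query = db[2] + 0.01 * rng.standard_normal(8)
print(retrieve(query, db, k=2))  # database image 2 ranks first
```

In practice, large-scale systems replace the brute-force dot product with an approximate nearest-neighbor index, but the descriptor comparison itself is the same.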

Results

Task                     | Dataset                          | Metric    | Value | Model
-------------------------|----------------------------------|-----------|-------|-----------
Visual Place Recognition | AmsterTime                       | Recall@1  | 48.9  | EigenPlaces
Visual Place Recognition | San Francisco Landmark Dataset   | Recall@1  | 89.6  | EigenPlaces
Visual Place Recognition | SF-XL test v1                    | Recall@1  | 84.1  | EigenPlaces
Visual Place Recognition | Pittsburgh-250k-test             | Recall@1  | 94.1  | EigenPlaces
Visual Place Recognition | Pittsburgh-30k-test              | Recall@1  | 92.5  | EigenPlaces
Visual Place Recognition | Tokyo247                         | Recall@1  | 93    | EigenPlaces
Visual Place Recognition | SF-XL test v2                    | Recall@1  | 90.8  | EigenPlaces
Visual Place Recognition | SF-XL test v2                    | Recall@5  | 95.7  | EigenPlaces
Visual Place Recognition | SF-XL test v2                    | Recall@10 | 96.7  | EigenPlaces
Visual Place Recognition | Eynsham                          | Recall@1  | 90.7  | EigenPlaces
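The Recall@K metric reported above counts a query as correct when at least one geographically close database image appears among its top-K retrievals. A minimal sketch of that computation (the ranked lists and ground-truth positives here are toy placeholders):

```python
def recall_at_k(ranked_db_indices, positives_per_query, k):
    """Fraction of queries whose top-k retrieved database images
    contain at least one correct (geographically close) match."""
    hits = sum(
        len(set(ranked[:k]) & set(pos)) > 0
        for ranked, pos in zip(ranked_db_indices, positives_per_query)
    )
    return hits / len(ranked_db_indices)

# Toy example: 3 queries, each with a ranked retrieval list.
ranked = [[4, 1, 7], [2, 0, 5], [9, 3, 8]]
positives = [[1], [2], [6]]               # ground-truth matches per query
print(recall_at_k(ranked, positives, 1))  # only the second query hits at rank 1
print(recall_at_k(ranked, positives, 3))  # first two queries hit within top 3
```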

Related Papers

- Visual Place Recognition for Large-Scale UAV Applications (2025-07-20)
- From Roots to Rewards: Dynamic Tree Reasoning with RL (2025-07-17)
- HapticCap: A Multimodal Dataset and Task for Understanding User Experience of Vibration Haptic Signals (2025-07-17)
- A Survey of Context Engineering for Large Language Models (2025-07-17)
- MCoT-RE: Multi-Faceted Chain-of-Thought and Re-Ranking for Training-Free Zero-Shot Composed Image Retrieval (2025-07-17)
- FAR-Net: Multi-Stage Fusion Network with Enhanced Semantic Alignment and Adaptive Reconciliation for Composed Image Retrieval (2025-07-17)
- Developing Visual Augmented Q&A System using Scalable Vision Embedding Retrieval & Late Interaction Re-ranker (2025-07-16)
- Language-Guided Contrastive Audio-Visual Masked Autoencoder with Automatically Generated Audio-Visual-Text Triplets from Videos (2025-07-16)