Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Rethinking Visual Geo-localization for Large-Scale Applications

Gabriele Berton, Carlo Masone, Barbara Caputo

2022-04-05 · CVPR 2022
Tasks: Image Classification, geo-localization, Visual Place Recognition, Contrastive Learning, Image Retrieval

Abstract

Visual Geo-localization (VG) is the task of estimating the position where a given photo was taken by comparing it with a large database of images of known locations. To investigate how existing techniques would perform on a real-world city-wide VG application, we build San Francisco eXtra Large, a new dataset covering a whole city and providing a wide range of challenging cases, with a size 30x bigger than the previous largest dataset for visual geo-localization. We find that current methods fail to scale to such large datasets, therefore we design a new highly scalable training technique, called CosPlace, which casts the training as a classification problem avoiding the expensive mining needed by the commonly used contrastive learning. We achieve state-of-the-art performance on a wide range of datasets and find that CosPlace is robust to heavy domain changes. Moreover, we show that, compared to the previous state-of-the-art, CosPlace requires roughly 80% less GPU memory at train time, and it achieves better results with 8x smaller descriptors, paving the way for city-wide real-world visual geo-localization. Dataset, code and trained models are available for research purposes at https://github.com/gmberton/CosPlace.
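The abstract says only that CosPlace treats training as classification rather than mining positive and negative pairs. One way to picture this is to turn geotagged images into class labels by binning their positions into grid cells, so that a standard classifier loss can be applied. The sketch below is a minimal illustration under that assumption; the helper names, the 10 m cell size, and the plain square grid are hypothetical and are not taken from the abstract, and the partitioning used in the paper may differ.

```python
from collections import defaultdict


def cell_label(east_m, north_m, cell_size_m=10.0):
    """Map metric coordinates to a grid-cell identifier (hypothetical helper)."""
    return (int(east_m // cell_size_m), int(north_m // cell_size_m))


def build_classes(images, cell_size_m=10.0):
    """Group (image_id, east_m, north_m) records into one class per grid cell.

    Returns a pair: a dict mapping image_id to an integer class index, and a
    dict mapping each grid cell to the image ids that fall inside it.
    """
    images_by_cell = defaultdict(list)
    for image_id, east_m, north_m in images:
        images_by_cell[cell_label(east_m, north_m, cell_size_m)].append(image_id)

    class_of = {}
    # Sort the cells so that class indices are deterministic across runs.
    for class_index, cell in enumerate(sorted(images_by_cell)):
        for image_id in images_by_cell[cell]:
            class_of[image_id] = class_index
    return class_of, dict(images_by_cell)


# Illustrative records: (image_id, easting in metres, northing in metres).
sample_images = [("a", 3.0, 4.0), ("b", 7.5, 1.0), ("c", 12.0, 4.0)]
class_of, images_by_cell = build_classes(sample_images)
```

With labels assigned this way, each training image already has a target class, so no search for hard positives or negatives is needed before a training step.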

Results

Task | Dataset | Metric | Value | Model
Visual Place Recognition | Nardo-Air R | Recall@1 | 91.55 | CosPlace
Visual Place Recognition | Oxford RobotCar Dataset | Recall@1 | 91.1 | CosPlace
Visual Place Recognition | SF-XL test v1 | Recall@1 | 64.7 | CosPlace
Visual Place Recognition | SF-XL test v1 | Recall@10 | 76.6 | CosPlace
Visual Place Recognition | SF-XL test v1 | Recall@5 | 73.3 | CosPlace
Visual Place Recognition | Mid-Atlantic Ridge | Recall@1 | 20.79 | CosPlace
Visual Place Recognition | St Lucia | Recall@1 | 99.59 | CosPlace
Visual Place Recognition | St Lucia | Recall@5 | 99.9 | CosPlace
Visual Place Recognition | Pittsburgh-250k-test | Recall@1 | 91.5 | CosPlace
Visual Place Recognition | Pittsburgh-250k-test | Recall@10 | 97.9 | CosPlace
Visual Place Recognition | Pittsburgh-250k-test | Recall@5 | 96.9 | CosPlace
Visual Place Recognition | Hawkins | Recall@1 | 31.36 | CosPlace
Visual Place Recognition | Laurel Caverns | Recall@1 | 24.11 | CosPlace
Visual Place Recognition | Gardens Point | Recall@1 | 74 | CosPlace
Visual Place Recognition | Pittsburgh-30k-test | Recall@1 | 90.45 | CosPlace
Visual Place Recognition | Pittsburgh-30k-test | Recall@1 | 90.4 | CosPlace (ResNet-101 2048-D)
Visual Place Recognition | Pittsburgh-30k-test | Recall@5 | 95.7 | CosPlace (ResNet-101 2048-D)
Visual Place Recognition | Tokyo247 | Recall@1 | 82.2 | CosPlace
Visual Place Recognition | Tokyo247 | Recall@10 | 96.5 | CosPlace (ResNet-101 2048-D)
Visual Place Recognition | Tokyo247 | Recall@5 | 95.9 | CosPlace (ResNet-101 2048-D)
Visual Place Recognition | SF-XL test v2 | Recall@1 | 83.4 | CosPlace
Visual Place Recognition | SF-XL test v2 | Recall@10 | 94.1 | CosPlace
Visual Place Recognition | SF-XL test v2 | Recall@5 | 91.6 | CosPlace
Visual Place Recognition | VP-Air | Recall@1 | 8.12 | CosPlace
Visual Place Recognition | Mapillary val | Recall@1 | 86.7 | CosPlace (ResNet-101 2048-D)
Visual Place Recognition | Mapillary val | Recall@10 | 93.4 | CosPlace (ResNet-101 2048-D)
Visual Place Recognition | Mapillary val | Recall@5 | 92.1 | CosPlace (ResNet-101 2048-D)
Visual Place Recognition | Mapillary val | Recall@10 | 91.8 | CosPlace
Visual Place Recognition | Mapillary val | Recall@5 | 89.9 | CosPlace
Visual Place Recognition | 17 Places | Recall@1 | 61.08 | CosPlace
Visual Place Recognition | MSLS | Recall@1 | 79.6 | CosPlace
Visual Place Recognition | Baidu Mall | Recall@1 | 41.62 | CosPlace
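
The Recall@N values above are reported without a definition on this page. In visual place recognition the metric is commonly taken to be the percentage of queries for which at least one of the top N retrieved database images is a correct match. The sketch below assumes that definition; the function name and the dictionary-based inputs are illustrative, and the distance threshold that decides which database images count as correct is left to the caller.

```python
def recall_at_n(ranked_results, positives, n):
    """Percentage of queries with at least one correct match in the top n.

    ranked_results: dict mapping query id to a list of database ids, best first.
    positives: dict mapping query id to the set of database ids that count as
        correct matches for that query.
    """
    if not ranked_results:
        raise ValueError("ranked_results must contain at least one query")
    hits = 0
    for query_id, ranking in ranked_results.items():
        # A query counts as a hit if any of its top-n results is a positive.
        if any(db_id in positives[query_id] for db_id in ranking[:n]):
            hits += 1
    return 100.0 * hits / len(ranked_results)


# Illustrative data: q1 is matched at rank 2, q2 is never matched.
example_ranked = {"q1": ["d3", "d1", "d2"], "q2": ["d5", "d4", "d6"]}
example_positives = {"q1": {"d1"}, "q2": {"d9"}}
recall_1 = recall_at_n(example_ranked, example_positives, 1)
recall_2 = recall_at_n(example_ranked, example_positives, 2)
```

Under this definition Recall@N cannot decrease as N grows, which is consistent with the Recall@1, Recall@5, and Recall@10 values listed for each dataset.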
