Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

NetVLAD: CNN architecture for weakly supervised place recognition

Relja Arandjelović, Petr Gronat, Akihiko Torii, Tomas Pajdla, Josef Sivic

Date: 2015-11-23
Venue: CVPR 2016 6
Tasks: Visual Place Recognition, Retrieval, Image Retrieval

Abstract

We tackle the problem of large scale visual place recognition, where the task is to quickly and accurately recognize the location of a given query photograph. We present the following three principal contributions. First, we develop a convolutional neural network (CNN) architecture that is trainable in an end-to-end manner directly for the place recognition task. The main component of this architecture, NetVLAD, is a new generalized VLAD layer, inspired by the "Vector of Locally Aggregated Descriptors" image representation commonly used in image retrieval. The layer is readily pluggable into any CNN architecture and amenable to training via backpropagation. Second, we develop a training procedure, based on a new weakly supervised ranking loss, to learn parameters of the architecture in an end-to-end manner from images depicting the same places over time downloaded from Google Street View Time Machine. Finally, we show that the proposed architecture significantly outperforms non-learnt image representations and off-the-shelf CNN descriptors on two challenging place recognition benchmarks, and improves over current state-of-the-art compact image representations on standard image retrieval benchmarks.
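The abstract describes NetVLAD as a generalized VLAD layer but does not give its formula. As a rough illustration only, the sketch below aggregates local descriptors with a soft (softmax) assignment to cluster centers, sums the weighted residuals per cluster, and applies L2 normalization per cluster and over the whole vector. This follows the commonly described soft-assignment VLAD formulation and is an assumption, not the authors' implementation; the names `netvlad_aggregate`, `centers`, `weights`, and `biases` are hypothetical, and in a real network these parameters would be learned by backpropagation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit L2 norm; an all-zero vector stays zero."""
    norm = max(math.sqrt(sum(v * v for v in vec)), eps)
    return [v / norm for v in vec]


def netvlad_aggregate(descriptors, centers, weights, biases):
    """Aggregate N local D-dim descriptors into one (K*D)-dim vector.

    descriptors: list of N lists of length D (e.g. CNN feature-map columns)
    centers:     list of K cluster centers, each of length D
    weights:     list of K weight vectors for the soft assignment (assumed)
    biases:      list of K biases for the soft assignment (assumed)
    """
    num_clusters = len(centers)
    dim = len(centers[0])
    vlad = [[0.0] * dim for _ in range(num_clusters)]

    for x in descriptors:
        # Soft assignment of this descriptor to every cluster.
        scores = [
            sum(w_j * x_j for w_j, x_j in zip(weights[k], x)) + biases[k]
            for k in range(num_clusters)
        ]
        assign = softmax(scores)
        # Accumulate assignment-weighted residuals to each center.
        for k in range(num_clusters):
            for j in range(dim):
                vlad[k][j] += assign[k] * (x[j] - centers[k][j])

    # Normalize each cluster's residual sum, then the flattened vector.
    vlad = [l2_normalize(row) for row in vlad]
    flat = [v for row in vlad for v in row]
    return l2_normalize(flat)


if __name__ == "__main__":
    descriptors = [[1.0, 0.0], [0.0, 1.0]]
    centers = [[1.0, 0.0], [0.0, 1.0]]
    weights = [[5.0, 0.0], [0.0, 5.0]]
    biases = [0.0, 0.0]
    print(netvlad_aggregate(descriptors, centers, weights, biases))
```

Because every step is differentiable, a layer of this shape can sit on top of a CNN's last convolutional feature map and be trained with the rest of the network, which is the property the abstract emphasizes.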

Results

Task                      Dataset                  Metric    Value  Model
Visual Place Recognition  Nardo-Air R              Recall@1  60.56  NetVLAD
Visual Place Recognition  Oxford RobotCar Dataset  Recall@1  52.88  NetVLAD
Visual Place Recognition  Nardo-Air                Recall@1  19.72  NetVLAD
Visual Place Recognition  Mid-Atlantic Ridge       Recall@1  25.74  NetVLAD
Visual Place Recognition  St Lucia                 Recall@1  57.92  NetVLAD
Visual Place Recognition  Hawkins                  Recall@1  34.75  NetVLAD
Visual Place Recognition  Laurel Caverns           Recall@1  39.29  NetVLAD
Visual Place Recognition  Berlin Kudamm            Recall@1  38.21  NetVLAD
Visual Place Recognition  Gardens Point            Recall@1  58.5   NetVLAD
Visual Place Recognition  Pittsburgh-30k-test      Recall@1  86.08  NetVLAD
Visual Place Recognition  VP-Air                   Recall@1  6.39   NetVLAD
Visual Place Recognition  17 Places                Recall@1  61.58  NetVLAD
Visual Place Recognition  Baidu Mall               Recall@1  53.1   NetVLAD
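
The metric reported above, Recall@1, is usually computed in place recognition as the percentage of queries for which at least one of the top K retrieved database images is a true match (here K = 1). The sketch below shows that computation under this assumed definition; the function name `recall_at_k` and the toy data are hypothetical, and each benchmark's own rule for what counts as a true match (for example, a distance threshold) is not stated on this page.

```python
def recall_at_k(ranked_ids, positives, k=1):
    """Percentage of queries with at least one true match in the top k.

    ranked_ids: per query, database ids sorted from best to worst match
    positives:  per query, the set of database ids that count as correct
    """
    if not ranked_ids:
        raise ValueError("at least one query is required")
    if len(ranked_ids) != len(positives):
        raise ValueError("ranked_ids and positives must have equal length")

    hits = 0
    for ranking, correct in zip(ranked_ids, positives):
        # A query is a hit if any of its top-k results is a true match.
        if any(db_id in correct for db_id in ranking[:k]):
            hits += 1
    return 100.0 * hits / len(ranked_ids)


if __name__ == "__main__":
    # Toy example: four queries ranked against a small database.
    rankings = [[3, 1, 2], [0, 2, 1], [2, 0, 1], [1, 3, 0]]
    truth = [{3}, {2}, {2, 5}, {0}]
    print(recall_at_k(rankings, truth, k=1))  # 50.0
    print(recall_at_k(rankings, truth, k=2))  # 75.0
```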

Related Papers

Visual Place Recognition for Large-Scale UAV Applications (2025-07-20)
From Roots to Rewards: Dynamic Tree Reasoning with RL (2025-07-17)
HapticCap: A Multimodal Dataset and Task for Understanding User Experience of Vibration Haptic Signals (2025-07-17)
A Survey of Context Engineering for Large Language Models (2025-07-17)
MCoT-RE: Multi-Faceted Chain-of-Thought and Re-Ranking for Training-Free Zero-Shot Composed Image Retrieval (2025-07-17)
FAR-Net: Multi-Stage Fusion Network with Enhanced Semantic Alignment and Adaptive Reconciliation for Composed Image Retrieval (2025-07-17)
Developing Visual Augmented Q&A System using Scalable Vision Embedding Retrieval & Late Interaction Re-ranker (2025-07-16)
Language-Guided Contrastive Audio-Visual Masked Autoencoder with Automatically Generated Audio-Visual-Text Triplets from Videos (2025-07-16)