NetVLAD: CNN architecture for weakly supervised place recognition

Relja Arandjelović, Petr Gronat, Akihiko Torii, Tomas Pajdla, Josef Sivic

2015-11-23 | CVPR 2016 | Visual Place Recognition, Retrieval, Image Retrieval


Abstract

We tackle the problem of large scale visual place recognition, where the task is to quickly and accurately recognize the location of a given query photograph. We present the following three principal contributions. First, we develop a convolutional neural network (CNN) architecture that is trainable in an end-to-end manner directly for the place recognition task. The main component of this architecture, NetVLAD, is a new generalized VLAD layer, inspired by the "Vector of Locally Aggregated Descriptors" image representation commonly used in image retrieval. The layer is readily pluggable into any CNN architecture and amenable to training via backpropagation. Second, we develop a training procedure, based on a new weakly supervised ranking loss, to learn parameters of the architecture in an end-to-end manner from images depicting the same places over time downloaded from Google Street View Time Machine. Finally, we show that the proposed architecture significantly outperforms non-learnt image representations and off-the-shelf CNN descriptors on two challenging place recognition benchmarks, and improves over current state-of-the-art compact image representations on standard image retrieval benchmarks.
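
The abstract does not spell out the layer's equations, so the following is only a minimal NumPy sketch of the forward pass of a soft-assignment VLAD pooling layer of the kind described above. It assumes the common formulation in which each local descriptor is softly assigned to K cluster centres through a linear map followed by a softmax, the weighted residuals to each centre are summed, and the result is normalised per cluster and then as a whole. The function name `netvlad_aggregate`, the parameter names, and the initialisation in the usage lines are illustrative assumptions, not the authors' code.

```python
import numpy as np

def netvlad_aggregate(x, centers, w, b):
    """Soft-assignment VLAD pooling (illustrative sketch, forward pass only).

    x:       (N, D) local descriptors, e.g. one per spatial location of a CNN feature map
    centers: (K, D) cluster centres
    w, b:    (K, D) weights and (K,) biases of the soft-assignment (assumed parameterisation)
    Returns a vector of length K * D with unit L2 norm.
    """
    logits = x @ w.T + b                                  # (N, K) assignment scores
    logits -= logits.max(axis=1, keepdims=True)           # stabilise the softmax numerically
    a = np.exp(logits)
    a /= a.sum(axis=1, keepdims=True)                     # each row sums to 1 over the K clusters

    residuals = x[:, None, :] - centers[None, :, :]       # (N, K, D) descriptor minus centre
    v = (a[:, :, None] * residuals).sum(axis=0)           # (K, D) weighted residual sums

    v /= np.linalg.norm(v, axis=1, keepdims=True) + 1e-12 # per-cluster (intra) normalisation
    v = v.ravel()
    return v / (np.linalg.norm(v) + 1e-12)                # final L2 normalisation

# Usage with random data; alpha and the derived w, b are assumed values for illustration.
rng = np.random.default_rng(0)
x = rng.normal(size=(50, 8))           # 50 local descriptors of dimension 8
centers = rng.normal(size=(4, 8))      # 4 cluster centres
alpha = 1.0
w = 2.0 * alpha * centers
b = -alpha * (centers ** 2).sum(axis=1)
descriptor = netvlad_aggregate(x, centers, w, b)
print(descriptor.shape)
```

Every step here is differentiable with respect to `centers`, `w`, and `b`, which is what would make such a layer trainable by backpropagation; an actual implementation would express the same computation in a deep learning framework rather than NumPy.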

Results

Task	Dataset	Metric	Value	Model
Visual Place Recognition	Nardo-Air R	Recall@1	60.56	NetVLAD
Visual Place Recognition	Oxford RobotCar Dataset	Recall@1	52.88	NetVLAD
Visual Place Recognition	Nardo-Air	Recall@1	19.72	NetVLAD
Visual Place Recognition	Mid-Atlantic Ridge	Recall@1	25.74	NetVLAD
Visual Place Recognition	St Lucia	Recall@1	57.92	NetVLAD
Visual Place Recognition	Hawkins	Recall@1	34.75	NetVLAD
Visual Place Recognition	Laurel Caverns	Recall@1	39.29	NetVLAD
Visual Place Recognition	Berlin Kudamm	Recall@1	38.21	NetVLAD
Visual Place Recognition	Gardens Point	Recall@1	58.5	NetVLAD
Visual Place Recognition	Pittsburgh-30k-test	Recall@1	86.08	NetVLAD
Visual Place Recognition	VP-Air	Recall@1	6.39	NetVLAD
Visual Place Recognition	17 Places	Recall@1	61.58	NetVLAD
Visual Place Recognition	Baidu Mall	Recall@1	53.1	NetVLAD
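
The table reports Recall@1 without defining it. Under the usual convention for place recognition benchmarks, a query counts as correctly localised if at least one of its top N retrieved database images is a true match, and Recall@N is the percentage of such queries. The sketch below assumes that convention; the function name `recall_at_n` and the `positives` argument (the set of correct database indices per query, which each benchmark defines in its own way) are hypothetical.

```python
import numpy as np

def recall_at_n(query_desc, db_desc, positives, n=1):
    """Percentage of queries with at least one true match among the n nearest database images.

    query_desc: (Q, D) query descriptors
    db_desc:    (M, D) database descriptors
    positives:  list of Q collections of database indices that count as correct
    """
    # Squared Euclidean distance between every query and every database descriptor: (Q, M)
    d2 = ((query_desc[:, None, :] - db_desc[None, :, :]) ** 2).sum(axis=2)
    top_n = np.argsort(d2, axis=1)[:, :n]                 # indices of the n nearest images
    hits = [bool(set(row.tolist()) & set(pos)) for row, pos in zip(top_n, positives)]
    return 100.0 * float(np.mean(hits))

# Toy example: three database images, three queries.
db = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
queries = np.array([[0.1, 0.0], [0.9, 0.1], [0.4, 0.45]])
positives = [[0], [2], [2]]
r1 = recall_at_n(queries, db, positives, n=1)
r2 = recall_at_n(queries, db, positives, n=2)
```

In this toy example only the first query has a correct nearest neighbour, so `r1` is one third of the queries; allowing two candidates also recovers the third query, so `r2` is two thirds.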
