Focus on Local: Finding Reliable Discriminative Regions for Visual Place Recognition

Changwei Wang, Shunpeng Chen, Yukun Song, Rongtao Xu, Zherui Zhang, Jiguang Zhang, Haoran Yang, Yu Zhang, Kexue Fu, Shide Du, Zhiwei Xu, Longxiang Gao, Li Guo, Shibiao Xu

2025-04-14

Visual Place Recognition, Re-Ranking, Retrieval, Image Retrieval


Abstract

Visual Place Recognition (VPR) aims to predict the location of a query image by referencing a database of geotagged images. In VPR, a small number of discriminative local regions in an image often carry most of the useful signal, while mundane background regions contribute little or even cause perceptual aliasing because they readily overlap across places. However, existing methods lack precise modeling and full exploitation of these discriminative regions. In this paper, we propose the Focus on Local (FoL) approach, which improves both image retrieval and re-ranking in VPR by mining and exploiting reliable discriminative local regions in images and by introducing pseudo-correlation supervision. First, we design two losses, the Extraction-Aggregation Spatial Alignment Loss (SAL) and the Foreground-Background Contrast Enhancement Loss (CEL), to explicitly model reliable discriminative local regions and use them to guide the generation of global representations and efficient re-ranking. Second, we introduce a weakly supervised local feature training strategy based on pseudo-correspondences obtained from aggregating global features, which alleviates the lack of ground-truth local correspondences for the VPR task. Third, we propose an efficient and precise re-ranking pipeline guided by the discriminative regions. Finally, experimental results show that FoL achieves state-of-the-art performance on multiple VPR benchmarks in both the image retrieval and re-ranking stages, and also significantly outperforms existing two-stage VPR methods in computational efficiency. Code and models are available at https://github.com/chenshunpeng/FoL
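To make the two-stage setting in the abstract concrete, the following is a minimal, generic sketch of retrieve-then-re-rank in NumPy. It is not the FoL implementation: the function names, the cosine-similarity retrieval, and the mutual-nearest-neighbour match count used as the re-ranking score are all illustrative assumptions, and the sketch does not include FoL's discriminative-region guidance or its losses.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def retrieve_top_k(query_global, db_global, k):
    """Stage 1 (assumed): rank database images by global-descriptor cosine similarity."""
    sims = l2_normalize(db_global) @ l2_normalize(query_global)
    return np.argsort(-sims)[:k]

def count_mutual_matches(query_local, cand_local):
    """Count mutual nearest-neighbour pairs between two sets of local descriptors."""
    sims = l2_normalize(query_local) @ l2_normalize(cand_local).T
    nn_q2c = sims.argmax(axis=1)  # best candidate feature for each query feature
    nn_c2q = sims.argmax(axis=0)  # best query feature for each candidate feature
    # A pair is mutual when the candidate's best query feature points back to it.
    return int(np.sum(nn_c2q[nn_q2c] == np.arange(len(query_local))))

def rerank(query_local, db_local, candidates):
    """Stage 2 (assumed): reorder the shortlist by the number of mutual local matches."""
    scores = [count_mutual_matches(query_local, db_local[i]) for i in candidates]
    order = np.argsort(-np.asarray(scores), kind="stable")
    return [int(candidates[i]) for i in order]
```

Only the shortlist returned by `retrieve_top_k` is passed to `rerank`, so the cost of local matching grows with the shortlist size rather than with the size of the database.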

Results

Task	Dataset	Metric	Value	Model
Visual Place Recognition	SVOX-Snow	Recall@1	99.3	FoL
Visual Place Recognition	SVOX-Snow	Recall@10	99.9	FoL
Visual Place Recognition	SVOX-Snow	Recall@5	99.8	FoL
Visual Place Recognition	SVOX-Snow	Recall@1	99.1	FoL-global
Visual Place Recognition	SVOX-Snow	Recall@10	99.8	FoL-global
Visual Place Recognition	SVOX-Snow	Recall@5	99.7	FoL-global
Visual Place Recognition	AmsterTime	Recall@1	70.1	FoL
Visual Place Recognition	AmsterTime	Recall@10	90	FoL
Visual Place Recognition	AmsterTime	Recall@5	91.8	FoL
Visual Place Recognition	AmsterTime	Recall@1	64.6	FoL-global
Visual Place Recognition	AmsterTime	Recall@10	84.3	FoL-global
Visual Place Recognition	AmsterTime	Recall@5	88.2	FoL-global
Visual Place Recognition	Nordland	Recall@1	92.6	FoL
Visual Place Recognition	Nordland	Recall@10	98	FoL
Visual Place Recognition	Nordland	Recall@5	96.9	FoL
Visual Place Recognition	Nordland	Recall@1	87.8	FoL-global
Visual Place Recognition	Nordland	Recall@10	96.4	FoL-global
Visual Place Recognition	Nordland	Recall@5	94.5	FoL-global
Visual Place Recognition	SVOX-Night	Recall@1	98.8	FoL
Visual Place Recognition	SVOX-Night	Recall@10	99.9	FoL
Visual Place Recognition	SVOX-Night	Recall@5	99.8	FoL
Visual Place Recognition	SVOX-Night	Recall@1	98.3	FoL-global
Visual Place Recognition	SVOX-Night	Recall@10	99.6	FoL-global
Visual Place Recognition	SVOX-Night	Recall@5	99.6	FoL-global
Visual Place Recognition	SF-XL Occlusion	Recall@1	61.8	FoL
Visual Place Recognition	SF-XL Occlusion	Recall@10	77.6	FoL
Visual Place Recognition	SF-XL Occlusion	Recall@5	77.6	FoL
Visual Place Recognition	SF-XL Occlusion	Recall@1	51.3	FoL-global
Visual Place Recognition	SF-XL Occlusion	Recall@10	73.7	FoL-global
Visual Place Recognition	SF-XL Occlusion	Recall@5	65.8	FoL-global
Visual Place Recognition	SF-XL Night	Recall@1	60.5	FoL
Visual Place Recognition	SF-XL Night	Recall@10	75.8	FoL
Visual Place Recognition	SF-XL Night	Recall@5	72.8	FoL
Visual Place Recognition	SF-XL Night	Recall@1	53.4	FoL-global
Visual Place Recognition	SF-XL Night	Recall@10	71.7	FoL-global
Visual Place Recognition	SF-XL Night	Recall@5	65.9	FoL-global
Visual Place Recognition	St Lucia	Recall@1	99.9	FoL-global
Visual Place Recognition	St Lucia	Recall@10	100	FoL-global
Visual Place Recognition	St Lucia	Recall@5	100	FoL-global
Visual Place Recognition	St Lucia	Recall@1	99.9	FoL
Visual Place Recognition	St Lucia	Recall@10	100	FoL
Visual Place Recognition	St Lucia	Recall@5	100	FoL
Visual Place Recognition	Pittsburgh-250k-test	Recall@1	97	FoL
Visual Place Recognition	Pittsburgh-250k-test	Recall@10	99.2	FoL
Visual Place Recognition	Pittsburgh-250k-test	Recall@5	99.5	FoL
Visual Place Recognition	Pittsburgh-250k-test	Recall@1	96.5	FoL-global
Visual Place Recognition	Pittsburgh-250k-test	Recall@10	99.1	FoL-global
Visual Place Recognition	Pittsburgh-250k-test	Recall@5	99.5	FoL-global
Visual Place Recognition	SVOX	Recall@1	98.9	FoL
Visual Place Recognition	SVOX	Recall@10	99.7	FoL
Visual Place Recognition	SVOX	Recall@5	99.6	FoL
Visual Place Recognition	SVOX	Recall@1	98.4	FoL-global
Visual Place Recognition	SVOX	Recall@10	99.6	FoL-global
Visual Place Recognition	SVOX	Recall@5	99.4	FoL-global
Visual Place Recognition	SPED	Recall@1	92.1	FoL-global
Visual Place Recognition	SPED	Recall@10	98	FoL-global
Visual Place Recognition	SPED	Recall@5	96.5	FoL-global
Visual Place Recognition	SPED	Recall@1	91.8	FoL
Visual Place Recognition	SPED	Recall@10	97.4	FoL
Visual Place Recognition	SPED	Recall@5	96.5	FoL
Visual Place Recognition	Pittsburgh-30k-test	Recall@1	94.5	FoL
Visual Place Recognition	Pittsburgh-30k-test	Recall@10	98.2	FoL
Visual Place Recognition	Pittsburgh-30k-test	Recall@5	97.4	FoL
Visual Place Recognition	Pittsburgh-30k-test	Recall@1	93.9	FoL-global
Visual Place Recognition	Pittsburgh-30k-test	Recall@10	98.1	FoL-global
Visual Place Recognition	Pittsburgh-30k-test	Recall@5	97.2	FoL-global
Visual Place Recognition	Tokyo247	Recall@1	98.4	FoL
Visual Place Recognition	Tokyo247	Recall@10	99.4	FoL
Visual Place Recognition	Tokyo247	Recall@5	99.1	FoL
Visual Place Recognition	Tokyo247	Recall@1	96.2	FoL-global
Visual Place Recognition	Tokyo247	Recall@10	98.7	FoL-global
Visual Place Recognition	Tokyo247	Recall@5	98.7	FoL-global
Visual Place Recognition	Mapillary val	Recall@1	93.5	FoL
Visual Place Recognition	Mapillary val	Recall@10	97.6	FoL
Visual Place Recognition	Mapillary val	Recall@5	96.9	FoL
Visual Place Recognition	Mapillary val	Recall@1	93.1	FoL-global
Visual Place Recognition	Mapillary val	Recall@10	97.4	FoL-global
Visual Place Recognition	Mapillary val	Recall@5	96.9	FoL-global
Visual Place Recognition	SVOX-Rain	Recall@1	98.2	FoL
Visual Place Recognition	SVOX-Rain	Recall@1	96.5	FoL-global
Visual Place Recognition	Mapillary test	Recall@1	80	FoL
Visual Place Recognition	Mapillary test	Recall@10	93	FoL
Visual Place Recognition	Mapillary test	Recall@5	90.9	FoL
Visual Place Recognition	Mapillary test	Recall@1	78.7	FoL-global
Visual Place Recognition	Mapillary test	Recall@10	93	FoL-global
Visual Place Recognition	Mapillary test	Recall@5	90.8	FoL-global
Visual Place Recognition	Eynsham	Recall@1	92.4	FoL
Visual Place Recognition	Eynsham	Recall@10	95.8	FoL
Visual Place Recognition	Eynsham	Recall@5	96.6	FoL
Visual Place Recognition	Eynsham	Recall@1	91.7	FoL-global
Visual Place Recognition	Eynsham	Recall@10	95.3	FoL-global
Visual Place Recognition	Eynsham	Recall@5	96.2	FoL-global
Visual Place Recognition	SVOX-Overcast	Recall@1	98.2	FoL
Visual Place Recognition	SVOX-Overcast	Recall@10	99.7	FoL
Visual Place Recognition	SVOX-Overcast	Recall@5	99.3	FoL
Visual Place Recognition	SVOX-Overcast	Recall@1	97.9	FoL-global
Visual Place Recognition	SVOX-Overcast	Recall@10	99.3	FoL-global
Visual Place Recognition	SVOX-Overcast	Recall@5	99.2	FoL-global
Visual Place Recognition	SVOX-Sun	Recall@1	98.8	FoL
Visual Place Recognition	SVOX-Sun	Recall@10	99.9	FoL
Visual Place Recognition	SVOX-Sun	Recall@5	99.8	FoL
Visual Place Recognition	SVOX-Sun	Recall@1	98.1	FoL-global
Visual Place Recognition	SVOX-Sun	Recall@10	99.5	FoL-global
Visual Place Recognition	SVOX-Sun	Recall@5	99.4	FoL-global
Visual Place Recognition	Nordland* (2760 queries)	Recall@1	85.5	FoL
Visual Place Recognition	Nordland* (2760 queries)	Recall@10	96.5	FoL
Visual Place Recognition	Nordland* (2760 queries)	Recall@5	94.6	FoL
Visual Place Recognition	Nordland* (2760 queries)	Recall@1	78.3	FoL-global
Visual Place Recognition	Nordland* (2760 queries)	Recall@10	94	FoL-global
Visual Place Recognition	Nordland* (2760 queries)	Recall@5	90.8	FoL-global
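For readers unfamiliar with the metric, Recall@K in VPR is commonly reported as the percentage of queries for which at least one of the top K retrieved database images is a correct match. The sketch below computes it under that assumption. The function name and the input layout (one ranked list of database indices per query, plus one set of ground-truth positive indices per query) are hypothetical, and how positives are defined (for example, by a distance threshold) is dataset-specific and not stated on this page.

```python
def recall_at_k(ranked_predictions, positives, ks=(1, 5, 10)):
    """Return {K: percentage of queries with a correct match in the top K}."""
    hits = {k: 0 for k in ks}
    for preds, pos in zip(ranked_predictions, positives):
        pos = set(pos)
        for k in ks:
            # A query counts as a hit at K if any of its top-K results is a positive.
            if any(p in pos for p in preds[:k]):
                hits[k] += 1
    n = len(ranked_predictions)
    return {k: 100.0 * hits[k] / n for k in ks}
```

Under this definition, Recall@K cannot decrease as K grows, because the top-10 list always contains the top-5 list.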
