
BoQ: A Place is Worth a Bag of Learnable Queries

Amar Ali-bey, Brahim Chaib-Draa, Philippe Giguère

2024-05-12 | CVPR 2024 | Image Similarity Search, Visual Place Recognition, Retrieval

Abstract

In visual place recognition, accurately identifying and matching images of locations under varying environmental conditions and viewpoints remains a significant challenge. In this paper, we introduce a new technique, called Bag-of-Queries (BoQ), which learns a set of global queries designed to capture universal place-specific attributes. Unlike existing methods that employ self-attention and generate the queries directly from the input features, BoQ employs distinct learnable global queries, which probe the input features via cross-attention, ensuring consistent information aggregation. In addition, our technique provides an interpretable attention mechanism and integrates with both CNN and Vision Transformer backbones. The performance of BoQ is demonstrated through extensive experiments on 14 large-scale benchmarks. It consistently outperforms current state-of-the-art techniques including NetVLAD, MixVPR and EigenPlaces. Moreover, as a global retrieval technique (one-stage), BoQ surpasses two-stage retrieval methods, such as Patch-NetVLAD, TransVPR and R2Former, all while being orders of magnitude faster and more efficient. The code and model weights are publicly available at https://github.com/amaralibey/Bag-of-Queries.
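
The aggregation idea described in the abstract, a fixed set of learned queries that probe backbone features through cross-attention, can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the repository linked above for that). It assumes a single attention head and a single block, uses random values in place of trained parameters, and the names `cross_attend`, `w_q`, `w_k`, `w_v`, as well as the final flatten and L2-normalization step, are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(queries, features, w_q, w_k, w_v):
    """Single-head cross-attention (hypothetical helper, not the BoQ API).

    queries:  (num_queries, dim) learned parameters, independent of the input image
    features: (num_tokens, dim) local features produced by a backbone
    Returns the aggregated outputs and the attention weights.
    """
    q = queries @ w_q                      # queries come from learned parameters
    k = features @ w_k                     # keys come from the image features
    v = features @ w_v                     # values come from the image features
    scores = q @ k.T / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)        # (num_queries, num_tokens)
    return attn @ v, attn

# Random stand-ins: in a trained model these would be learned weights
# and real backbone features (assumption for illustration only).
rng = np.random.default_rng(0)
dim, num_queries, num_tokens = 8, 4, 16
queries = rng.normal(size=(num_queries, dim))
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) / np.sqrt(dim) for _ in range(3))
features = rng.normal(size=(num_tokens, dim))

out, attn = cross_attend(queries, features, w_q, w_k, w_v)

# Assumed post-processing: flatten the per-query outputs into one global
# descriptor and L2-normalize it so it can be compared by dot product.
descriptor = out.reshape(-1)
descriptor = descriptor / np.linalg.norm(descriptor)
```

Because the queries do not depend on the input, each row of `attn` shows which image tokens a given query attends to, which is one way to read the interpretability claim in the abstract.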

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Visual Place Recognition | SVOX-Snow | Recall@1 | 98.7 | BoQ (ResNet-50) |
| Visual Place Recognition | AmsterTime | Recall@1 | 63 | BoQ |
| Visual Place Recognition | AmsterTime | Recall@10 | 85.1 | BoQ |
| Visual Place Recognition | AmsterTime | Recall@5 | 81.6 | BoQ |
| Visual Place Recognition | AmsterTime | Recall@1 | 52.2 | BoQ (ResNet-50) |
| Visual Place Recognition | Nordland | Recall@1 | 90.6 | BoQ |
| Visual Place Recognition | Nordland | Recall@10 | 97.5 | BoQ |
| Visual Place Recognition | Nordland | Recall@5 | 96 | BoQ |
| Visual Place Recognition | Nordland | Recall@1 | 83.1 | BoQ (ResNet-50) |
| Visual Place Recognition | San Francisco Landmark Dataset | Recall@1 | 93.6 | BoQ |
| Visual Place Recognition | San Francisco Landmark Dataset | Recall@10 | 96.5 | BoQ |
| Visual Place Recognition | San Francisco Landmark Dataset | Recall@5 | 95.8 | BoQ |
| Visual Place Recognition | SVOX-Night | Recall@1 | 87.1 | BoQ (ResNet-50) |
| Visual Place Recognition | St Lucia | Recall@1 | 100 | BoQ (DINOv2) |
| Visual Place Recognition | St Lucia | Recall@5 | 100 | BoQ (DINOv2) |
| Visual Place Recognition | St Lucia | Recall@10 | 100 | BoQ |
| Visual Place Recognition | St Lucia | Recall@5 | 100 | BoQ |
| Visual Place Recognition | Pittsburgh-250k-test | Recall@1 | 96.6 | BoQ |
| Visual Place Recognition | Pittsburgh-250k-test | Recall@10 | 99.5 | BoQ |
| Visual Place Recognition | Pittsburgh-250k-test | Recall@5 | 99.1 | BoQ |
| Visual Place Recognition | Pittsburgh-250k-test | Recall@1 | 95 | BoQ (ResNet-50) |
| Visual Place Recognition | Pittsburgh-250k-test | Recall@10 | 99.1 | BoQ (ResNet-50) |
| Visual Place Recognition | Pittsburgh-250k-test | Recall@5 | 98.5 | BoQ (ResNet-50) |
| Visual Place Recognition | SPED | Recall@1 | 92.5 | BoQ |
| Visual Place Recognition | SPED | Recall@10 | 96.7 | BoQ |
| Visual Place Recognition | SPED | Recall@5 | 95.9 | BoQ |
| Visual Place Recognition | SPED | Recall@1 | 86.5 | BoQ (ResNet-50) |
| Visual Place Recognition | SPED | Recall@10 | 95.7 | BoQ (ResNet-50) |
| Visual Place Recognition | SPED | Recall@5 | 93.4 | BoQ (ResNet-50) |
| Visual Place Recognition | Pittsburgh-30k-test | Recall@1 | 93.7 | BoQ |
| Visual Place Recognition | Pittsburgh-30k-test | Recall@10 | 97.9 | BoQ |
| Visual Place Recognition | Pittsburgh-30k-test | Recall@5 | 97.1 | BoQ |
| Visual Place Recognition | Pittsburgh-30k-test | Recall@1 | 92.4 | BoQ (ResNet-50) |
| Visual Place Recognition | Tokyo247 | Recall@1 | 98.1 | BoQ |
| Visual Place Recognition | Tokyo247 | Recall@10 | 98.7 | BoQ |
| Visual Place Recognition | Tokyo247 | Recall@5 | 98.1 | BoQ |
| Visual Place Recognition | Mapillary val | Recall@1 | 93.8 | BoQ |
| Visual Place Recognition | Mapillary val | Recall@10 | 97 | BoQ |
| Visual Place Recognition | Mapillary val | Recall@5 | 96.8 | BoQ |
| Visual Place Recognition | Mapillary val | Recall@1 | 91.2 | BoQ (ResNet-50) |
| Visual Place Recognition | Mapillary val | Recall@10 | 96.1 | BoQ (ResNet-50) |
| Visual Place Recognition | Mapillary val | Recall@5 | 95.3 | BoQ (ResNet-50) |
| Visual Place Recognition | SVOX-Rain | Recall@1 | 96.2 | BoQ (ResNet-50) |
| Visual Place Recognition | Mapillary test | Recall@1 | 79 | BoQ |
| Visual Place Recognition | Mapillary test | Recall@10 | 92 | BoQ |
| Visual Place Recognition | Mapillary test | Recall@5 | 90.3 | BoQ |
| Visual Place Recognition | Eynsham | Recall@1 | 92.2 | BoQ |
| Visual Place Recognition | Eynsham | Recall@10 | 96.4 | BoQ |
| Visual Place Recognition | Eynsham | Recall@5 | 95.6 | BoQ |
| Visual Place Recognition | Eynsham | Recall@1 | 91.3 | BoQ (ResNet-50) |
| Visual Place Recognition | SVOX-Overcast | Recall@1 | 97.8 | BoQ (ResNet-50) |
| Visual Place Recognition | SVOX-Sun | Recall@1 | 95.9 | BoQ (ResNet-50) |
| Visual Place Recognition | Nordland* (2760 queries) | Recall@1 | 81.3 | BoQ |
| Visual Place Recognition | Nordland* (2760 queries) | Recall@10 | 94.8 | BoQ |
| Visual Place Recognition | Nordland* (2760 queries) | Recall@5 | 92.5 | BoQ |
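
For readers unfamiliar with the metric, Recall@K in the results above is conventionally the percentage of query images for which at least one of the top K retrieved database images is a correct match. The sketch below computes it from ranked retrieval lists. The function name `recall_at_k`, the data layout, and the toy values are assumptions for illustration; each benchmark defines its own criterion for what counts as a correct match, and that criterion is not modeled here.

```python
def recall_at_k(rankings, positives, ks=(1, 5, 10)):
    """Percentage of queries with at least one correct match in the top k.

    rankings:  one list per query of database ids, best match first
    positives: one set per query holding the ids that count as correct
    """
    if not rankings:
        raise ValueError("at least one query is required")
    hits = {k: 0 for k in ks}
    for ranking, truth in zip(rankings, positives):
        for k in ks:
            # A query counts as a hit at k if any top-k id is a true match.
            if any(db_id in truth for db_id in ranking[:k]):
                hits[k] += 1
    return {k: 100.0 * hits[k] / len(rankings) for k in ks}

# Toy data (made up): four queries, each with three ranked database ids.
toy_rankings = [[3, 1, 2], [0, 2, 1], [2, 0, 3], [1, 3, 0]]
toy_positives = [{3}, {1}, {3}, {2}]
toy_scores = recall_at_k(toy_rankings, toy_positives, ks=(1, 2, 3))
```

Under this definition Recall@K cannot decrease as K grows, since any hit in the top 5 is also a hit in the top 10.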
