Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

MinkLoc++: Lidar and Monocular Image Fusion for Place Recognition

Jacek Komorowski, Monika Wysoczanska, Tomasz Trzcinski

2021-04-12 | Autonomous Vehicles | Metric Learning | Visual Place Recognition | Multimodal Deep Learning | 3D Place Recognition

Abstract

We introduce a discriminative multimodal descriptor based on a pair of sensor readings: a point cloud from a LiDAR and an image from an RGB camera. Our descriptor, named MinkLoc++, can be used for place recognition, re-localization and loop closure in robotics or autonomous vehicle applications. We use a late fusion approach, where each modality is processed separately and the results are fused in the final part of the processing pipeline. The proposed method achieves state-of-the-art performance on standard place recognition benchmarks. We also identify the dominating modality problem, which occurs when training a multimodal descriptor. The problem arises when the network focuses on the modality that overfits the training data more strongly. This drives the loss down during training but leads to suboptimal performance on the evaluation set. In this work we describe how to detect and mitigate this risk when using a deep metric learning approach to train a multimodal neural network. Our code is publicly available on the project website: https://github.com/jac99/MinkLocMultimodal.
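
The late fusion idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the per-modality descriptors are random placeholders standing in for the outputs of a point cloud branch and an image branch, the 128-dimensional size is arbitrary, and both the L2 normalization and the use of concatenation as the fusion operator are assumptions made for illustration. The function names `l2_normalize` and `late_fusion` are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length (illustrative choice, not from the source).
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def late_fusion(cloud_desc, image_desc):
    # Each modality is processed separately upstream; fusion happens only here,
    # at the end of the pipeline. Concatenation is an assumed fusion operator.
    return np.concatenate(
        [l2_normalize(cloud_desc), l2_normalize(image_desc)], axis=-1
    )

# Placeholder descriptors for a batch of 4 sensor reading pairs.
rng = np.random.default_rng(0)
cloud_desc = rng.normal(size=(4, 128))   # stand-in for the LiDAR branch output
image_desc = rng.normal(size=(4, 128))   # stand-in for the RGB branch output

fused = late_fusion(cloud_desc, image_desc)
print(fused.shape)
```

Because the two halves of the fused vector stay separable in this sketch, each modality's half can also be evaluated on its own, which is one simple way to see whether a single modality dominates retrieval.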

Results

Task | Dataset | Metric | Value | Model
--- | --- | --- | --- | ---
Visual Place Recognition | Oxford RobotCar (LiDAR 4096 points+RGB) | recall@top1 | 96.7 | MinkLoc++ (LiDAR+RGB)
Visual Place Recognition | Oxford RobotCar (LiDAR 4096 points+RGB) | recall@top1% | 99.1 | MinkLoc++ (LiDAR+RGB)
Visual Place Recognition | CS-Campus3D | AR@1 | 67.06 | Minkloc3Dv2
Visual Place Recognition | CS-Campus3D | AR@1 cross-source | 52.46 | Minkloc3Dv2
Visual Place Recognition | CS-Campus3D | AR@1% | 76.68 | Minkloc3Dv2
Visual Place Recognition | CS-Campus3D | AR@1% cross-source | 83.48 | Minkloc3Dv2
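
As a rough illustration of how recall-style retrieval metrics such as recall@top1 and recall@top1% are commonly computed, the sketch below ranks database descriptors by Euclidean distance to each query and counts a query as a hit if any true match appears among the top N candidates. The exact evaluation protocol behind the numbers above (for example, how true matches are defined) is not stated on this page, so the function names, the toy data, and the interpretation of "top 1%" as 1% of the database size are all assumptions.

```python
import numpy as np

def recall_at_n(query_desc, db_desc, true_matches, n=1):
    # Pairwise squared Euclidean distances, shape (num_queries, num_db).
    diff = query_desc[:, None, :] - db_desc[None, :, :]
    dist = (diff ** 2).sum(axis=-1)
    # Indices of the n closest database entries for every query.
    top = np.argsort(dist, axis=1)[:, :n]
    hits = [
        len(set(top[i].tolist()) & set(true_matches[i])) > 0
        for i in range(len(query_desc))
    ]
    return float(np.mean(hits))

def recall_at_one_percent(query_desc, db_desc, true_matches):
    # Assumed reading of "top 1%": the top 1% of database entries, at least one.
    n = max(int(round(len(db_desc) / 100.0)), 1)
    return recall_at_n(query_desc, db_desc, true_matches, n)

# Toy 2-D descriptors (hypothetical data, for illustration only).
db = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
queries = np.array([[0.1, 0.0], [0.9, 0.1], [4.0, 4.0]])
ground_truth = [{0}, {1}, {2}]  # true database matches for each query

r1 = recall_at_n(queries, db, ground_truth, n=1)
r_all = recall_at_n(queries, db, ground_truth, n=4)
print(r1, r_all)
```

In this toy data the third query's nearest database entry is not its true match, so it counts as a miss at N=1 and as a hit only once the candidate list is long enough to include entry 2.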