Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


VMLoc: Variational Fusion For Learning-Based Multimodal Camera Localization

Kaichen Zhou, Changhao Chen, Bing Wang, Muhamad Risqi U. Saputra, Niki Trigoni, Andrew Markham

2020-03-12 | Visual Localization | Camera Relocalization | Camera Localization

Abstract

Recent learning-based approaches have achieved impressive results in the field of single-shot camera localization. However, how best to fuse multiple modalities (e.g., image and depth) and to deal with degraded or missing input are less well studied. In particular, we note that previous approaches towards deep fusion do not perform significantly better than models employing a single modality. We conjecture that this is because of the naive approaches to feature space fusion through summation or concatenation which do not take into account the different strengths of each modality. To address this, we propose an end-to-end framework, termed VMLoc, to fuse different sensor inputs into a common latent space through a variational Product-of-Experts (PoE) followed by attention-based fusion. Unlike previous multimodal variational works directly adapting the objective function of vanilla variational auto-encoder, we show how camera localization can be accurately estimated through an unbiased objective function based on importance weighting. Our model is extensively evaluated on RGB-D datasets and the results prove the efficacy of our model. The source code is available at https://github.com/kaichen-z/VMLoc.
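The two ingredients the abstract names, Product-of-Experts fusion and an importance-weighted objective, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes each modality's encoder outputs a diagonal Gaussian over the latent space, and that the objective takes the usual log-mean-exp form of importance-weighted auto-encoders, which the abstract does not spell out. The function names and the optional standard-normal prior expert are hypothetical choices made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def product_of_experts(
    means: Sequence[Sequence[float]],
    variances: Sequence[Sequence[float]],
    include_prior: bool = True,
) -> Tuple[List[float], List[float]]:
    """Fuse diagonal-Gaussian experts (one per modality).

    The product of Gaussians is a Gaussian whose precision is the sum of
    the experts' precisions and whose mean is the precision-weighted
    average of their means. Returns (mean, variance) per latent dimension.
    """
    dims = len(means[0])
    fused_mean: List[float] = []
    fused_var: List[float] = []
    for d in range(dims):
        # Assumption: an optional standard-normal prior expert,
        # contributing precision 1 and mean 0.
        precision = 1.0 if include_prior else 0.0
        weighted_sum = 0.0
        for mu, var in zip(means, variances):
            precision += 1.0 / var[d]
            weighted_sum += mu[d] / var[d]
        fused_var.append(1.0 / precision)
        fused_mean.append(weighted_sum / precision)
    return fused_mean, fused_var


def importance_weighted_bound(log_weights: Sequence[float]) -> float:
    """Compute log((1/K) * sum_k exp(log_w_k)) in a numerically stable way."""
    m = max(log_weights)
    total = sum(math.exp(w - m) for w in log_weights)
    return m + math.log(total / len(log_weights))


# Two hypothetical experts (e.g. image and depth) over a 1-D latent.
mean, var = product_of_experts([[1.0], [3.0]], [[1.0], [1.0]], include_prior=False)
print(mean, var)
```

Because the fusion only sums over the experts it is given, a missing modality can be handled in this sketch by passing the remaining expert alone; an expert with a large variance contributes little precision and is down-weighted automatically.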

Results

Task | Dataset | Metric | Value | Model
Visual Localization | Oxford Radar RobotCar (Full-6) | Mean Translation Error | 15.11 | VMLoc

Related Papers

Kaleidoscopic Background Attack: Disrupting Pose Estimation with Multi-Fold Radial Symmetry Textures (2025-07-14)
Evaluating Attribute Confusion in Fashion Text-to-Image Generation (2025-07-09)
MatChA: Cross-Algorithm Matching with Feature Augmentation (2025-06-27)
OracleFusion: Assisting the Decipherment of Oracle Bone Script with Structurally Constrained Semantic Typography (2025-06-26)
Semantic and Feature Guided Uncertainty Quantification of Visual Localization for Autonomous Vehicles (2025-06-18)
Hierarchical Image Matching for UAV Absolute Visual Localization via Semantic and Structural Constraints (2025-06-11)
Robust Visual Localization via Semantic-Guided Multi-Scale Transformer (2025-06-10)
Deep Learning Reforms Image Matching: A Survey and Outlook (2025-06-05)