
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Attention-Based Multimodal Image Matching

Aviad Moreshet, Yosi Keller

2021-03-20 · Patch Matching

Abstract

We propose an attention-based approach for multimodal image patch matching using a Transformer encoder attending to the feature maps of a multiscale Siamese CNN. Our encoder is shown to efficiently aggregate multiscale image embeddings while emphasizing task-specific appearance-invariant image cues. We also introduce an attention-residual architecture, using a residual connection bypassing the encoder. This additional learning signal facilitates end-to-end training from scratch. Our approach is experimentally shown to achieve new state-of-the-art accuracy on both multimodal and single modality benchmarks, illustrating its general applicability. To the best of our knowledge, this is the first successful implementation of the Transformer encoder architecture to the multimodal image patch matching task.
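The attention-residual idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the multiscale CNN feature maps have already been flattened into a list of token vectors, uses a single attention head with identity query/key/value projections, omits positional encodings and feed-forward layers, and merges the two paths by addition. All function names are hypothetical.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    # Single-head scaled dot-product self-attention.
    # Assumption: identity Q/K/V projections (no learned weights).
    dim = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(dim)])
    return out

def mean_pool(tokens):
    # Average the token vectors into one embedding.
    n = len(tokens)
    return [sum(t[j] for t in tokens) / n for j in range(len(tokens[0]))]

def attention_residual_embedding(tokens):
    # Encoder path: attend over the tokens, then pool.
    attended = mean_pool(self_attention(tokens))
    # Residual path: pooled input that bypasses the encoder entirely.
    bypass = mean_pool(tokens)
    # Assumption: the two paths are merged by element-wise addition.
    return [a + b for a, b in zip(attended, bypass)]

def l2_distance(u, v):
    # Euclidean distance between two embeddings.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Siamese use: the same function embeds both patches, and the
# distance between the embeddings scores the candidate match.
patch_a_tokens = [[1.0, 0.0], [0.0, 1.0]]
patch_b_tokens = [[0.9, 0.1], [0.1, 0.9]]
distance = l2_distance(
    attention_residual_embedding(patch_a_tokens),
    attention_residual_embedding(patch_b_tokens),
)
```

The bypass gives the embedding a direct path from the CNN features, which is the kind of additional learning signal the abstract credits with making end-to-end training from scratch easier.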

Results

Task            Dataset        Metric  Value  Model
Image Matching  Brown Dataset  FPR95   0.9    Multiscale Transformer Encoder
Image Matching  VisNir         FPR95   1.44   Multiscale Transformer Encoder
Patch Matching  Brown Dataset  FPR95   0.9    Multiscale Transformer Encoder
Patch Matching  VisNir         FPR95   1.44   Multiscale Transformer Encoder
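FPR95 in the results above conventionally denotes the false positive rate at 95% recall: the share of non-matching pairs accepted at the distance threshold that admits 95% of the true matches. The sketch below shows one way to compute it, assuming smaller distances mean more similar patches. The function name and the toy data are hypothetical, and the listed values are presumably percentages, whereas the function returns a fraction.

```python
import math

def fpr_at_recall(distances, labels, recall=0.95):
    # distances: one score per patch pair, smaller means more similar.
    # labels: 1 for a matching pair, 0 for a non-matching pair.
    pos = sorted(d for d, y in zip(distances, labels) if y == 1)
    neg = [d for d, y in zip(distances, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one matching and one non-matching pair")
    # Number of matching pairs that must be accepted to reach the target recall.
    # The small epsilon guards against floating-point overshoot in the product.
    k = max(1, math.ceil(recall * len(pos) - 1e-9))
    threshold = pos[k - 1]
    # Non-matching pairs at or below the threshold are false positives.
    false_positives = sum(1 for d in neg if d <= threshold)
    return false_positives / len(neg)

# Toy data: 20 matching pairs and 4 non-matching pairs.
toy_distances = list(range(1, 21)) + [18.5, 19.5, 25, 30]
toy_labels = [1] * 20 + [0] * 4
toy_fpr = fpr_at_recall(toy_distances, toy_labels)
```

Lower is better for this metric, since it measures how many wrong matches slip through once nearly all correct matches are accepted.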

Related Papers

Reproducibility, Replicability, and Insights into Visual Document Retrieval with Late Interaction (2025-05-12)
MicroFlow: Domain-Specific Optical Flow for Ground Deformation Estimation in Seismic Events (2025-04-18)
The Marine Debris Forward-Looking Sonar Datasets (2025-03-28)
Fence Theorem: Preprocessing is Dual-Objective Semantic Structure Isolator in 3D Anomaly Detection (2025-03-03)
FiLo++: Zero-/Few-Shot Anomaly Detection by Fused Fine-Grained Descriptions and Deformable Localization (2025-01-17)
SurfPatch: Enabling Patch Matching for Exploratory Stream Surface Visualization (2025-01-01)
Why and How: Knowledge-Guided Learning for Cross-Spectral Image Patch Matching (2024-12-15)
NeRF-Texture: Synthesizing Neural Radiance Field Textures (2024-12-13)