
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


CATs++: Boosting Cost Aggregation with Convolutions and Transformers

Seokju Cho, Sunghwan Hong, Seungryong Kim

2022-02-14 · Semantic correspondence

Abstract

Cost aggregation is a crucial step in image matching tasks that aims to disambiguate noisy matching scores. Existing methods generally tackle this with hand-crafted or CNN-based techniques, which either lack robustness to severe deformations or inherit the limitations of CNNs, which fail to discriminate incorrect matches because of their limited receptive fields and lack of adaptability. In this paper, we introduce Cost Aggregation with Transformers (CATs), which tackles this by exploring global consensus within the initial correlation map, with architectural designs that let us fully exploit the global receptive field of the self-attention mechanism. CATs has limitations of its own: a standard transformer's complexity grows with the spatial and feature dimensions, and the resulting high computational cost restricts CATs to limited resolutions and limits its performance. To alleviate these, we propose CATs++, an extension of CATs. Our proposed methods outperform the previous state-of-the-art methods by large margins, setting a new state of the art on all the benchmarks, including PF-WILLOW, PF-PASCAL, and SPair-71k. We further provide extensive ablation studies and analyses.
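To make the terms "matching scores" and "initial correlation map" concrete, here is a minimal sketch of how such a map is commonly built before any aggregation step. It assumes cosine similarity between per-position feature vectors, which the abstract does not confirm; the function names `cosine_correlation` and `best_matches` and the toy features are hypothetical, and this is not the authors' implementation.

```python
import math

def cosine_correlation(src_feats, tgt_feats):
    """Return a map C where C[i][j] is the cosine similarity between
    source feature i and target feature j (assumed scoring function)."""
    def normalize(v):
        # Guard against zero vectors by falling back to a norm of 1.0.
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    src = [normalize(v) for v in src_feats]
    tgt = [normalize(v) for v in tgt_feats]
    return [[sum(a * b for a, b in zip(s, t)) for t in tgt] for s in src]

def best_matches(corr):
    """For each source position, pick the target index with the highest score."""
    return [max(range(len(row)), key=row.__getitem__) for row in corr]

# Toy features: two source positions and two target positions, 2-D each.
src = [[1.0, 0.0], [0.0, 1.0]]
tgt = [[0.0, 2.0], [3.0, 0.1]]

corr = cosine_correlation(src, tgt)
matches = best_matches(corr)
print(matches)
```

Taking the per-row argmax of a raw map like this is what noisy scores can get wrong; cost aggregation refines the map before matches are read off.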

Results

Task                     Dataset    Metric  Value  Model
Image Matching           SPair-71k  PCK     59.8   CATs++
Image Matching           PF-PASCAL  PCK     93.8   CATs++
Semantic correspondence  SPair-71k  PCK     59.8   CATs++
Semantic correspondence  PF-PASCAL  PCK     93.8   CATs++
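The PCK (percentage of correct keypoints) metric reported above counts a predicted keypoint as correct when it lies within a distance threshold of its ground-truth location. The following is a minimal sketch under stated assumptions: the threshold is `alpha * ref_size`, where `ref_size` stands for a reference length such as the larger side of the image or object bounding box. This page does not state which `alpha` or reference size produced the listed values, so both are illustrative parameters, and the function name `pck` and the sample keypoints are hypothetical.

```python
import math

def pck(pred, gt, ref_size, alpha=0.1):
    """Percentage of keypoints whose Euclidean error is <= alpha * ref_size."""
    if len(pred) != len(gt) or not gt:
        raise ValueError("pred and gt must be non-empty and of equal length")
    threshold = alpha * ref_size
    correct = sum(
        1
        for (px, py), (gx, gy) in zip(pred, gt)
        if math.hypot(px - gx, py - gy) <= threshold
    )
    return 100.0 * correct / len(gt)

# Illustrative keypoints (x, y) in pixels; not taken from any benchmark.
pred = [(10.0, 10.0), (50.0, 52.0), (90.0, 30.0)]
gt = [(12.0, 11.0), (50.0, 50.0), (60.0, 30.0)]

score = pck(pred, gt, ref_size=100.0)
print(round(score, 1))
```

With a threshold of 10 pixels, the first two predictions fall within range and the third does not, so two of three keypoints count as correct.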

Related Papers

RL from Physical Feedback: Aligning Large Motion Models with Humanoid Control (2025-06-15)
Jamais Vu: Exposing the Generalization Gap in Supervised Semantic Correspondence (2025-06-09)
Do It Yourself: Learning Semantic Correspondence from Pseudo-Labels (2025-06-05)
MotionRAG-Diff: A Retrieval-Augmented Diffusion Framework for Long-Term Music-to-Dance Generation (2025-06-03)
Cora: Correspondence-aware image editing using few step diffusion (2025-05-29)
Semantic Correspondence: Unified Benchmarking and a Strong Baseline (2025-05-23)
TC-MGC: Text-Conditioned Multi-Grained Contrastive Learning for Text-Video Retrieval (2025-04-07)
SemAlign3D: Semantic Correspondence between RGB-Images through Aligning 3D Object-Class Representations (2025-03-28)