Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Deformable Attention Module

General · Introduced 2020 · 42 papers
Source Paper

Description

The Deformable Attention Module is an attention module used in the Deformable DETR architecture. It addresses a key shortcoming of standard Transformer attention: every query attends over all possible spatial locations. Inspired by deformable convolution, the deformable attention module attends only to a small set of key sampling points around a reference point, regardless of the spatial size of the feature maps. By assigning only a small fixed number of keys to each query, the issues of slow convergence and limited feature spatial resolution can be mitigated.

Given an input feature map $x \in \mathbb{R}^{C \times H \times W}$, let $q$ index a query element with content feature $\mathbf{z}_q$ and a 2-d reference point $\mathbf{p}_q$. The deformable attention feature is calculated by:

$$\text{DeformAttn}(\mathbf{z}_q, \mathbf{p}_q, x) = \sum_{m=1}^{M} \mathbf{W}_m \left[ \sum_{k=1}^{K} A_{mqk} \cdot \mathbf{W}'_m \, x(\mathbf{p}_q + \Delta\mathbf{p}_{mqk}) \right]$$

where $m$ indexes the attention head, $M$ is the number of heads, $k$ indexes the sampled keys, and $K$ is the total number of sampled keys ($K \ll HW$); $\mathbf{W}_m$ and $\mathbf{W}'_m$ are the learnable output and value projection matrices of the $m^{\text{th}}$ head. $\Delta\mathbf{p}_{mqk}$ and $A_{mqk}$ denote the sampling offset and attention weight of the $k^{\text{th}}$ sampling point in the $m^{\text{th}}$ attention head, respectively. The scalar attention weight $A_{mqk}$ lies in the range $[0, 1]$ and is normalized so that $\sum_{k=1}^{K} A_{mqk} = 1$, while the offsets $\Delta\mathbf{p}_{mqk} \in \mathbb{R}^{2}$ are 2-d real numbers with unconstrained range. As $\mathbf{p}_q + \Delta\mathbf{p}_{mqk}$ is fractional, bilinear interpolation is applied as in Dai et al. (2017) when computing $x(\mathbf{p}_q + \Delta\mathbf{p}_{mqk})$. Both $\Delta\mathbf{p}_{mqk}$ and $A_{mqk}$ are obtained via linear projection over the query feature $\mathbf{z}_q$: in implementation, $\mathbf{z}_q$ is fed to a linear projection operator of $3MK$ channels, where the first $2MK$ channels encode the sampling offsets $\Delta\mathbf{p}_{mqk}$, and the remaining $MK$ channels are fed to a softmax operator to obtain the attention weights $A_{mqk}$.
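The single-query computation above can be sketched in NumPy. This is an illustrative toy, not the reference implementation: all projection matrices here are randomly initialized placeholders rather than trained parameters, and the function names (`bilinear_sample`, `deformable_attention`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def bilinear_sample(x, px, py):
    """Bilinearly interpolate feature map x of shape (C, H, W) at fractional (px, py)."""
    C, H, W = x.shape
    x0, y0 = int(np.floor(px)), int(np.floor(py))
    out = np.zeros(C)
    for xi, wx in ((x0, 1 - (px - x0)), (x0 + 1, px - x0)):
        for yi, wy in ((y0, 1 - (py - y0)), (y0 + 1, py - y0)):
            if 0 <= xi < W and 0 <= yi < H:  # out-of-bounds corners contribute zero
                out += wx * wy * x[:, yi, xi]
    return out

def deformable_attention(z_q, p_q, x, M=4, K=4):
    """Deformable attention for one query (random weights, for illustration only).

    z_q : (C,)  query content feature
    p_q : (2,)  reference point (px, py) in feature-map coordinates
    x   : (C, H, W) input feature map
    """
    C = z_q.shape[0]
    d = C // M  # per-head dimension
    # One linear projection of 3*M*K channels: 2MK offset values + MK attention logits.
    W_proj = rng.standard_normal((3 * M * K, C)) * 0.1
    proj = W_proj @ z_q
    offsets = proj[:2 * M * K].reshape(M, K, 2)   # Δp_mqk, unconstrained range
    logits = proj[2 * M * K:].reshape(M, K)
    # Softmax over the K sampling points, so Σ_k A_mqk = 1 within each head m.
    A = np.exp(logits - logits.max(axis=1, keepdims=True))
    A /= A.sum(axis=1, keepdims=True)
    # Per-head value projection W'_m and shared output projection (placeholders).
    W_val = rng.standard_normal((M, d, C)) * 0.1
    W_out = rng.standard_normal((C, M * d)) * 0.1
    heads = []
    for m in range(M):
        h = np.zeros(d)
        for k in range(K):
            px, py = p_q + offsets[m, k]           # p_q + Δp_mqk is fractional
            h += A[m, k] * (W_val[m] @ bilinear_sample(x, px, py))
        heads.append(h)
    return W_out @ np.concatenate(heads)           # (C,) output feature

# Usage: a 16-channel 8x8 feature map, one query with reference point (3.5, 4.2).
C, H, W = 16, 8, 8
x = rng.standard_normal((C, H, W))
z_q = rng.standard_normal(C)
out = deformable_attention(z_q, np.array([3.5, 4.2]), x)
print(out.shape)  # (16,)
```

Note that each head touches only $K$ sampled locations instead of all $HW$ positions, which is the source of the module's efficiency; a real implementation would batch this over all queries and learn the projection matrices.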

Papers Using This Method

- WalnutData: A UAV Remote Sensing Dataset of Green Walnuts and Model Evaluation (2025-02-27)
- Advancing SEM Based Nano-Scale Defect Analysis in Semiconductor Manufacturing for Advanced IC Nodes (2024-09-06)
- U-DECN: End-to-End Underwater Object Detection ConvNet with Improved DeNoising Training (2024-08-11)
- Fisher-aware Quantization for DETR Detectors with Critical-category Objectives (2024-07-03)
- Knowledge-driven Subspace Fusion and Gradient Coordination for Multi-modal Learning (2024-06-20)
- Understanding differences in applying DETR to natural and medical images (2024-05-27)
- Infrared Adversarial Car Stickers (2024-05-16)
- LDTR: Transformer-based Lane Detection with Anchor-chain Representation (2024-03-21)
- Generative Region-Language Pretraining for Open-Ended Object Detection (2024-03-15)
- KD-DETR: Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling (2024-01-01)
- Hybrid Proposal Refiner: Revisiting DETR Series from the Faster R-CNN Perspective (2024-01-01)
- Mono3DVG: 3D Visual Grounding in Monocular Images (2023-12-13)
- Towards Few-Annotation Learning for Object Detection: Are Transformer-based Models More Efficient? (2023-10-30)
- DAC-DETR: Divide the Attention Layers and Conquer (2023-09-21)
- A Spatial-Temporal Deformable Attention based Framework for Breast Lesion Detection in Videos (2023-09-09)
- Selecting Learnable Training Samples is All DETRs Need in Crowded Pedestrian Detection (2023-05-18)
- Robust Traffic Light Detection Using Salience-Sensitive Loss: Computational Framework and Evaluations (2023-05-08)
- Continual Detection Transformer for Incremental Object Detection (2023-04-06)
- VDDT: Improving Vessel Detection with Deformable Transfomer (2023-03-15)
- ARS-DETR: Aspect Ratio-Sensitive Detection Transformer for Aerial Oriented Object Detection (2023-03-09)