
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Recurrent Glimpse-based Decoder for Detection with Transformer

Zhe Chen, Jing Zhang, Dacheng Tao

Published: 2021-12-09. Venue: CVPR 2022. Task: Object Detection.

Abstract

Although detection with Transformer (DETR) is increasingly popular, its global attention modeling requires an extremely long training period to optimize and achieve promising detection performance. As an alternative to existing studies, which mainly develop advanced feature or embedding designs to tackle the training issue, we point out that Region-of-Interest (RoI) based detection refinement can easily help mitigate the difficulty of training for DETR methods. Based on this, we introduce a novel REcurrent Glimpse-based decOder (REGO) in this paper. In particular, REGO employs a multi-stage recurrent processing structure to help the attention of DETR gradually focus on foreground objects more accurately. In each processing stage, visual features are extracted as glimpse features from RoIs with enlarged bounding box areas of detection results from the previous stage. Then, a glimpse-based decoder is introduced to provide refined detection results based on both the glimpse features and the attention modeling outputs of the previous stage. In practice, REGO can be easily embedded in representative DETR variants while maintaining their fully end-to-end training and inference pipelines. In particular, REGO helps Deformable DETR achieve 44.8 AP on the MSCOCO dataset with only 36 training epochs, whereas the original DETR and Deformable DETR require 500 and 50 epochs, respectively, to achieve comparable performance. Experiments also show that REGO consistently boosts the performance of different DETR detectors by up to a 7% relative gain at the same setting of 50 training epochs. Code is available at https://github.com/zhechen/Deformable-DETR-REGO.
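
The multi-stage idea in the abstract (enlarge the previous stage's boxes, read glimpse features from those RoIs, then refine) can be illustrated with a small sketch. This is not the authors' implementation: the box format, the `scale` enlargement factor, the number of stages, and the `refine_stage` callable are all assumptions introduced here for illustration, and the real glimpse-based decoder also consumes the attention outputs of the previous stage.

```python
from typing import Callable, List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def enlarge_box(box: Box, scale: float, img_w: float, img_h: float) -> Box:
    """Scale a box about its centre, then clip it to the image bounds."""
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    half_w = (x1 - x0) * scale / 2.0
    half_h = (y1 - y0) * scale / 2.0
    return (
        max(0.0, cx - half_w),
        max(0.0, cy - half_h),
        min(img_w, cx + half_w),
        min(img_h, cy + half_h),
    )


def recurrent_refine(
    boxes: List[Box],
    refine_stage: Callable[[List[Box], List[Box]], List[Box]],
    num_stages: int,
    scale: float,
    img_w: float,
    img_h: float,
) -> List[Box]:
    """Run `num_stages` rounds of enlarge-then-refine.

    `refine_stage` is a hypothetical stand-in for a glimpse-based decoder:
    it receives the enlarged RoIs and the previous boxes and returns the
    refined boxes for the next stage.
    """
    for _ in range(num_stages):
        rois = [enlarge_box(b, scale, img_w, img_h) for b in boxes]
        boxes = refine_stage(rois, boxes)
    return boxes
```

In a real detector, `refine_stage` would pool features from each RoI (for example with an RoI-align operator) and decode them into updated boxes and class scores; here it is left as a parameter so that the control flow stays visible.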

Results

Task              Dataset                            Metric   Value   Model
Object Detection  COCO (Common Objects in Context)   GFlops   434     REGO-Deformable DETR-X101
Object Detection  COCO minival                       AP50     67.5    REGO-Deformable DETR-X101
Object Detection  COCO minival                       AP75     53.1    REGO-Deformable DETR-X101
Object Detection  COCO minival                       APL      65      REGO-Deformable DETR-X101
Object Detection  COCO minival                       APM      52.6    REGO-Deformable DETR-X101
Object Detection  COCO minival                       APS      30      REGO-Deformable DETR-X101
Object Detection  COCO minival                       box AP   49.1    REGO-Deformable DETR-X101

The same seven rows are also listed, with identical values, under the task tags 3D, 2D Classification, 2D Object Detection, and 16k.

Related Papers

A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)
Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection (2025-07-17)
Dual LiDAR-Based Traffic Movement Count Estimation at a Signalized Intersection: Deployment, Data Collection, and Preliminary Analysis (2025-07-17)
Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios (2025-07-16)
Tomato Multi-Angle Multi-Pose Dataset for Fine-Grained Phenotyping (2025-07-15)
ECORE: Energy-Conscious Optimized Routing for Deep Learning Models at the Edge (2025-07-08)
Beyond One Shot, Beyond One Perspective: Cross-View and Long-Horizon Distillation for Better LiDAR Representations (2025-07-07)