Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector

Yuqian Fu, Yu Wang, Yixuan Pan, Lian Huai, Xingyu Qiu, Zeyu Shangguan, Tong Liu, Yanwei Fu, Luc van Gool, Xingqun Jiang

2024-02-05
Few-Shot Object Detection, Cross-Domain Few-Shot, Open Vocabulary Object Detection, object-detection, Cross-Domain Few-Shot Object Detection, Object Detection

Abstract

This paper studies the challenging cross-domain few-shot object detection (CD-FSOD), aiming to develop an accurate object detector for novel domains with minimal labeled examples. While transformer-based open-set detectors, such as DE-ViT, show promise in traditional few-shot object detection, their generalization to CD-FSOD remains unclear: 1) can such open-set detection methods easily generalize to CD-FSOD? 2) If not, how can models be enhanced when facing huge domain gaps? To answer the first question, we employ measures including style, inter-class variance (ICV), and indefinable boundaries (IB) to understand the domain gap. Based on these measures, we establish a new benchmark named CD-FSOD to evaluate object detection methods, revealing that most of the current approaches fail to generalize across domains. Technically, we observe that the performance decline is associated with our proposed measures: style, ICV, and IB. Consequently, we propose several novel modules to address these issues. First, the learnable instance features align initial fixed instances with target categories, enhancing feature distinctiveness. Second, the instance reweighting module assigns higher importance to high-quality instances with slight IB. Third, the domain prompter encourages features resilient to different styles by synthesizing imaginary domains without altering semantic contents. These techniques collectively contribute to the development of the Cross-Domain Vision Transformer for CD-FSOD (CD-ViTO), significantly improving upon the base DE-ViT. Experimental results validate the efficacy of our model.
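The instance reweighting idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper's module is learned, whereas here the per-instance quality scores are simply assumed to be given, and the names `softmax` and `reweighted_prototype` are hypothetical. The sketch only shows how higher-scored support instances can contribute more to a class prototype.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reweighted_prototype(
    features: Sequence[Sequence[float]],
    scores: Sequence[float],
) -> List[float]:
    """Weighted average of support-instance features for one class.

    `scores` are hypothetical per-instance quality scores (assumed given
    here); higher scores yield larger weights in the prototype.
    """
    if not features or len(features) != len(scores):
        raise ValueError("need one score per feature vector")
    weights = softmax(scores)
    dim = len(features[0])
    return [
        sum(w * f[d] for w, f in zip(weights, features))
        for d in range(dim)
    ]


# Two support instances of one class; the first is scored as higher quality.
support = [[1.0, 0.0], [0.0, 1.0]]
prototype = reweighted_prototype(support, scores=[2.0, 0.0])
print(prototype)
```

With equal scores the prototype reduces to the plain mean of the support features, which is the unweighted baseline this kind of module is meant to improve on.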

Results

The same eight results are listed under each of these task labels: Object Detection, 3D, Few-Shot Object Detection, 2D Classification, 2D Object Detection, 16k.

Dataset            Metric  Value  Model
MS-COCO (30-shot)  AP      35.9   CD-ViTO
MS-COCO (10-shot)  AP      35.3   CD-ViTO
ArTaxOr            mAP     60.5   CD-ViTO
NEU-DET            mAP     12.8   CD-ViTO
DIOR               mAP     30.8   CD-ViTO
Clipart1k          mAP     44.3   CD-ViTO
DeepFish           mAP     22.3   CD-ViTO
UODD               mAP     7      CD-ViTO
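As a quick arithmetic aid, the six cross-domain mAP values listed above can be averaged into a single summary number. This is a small sketch for the reader's convenience, not a figure reported on this page; the variable names are hypothetical, and the two MS-COCO AP rows are left out because they use a different metric.

```python
# mAP values for CD-ViTO on the six cross-domain target datasets,
# copied from the results listed above.
cross_domain_map = {
    "ArTaxOr": 60.5,
    "NEU-DET": 12.8,
    "DIOR": 30.8,
    "Clipart1k": 44.3,
    "DeepFish": 22.3,
    "UODD": 7.0,
}

# Unweighted mean across datasets.
mean_map = sum(cross_domain_map.values()) / len(cross_domain_map)
print(f"mean mAP over {len(cross_domain_map)} datasets: {mean_map:.1f}")
# prints "mean mAP over 6 datasets: 29.6"
```

The spread between ArTaxOr (60.5) and UODD (7) is large, so the mean should be read alongside the per-dataset values rather than in place of them.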

Related Papers

A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)
Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection (2025-07-17)
Dual LiDAR-Based Traffic Movement Count Estimation at a Signalized Intersection: Deployment, Data Collection, and Preliminary Analysis (2025-07-17)
Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios (2025-07-16)
Tomato Multi-Angle Multi-Pose Dataset for Fine-Grained Phenotyping (2025-07-15)
ECORE: Energy-Conscious Optimized Routing for Deep Learning Models at the Edge (2025-07-08)
Beyond One Shot, Beyond One Perspective: Cross-View and Long-Horizon Distillation for Better LiDAR Representations (2025-07-07)