Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Enhance Then Search: An Augmentation-Search Strategy with Foundation Models for Cross-Domain Few-Shot Object Detection

Jiancheng Pan, Yanxing Liu, Xiao He, Long Peng, Jiahao Li, Yuze Sun, Xiaomeng Huang

2025-04-06 | Few-Shot Object Detection | Image Augmentation | Data Augmentation | Navigate | Domain Generalization | Cross-Domain Few-Shot | object-detection | Cross-Domain Few-Shot Object Detection | Object Detection

Abstract

Foundation models pretrained on extensive datasets, such as GroundingDINO and LAE-DINO, have performed remarkably in the cross-domain few-shot object detection (CD-FSOD) task. Through rigorous few-shot training, we found that combining image-based data augmentation techniques with a grid-based sub-domain search strategy significantly enhances the performance of these foundation models. Building upon GroundingDINO, we employed several widely used image augmentation methods and established optimization objectives to effectively navigate the expansive domain space in search of optimal sub-domains. This approach enables efficient few-shot object detection and offers a way to address the CD-FSOD problem by efficiently searching for the optimal parameter configuration of the foundation model. Our findings substantially advance the practical deployment of vision-language models in data-scarce environments, offering critical insights into optimizing their cross-domain generalization capabilities without labor-intensive retraining. Code is available at https://github.com/jaychempan/ETS.
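
To make the "augment, then search" idea concrete, here is a minimal sketch of a grid search over augmentation settings. It is not the authors' implementation: the parameter names (`mixup_prob`, `color_jitter`), the grid values, and the `toy_validation_score` function are all hypothetical stand-ins. In a real setup the score function would presumably fine-tune the detector on the few-shot data with the given augmentations and return a validation mAP.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]


def grid_search(
    grid: Dict[str, List[float]],
    score_fn: Callable[[Config], float],
) -> Tuple[Config, float]:
    """Score every configuration in the grid and return the best one."""
    names = list(grid)
    best_cfg, best_score = None, float("-inf")
    # Enumerate the Cartesian product of all candidate values.
    for values in product(*(grid[name] for name in names)):
        cfg = dict(zip(names, values))
        score = score_fn(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    if best_cfg is None:
        raise ValueError("grid must contain at least one configuration")
    return best_cfg, best_score


def toy_validation_score(cfg: Config) -> float:
    # Hypothetical stand-in for "fine-tune with these augmentations,
    # then measure validation mAP". Peaks at mixup_prob=0.3, color_jitter=0.2.
    return 1.0 - (cfg["mixup_prob"] - 0.3) ** 2 - (cfg["color_jitter"] - 0.2) ** 2


# Hypothetical augmentation grid; names and values are assumptions.
AUG_GRID = {
    "mixup_prob": [0.0, 0.3, 0.6],
    "color_jitter": [0.0, 0.2, 0.4],
}

best_cfg, best_score = grid_search(AUG_GRID, toy_validation_score)
print(best_cfg, best_score)
```

An exhaustive grid is only practical when the number of augmentation settings is small, since the cost grows multiplicatively with each added parameter and every evaluation may involve a fine-tuning run.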

Results

The same six results are listed under each of the following task labels: Object Detection, 3D, Few-Shot Object Detection, 2D Classification, 2D Object Detection, 16k.

Dataset     Metric  Value  Model
ArTaxOr     mAP     71.2   ETS
NEU-DET     mAP     26.1   ETS
DIOR        mAP     37.5   ETS
Clipart1k   mAP     61.5   ETS
DeepFish    mAP     44.1   ETS
UODD        mAP     29.8   ETS

Related Papers

Overview of the TalentCLEF 2025: Skill and Job Title Intelligence for Human Capital Management (2025-07-17)
Pixel Perfect MegaMed: A Megapixel-Scale Vision-Language Foundation Model for Generating High Resolution Medical Images (2025-07-17)
Simulate, Refocus and Ensemble: An Attention-Refocusing Scheme for Domain Generalization (2025-07-17)
GLAD: Generalizable Tuning for Vision-Language Models (2025-07-17)
MoTM: Towards a Foundation Model for Time Series Imputation based on Continuous Modeling (2025-07-17)
A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)
Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection (2025-07-17)