Semi-Supervised Domain Generalization for Object Detection via Language-Guided Feature Alignment

Sina Malakouti, Adriana Kovashka

2023-09-24 | BMVC 2023 11 | Descriptive, Domain Generalization, Object Detection, Domain Adaptation

Abstract

Existing domain adaptation (DA) and generalization (DG) methods in object detection enforce feature alignment in the visual space but face challenges like object appearance variability and scene complexity, which make it difficult to distinguish between objects and achieve accurate detection. In this paper, we are the first to address the problem of semi-supervised domain generalization by exploring vision-language pre-training and enforcing feature alignment through the language space. We employ a novel Cross-Domain Descriptive Multi-Scale Learning (CDDMSL) aiming to maximize the agreement between descriptions of an image presented with different domain-specific characteristics in the embedding space. CDDMSL significantly outperforms existing methods, achieving 11.7% and 7.5% improvement in DG and DA settings, respectively. Comprehensive analysis and ablation studies confirm the effectiveness of our method, positioning CDDMSL as a promising approach for domain generalization in object detection tasks.
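
The abstract's idea of maximizing agreement between descriptions of the same image under different domain-specific characteristics can be illustrated with a toy contrastive objective. The sketch below is not the authors' implementation: the function name `agreement_loss`, the InfoNCE-style formulation, the temperature value, and the hand-written 3-dimensional "description embeddings" are all assumptions chosen only to show what "agreement in the embedding space" might look like numerically.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def agreement_loss(desc_a, desc_b, temperature=0.1):
    """Hypothetical InfoNCE-style loss (an assumption, not the paper's loss).

    desc_a[i] and desc_b[i] are description embeddings of the same image
    rendered in two different domains. The loss is low when each
    embedding in desc_a is most similar to its counterpart in desc_b.
    """
    n = len(desc_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(desc_a[i], desc_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Cross-entropy with the matching index i as the target.
        total += log_sum_exp - logits[i]
    return total / n


# Toy embeddings (made up for illustration only).
photo = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
stylized_aligned = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]]
stylized_swapped = [[0.1, 0.9, 0.0], [0.9, 0.1, 0.0]]

loss_aligned = agreement_loss(photo, stylized_aligned)
loss_swapped = agreement_loss(photo, stylized_swapped)
print(loss_aligned < loss_swapped)
```

In this toy setup the loss is smaller when the two domain views of each image have similar description embeddings, which is the behavior an agreement objective of this kind would reward. The sketch only scores one direction (first domain against second); a symmetric variant would average both directions.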

Results

Task	Dataset	Metric	Value	Model
Object Detection	PASCAL VOC to Watercolor2k	mAP	49.7	CDDMSL
Object Detection	BDD100K	mAP	27.1	CDDMSL
Object Detection	Watercolor2k	mAP	49.8	CDDMSL
Object Detection	Comic2k	mAP	45.9	CDDMSL
Object Detection	PASCAL VOC to Comic2k	mAP	46.3	CDDMSL
Object Detection	PASCAL VOC to Clipart1k	mAP	40.4	CDDMSL
Object Detection	Cityscapes to Foggy Cityscapes	mAP	54.3	CDDMSL
Object Detection	Clipart1k	mAP	39.8	CDDMSL
