Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Soft Prompt Generation for Domain Generalization

Shuanghao Bai, Yuedi Zhang, Wanqi Zhou, Zhirong Luan, Badong Chen

2024-04-30 · Domain Generalization
Paper · PDF · Code (official)

Abstract

Large pre-trained vision-language models (VLMs) have shown impressive zero-shot ability on downstream tasks with manually designed prompts. To further adapt VLMs to downstream tasks, soft prompts have been proposed to replace manually designed prompts; they are fine-tuned on data from a specific domain. Prior prompt learning methods primarily learn a fixed prompt or a residual prompt from training samples. However, the learned prompts lack diversity and ignore information about unseen domains. In this paper, we reframe the prompt learning framework from a generative perspective and propose a simple yet efficient method for the Domain Generalization (DG) task, namely Soft Prompt Generation (SPG). Specifically, SPG consists of a two-stage training phase and an inference phase. During the training phase, we introduce a soft prompt label for each domain, aiming to incorporate domain knowledge into the generative model. During the inference phase, the generator of the generative model is employed to obtain instance-specific soft prompts for the unseen target domain. Extensive experiments on five domain generalization benchmarks across three DG tasks demonstrate that SPG achieves state-of-the-art performance. The code is available at https://github.com/renytek13/Soft-Prompt-Generation-with-CGAN.
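The inference phase described above can be sketched as follows. This is a minimal, hypothetical NumPy illustration, not the paper's official CGAN implementation: the one-layer `generator`, the mean-pooled stand-in for CLIP's text encoder, and all dimensions (`EMBED_DIM`, `N_PROMPT_TOKENS`, `N_CLASSES`) are assumptions chosen for clarity. It shows only the data flow: a trained generator maps an image's features to an instance-specific soft prompt, which is combined with each class embedding to score classes by cosine similarity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper).
EMBED_DIM, N_PROMPT_TOKENS, N_CLASSES = 64, 4, 5

def generator(image_feat, W, b):
    """Toy one-layer generator: image feature -> instance-specific soft prompt tokens."""
    out = np.tanh(image_feat @ W + b)               # (N_PROMPT_TOKENS * EMBED_DIM,)
    return out.reshape(N_PROMPT_TOKENS, EMBED_DIM)  # one vector per prompt token

def classify(image_feat, soft_prompt, class_embeds):
    """Score each class by cosine similarity between the image feature and a
    text-side feature built from [soft prompt tokens; class embedding]."""
    scores = []
    for c in range(N_CLASSES):
        # Mean-pooling is a stand-in for the frozen CLIP text encoder.
        text_feat = np.vstack([soft_prompt, class_embeds[c]]).mean(axis=0)
        cos = image_feat @ text_feat / (
            np.linalg.norm(image_feat) * np.linalg.norm(text_feat)
        )
        scores.append(cos)
    return int(np.argmax(scores))

# Frozen toy parameters standing in for the trained generator and class embeddings.
W = rng.normal(size=(EMBED_DIM, N_PROMPT_TOKENS * EMBED_DIM)) * 0.1
b = np.zeros(N_PROMPT_TOKENS * EMBED_DIM)
class_embeds = rng.normal(size=(N_CLASSES, EMBED_DIM))

image_feat = rng.normal(size=EMBED_DIM)    # feature of an unseen-domain test image
soft_prompt = generator(image_feat, W, b)  # instance-specific soft prompt
pred = classify(image_feat, soft_prompt, class_embeds)
print(pred)
```

The key point the sketch preserves is that the prompt is not fixed: it is regenerated per test image, so unseen-domain inputs get prompts conditioned on their own features rather than prompts memorized from the source domains.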

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Domain Adaptation | PACS | Average Accuracy | 97 | SPG (CLIP, ViT-B/16) |
| Domain Adaptation | PACS | Average Accuracy | 92.8 | SPG (CLIP, ResNet-50) |
| Domain Adaptation | Office-Home | Average Accuracy | 83.6 | SPG (CLIP, ViT-B/16) |
| Domain Adaptation | Office-Home | Average Accuracy | 73.8 | SPG (CLIP, ResNet-50) |
| Domain Adaptation | DomainNet | Average Accuracy | 60.1 | SPG (CLIP, ViT-B/16) |
| Domain Adaptation | DomainNet | Average Accuracy | 50.1 | SPG (CLIP, ResNet-50) |
| Domain Adaptation | VLCS | Average Accuracy | 84 | SPG (CLIP, ResNet-50) |
| Domain Adaptation | VLCS | Average Accuracy | 82.4 | SPG (CLIP, ViT-B/16) |
| Domain Adaptation | TerraIncognita | Average Accuracy | 50.2 | SPG (CLIP, ViT-B/16) |
| Domain Generalization | PACS | Average Accuracy | 97 | SPG (CLIP, ViT-B/16) |
| Domain Generalization | PACS | Average Accuracy | 92.8 | SPG (CLIP, ResNet-50) |
| Domain Generalization | Office-Home | Average Accuracy | 83.6 | SPG (CLIP, ViT-B/16) |
| Domain Generalization | Office-Home | Average Accuracy | 73.8 | SPG (CLIP, ResNet-50) |
| Domain Generalization | DomainNet | Average Accuracy | 60.1 | SPG (CLIP, ViT-B/16) |
| Domain Generalization | DomainNet | Average Accuracy | 50.1 | SPG (CLIP, ResNet-50) |
| Domain Generalization | VLCS | Average Accuracy | 84 | SPG (CLIP, ResNet-50) |
| Domain Generalization | VLCS | Average Accuracy | 82.4 | SPG (CLIP, ViT-B/16) |
| Domain Generalization | TerraIncognita | Average Accuracy | 50.2 | SPG (CLIP, ViT-B/16) |

Related Papers

- Simulate, Refocus and Ensemble: An Attention-Refocusing Scheme for Domain Generalization (2025-07-17)
- GLAD: Generalizable Tuning for Vision-Language Models (2025-07-17)
- MoTM: Towards a Foundation Model for Time Series Imputation based on Continuous Modeling (2025-07-17)
- InstructFLIP: Exploring Unified Vision-Language Model for Face Anti-spoofing (2025-07-16)
- From Physics to Foundation Models: A Review of AI-Driven Quantitative Remote Sensing Inversion (2025-07-11)
- Feed-Forward SceneDINO for Unsupervised Semantic Scene Completion (2025-07-08)
- Prompt-Free Conditional Diffusion for Multi-object Image Augmentation (2025-07-08)
- Integrated Structural Prompt Learning for Vision-Language Models (2025-07-08)