PromptStyler: Prompt-driven Style Generation for Source-free Domain Generalization

Junhyeong Cho, Gilhyun Nam, Sungyeon Kim, Hunmin Yang, Suha Kwak

2023-07-27 | ICCV 2023
Tasks: Multi-modal Classification, Image Classification, Zero-Shot Image Classification, Domain Generalization, Multimodal Deep Learning, Zero-Shot Learning, Out-of-Distribution Generalization

Abstract

In a joint vision-language space, a text feature (e.g., from "a photo of a dog") could effectively represent its relevant image features (e.g., from dog photos). Also, a recent study has demonstrated the cross-modal transferability phenomenon of this joint space. From these observations, we propose PromptStyler which simulates various distribution shifts in the joint space by synthesizing diverse styles via prompts without using any images to deal with source-free domain generalization. The proposed method learns to generate a variety of style features (from "a S* style of a") via learnable style word vectors for pseudo-words S*. To ensure that learned styles do not distort content information, we force style-content features (from "a S* style of a [class]") to be located nearby their corresponding content features (from "[class]") in the joint vision-language space. After learning style word vectors, we train a linear classifier using synthesized style-content features. PromptStyler achieves the state of the art on PACS, VLCS, OfficeHome and DomainNet, even though it does not require any images for training.
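The two objectives described in the abstract (diverse styles, and style-content features that stay near their content features) can be illustrated with a small NumPy sketch. The abstract states only the goals, so the concrete loss forms here are assumptions: an absolute cosine similarity penalty for style diversity and a softmax contrastive loss for content consistency. The function names are hypothetical, and plain arrays stand in for the features a text encoder such as CLIP's would produce.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def style_diversity_loss(new_style, previous_styles):
    """Assumed form: mean |cosine similarity| between a new style feature
    of shape (D,) and previously learned style features of shape (K, D).
    It is 0 when the new style is orthogonal to all earlier ones."""
    if len(previous_styles) == 0:
        return 0.0
    s = l2_normalize(np.asarray(new_style, dtype=float))
    p = l2_normalize(np.asarray(previous_styles, dtype=float))
    return float(np.mean(np.abs(p @ s)))


def content_consistency_loss(style_content, content):
    """Assumed form: contrastive cross-entropy over cosine similarities.
    Row n of style_content (N, D) is the feature of "a S* style of a [class_n]";
    row n of content (N, D) is the feature of "[class_n]".
    The loss is small when each style-content feature is closest to the
    content feature of its own class."""
    sc = l2_normalize(np.asarray(style_content, dtype=float))
    c = l2_normalize(np.asarray(content, dtype=float))
    logits = sc @ c.T                                   # (N, N) cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))


if __name__ == "__main__":
    # Toy stand-ins for text features: 4 classes, 8-dimensional space.
    content = np.eye(4, 8)
    aligned = content.copy()            # style keeps class content intact
    distorted = content[[1, 2, 3, 0]]   # style pushes each class onto another

    print("consistency (aligned):  ", content_consistency_loss(aligned, content))
    print("consistency (distorted):", content_consistency_loss(distorted, content))
    print("diversity (orthogonal): ", style_diversity_loss(content[0], content[1:3]))
    print("diversity (duplicate):  ", style_diversity_loss(content[0], content[0:1]))
```

In a full pipeline, each style word vector would be optimized against a weighted sum of such losses with the text encoder frozen, and the linear classifier would then be trained on the resulting style-content features. Those steps require the actual encoder and are omitted from this sketch.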

Results

Task	Dataset	Metric	Value (%)	Model
Domain Adaptation	PACS	Average Accuracy	98.6	PromptStyler (CLIP, ViT-L/14)
Domain Adaptation	PACS	Average Accuracy	97.2	PromptStyler (CLIP, ViT-B/16)
Domain Adaptation	PACS	Average Accuracy	93.2	PromptStyler (CLIP, ResNet-50)
Domain Adaptation	Office-Home	Average Accuracy	89.1	PromptStyler (CLIP, ViT-L/14)
Domain Adaptation	Office-Home	Average Accuracy	83.6	PromptStyler (CLIP, ViT-B/16)
Domain Adaptation	Office-Home	Average Accuracy	73.6	PromptStyler (CLIP, ResNet-50)
Domain Adaptation	DomainNet	Average Accuracy	65.5	PromptStyler (CLIP, ViT-L/14)
Domain Adaptation	DomainNet	Average Accuracy	59.4	PromptStyler (CLIP, ViT-B/16)
Domain Adaptation	DomainNet	Average Accuracy	49.5	PromptStyler (CLIP, ResNet-50)
Domain Adaptation	VLCS	Average Accuracy	82.9	PromptStyler (CLIP, ViT-B/16)
Domain Adaptation	VLCS	Average Accuracy	82.4	PromptStyler (CLIP, ViT-L/14)
Domain Adaptation	VLCS	Average Accuracy	82.3	PromptStyler (CLIP, ResNet-50)
Domain Generalization	PACS	Average Accuracy	98.6	PromptStyler (CLIP, ViT-L/14)
Domain Generalization	PACS	Average Accuracy	97.2	PromptStyler (CLIP, ViT-B/16)
Domain Generalization	PACS	Average Accuracy	93.2	PromptStyler (CLIP, ResNet-50)
Domain Generalization	Office-Home	Average Accuracy	89.1	PromptStyler (CLIP, ViT-L/14)
Domain Generalization	Office-Home	Average Accuracy	83.6	PromptStyler (CLIP, ViT-B/16)
Domain Generalization	Office-Home	Average Accuracy	73.6	PromptStyler (CLIP, ResNet-50)
Domain Generalization	DomainNet	Average Accuracy	65.5	PromptStyler (CLIP, ViT-L/14)
Domain Generalization	DomainNet	Average Accuracy	59.4	PromptStyler (CLIP, ViT-B/16)
Domain Generalization	DomainNet	Average Accuracy	49.5	PromptStyler (CLIP, ResNet-50)
Domain Generalization	VLCS	Average Accuracy	82.9	PromptStyler (CLIP, ViT-B/16)
Domain Generalization	VLCS	Average Accuracy	82.4	PromptStyler (CLIP, ViT-L/14)
Domain Generalization	VLCS	Average Accuracy	82.3	PromptStyler (CLIP, ResNet-50)
