Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

PromptStyler: Prompt-driven Style Generation for Source-free Domain Generalization

Junhyeong Cho, Gilhyun Nam, Sungyeon Kim, Hunmin Yang, Suha Kwak

Published: 2023-07-27
Venue: ICCV 2023
Tasks: Multi-modal Classification, Image Classification, Zero-Shot Image Classification, Domain Generalization, Multimodal Deep Learning, Zero-Shot Learning, Out-of-Distribution Generalization

Abstract

In a joint vision-language space, a text feature (e.g., from "a photo of a dog") could effectively represent its relevant image features (e.g., from dog photos). Also, a recent study has demonstrated the cross-modal transferability phenomenon of this joint space. From these observations, we propose PromptStyler which simulates various distribution shifts in the joint space by synthesizing diverse styles via prompts without using any images to deal with source-free domain generalization. The proposed method learns to generate a variety of style features (from "a S* style of a") via learnable style word vectors for pseudo-words S*. To ensure that learned styles do not distort content information, we force style-content features (from "a S* style of a [class]") to be located nearby their corresponding content features (from "[class]") in the joint vision-language space. After learning style word vectors, we train a linear classifier using synthesized style-content features. PromptStyler achieves the state of the art on PACS, VLCS, OfficeHome and DomainNet, even though it does not require any images for training.
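The two training signals described in the abstract (diverse styles, and style-content features that stay close to their content features) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names are hypothetical, random vectors stand in for text-encoder outputs, and the specific loss forms (a softmax cross-entropy over cosine similarities, and a mean absolute cosine similarity between styles) are assumptions chosen to match the abstract's wording.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def content_consistency_loss(style_content, content):
    """Assumed form: cross-entropy over cosine similarities.

    Row m of `style_content` (from "a S* style of a [class_m]") should be
    more similar to row m of `content` (from "[class_m]") than to any other
    class's content feature.
    """
    sc = l2_normalize(style_content)
    c = l2_normalize(content)
    logits = sc @ c.T                                  # (N, N) cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))


def style_diversity_penalty(style_features):
    """Assumed form: mean absolute cosine similarity between distinct styles.

    The value is 0 when all style features are mutually orthogonal.
    """
    s = l2_normalize(style_features)
    sim = np.abs(s @ s.T)
    off_diagonal = sim[~np.eye(sim.shape[0], dtype=bool)]
    return float(off_diagonal.mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    content = rng.normal(size=(7, 16))                 # stand-in for 7 class features
    style_content = content + 0.1 * rng.normal(size=(7, 16))
    styles = rng.normal(size=(5, 16))                  # stand-in for 5 style features
    print("content consistency:", content_consistency_loss(style_content, content))
    print("style diversity penalty:", style_diversity_penalty(styles))
```

In an actual pipeline, these quantities would be minimized with respect to the learnable style word vectors through a frozen text encoder, and the resulting style-content features would then serve as training data for the linear classifier.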

Results

Task                   Dataset      Metric            Value  Model
Domain Adaptation      PACS         Average Accuracy  98.6   PromptStyler (CLIP, ViT-L/14)
Domain Adaptation      PACS         Average Accuracy  97.2   PromptStyler (CLIP, ViT-B/16)
Domain Adaptation      PACS         Average Accuracy  93.2   PromptStyler (CLIP, ResNet-50)
Domain Adaptation      Office-Home  Average Accuracy  89.1   PromptStyler (CLIP, ViT-L/14)
Domain Adaptation      Office-Home  Average Accuracy  83.6   PromptStyler (CLIP, ViT-B/16)
Domain Adaptation      Office-Home  Average Accuracy  73.6   PromptStyler (CLIP, ResNet-50)
Domain Adaptation      DomainNet    Average Accuracy  65.5   PromptStyler (CLIP, ViT-L/14)
Domain Adaptation      DomainNet    Average Accuracy  59.4   PromptStyler (CLIP, ViT-B/16)
Domain Adaptation      DomainNet    Average Accuracy  49.5   PromptStyler (CLIP, ResNet-50)
Domain Adaptation      VLCS         Average Accuracy  82.9   PromptStyler (CLIP, ViT-B/16)
Domain Adaptation      VLCS         Average Accuracy  82.4   PromptStyler (CLIP, ViT-L/14)
Domain Adaptation      VLCS         Average Accuracy  82.3   PromptStyler (CLIP, ResNet-50)
Domain Generalization  PACS         Average Accuracy  98.6   PromptStyler (CLIP, ViT-L/14)
Domain Generalization  PACS         Average Accuracy  97.2   PromptStyler (CLIP, ViT-B/16)
Domain Generalization  PACS         Average Accuracy  93.2   PromptStyler (CLIP, ResNet-50)
Domain Generalization  Office-Home  Average Accuracy  89.1   PromptStyler (CLIP, ViT-L/14)
Domain Generalization  Office-Home  Average Accuracy  83.6   PromptStyler (CLIP, ViT-B/16)
Domain Generalization  Office-Home  Average Accuracy  73.6   PromptStyler (CLIP, ResNet-50)
Domain Generalization  DomainNet    Average Accuracy  65.5   PromptStyler (CLIP, ViT-L/14)
Domain Generalization  DomainNet    Average Accuracy  59.4   PromptStyler (CLIP, ViT-B/16)
Domain Generalization  DomainNet    Average Accuracy  49.5   PromptStyler (CLIP, ResNet-50)
Domain Generalization  VLCS         Average Accuracy  82.9   PromptStyler (CLIP, ViT-B/16)
Domain Generalization  VLCS         Average Accuracy  82.4   PromptStyler (CLIP, ViT-L/14)
Domain Generalization  VLCS         Average Accuracy  82.3   PromptStyler (CLIP, ResNet-50)

Related Papers

Automatic Classification and Segmentation of Tunnel Cracks Based on Deep Learning and Visual Explanations (2025-07-18)
Adversarial attacks to image classification systems using evolutionary algorithms (2025-07-17)
Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy (2025-07-17)
Federated Learning for Commercial Image Sources (2025-07-17)
MUPAX: Multidimensional Problem Agnostic eXplainable AI (2025-07-17)
Simulate, Refocus and Ensemble: An Attention-Refocusing Scheme for Domain Generalization (2025-07-17)
GLAD: Generalizable Tuning for Vision-Language Models (2025-07-17)
MoTM: Towards a Foundation Model for Time Series Imputation based on Continuous Modeling (2025-07-17)