Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Soft Prompt Generation for Domain Generalization

Shuanghao Bai, Yuedi Zhang, Wanqi Zhou, Zhirong Luan, Badong Chen

2024-04-30 · Domain Generalization
Paper · PDF · Code (official)

Abstract

Large pre-trained vision-language models (VLMs) have shown impressive zero-shot ability on downstream tasks with manually designed prompts. To further adapt VLMs to downstream tasks, soft prompts have been proposed to replace manually designed prompts; they are fine-tuned on data from a specific domain. Prior prompt learning methods primarily learn a fixed prompt or a residual prompt from training samples. However, the learned prompts lack diversity and ignore information about unseen domains. In this paper, we reframe the prompt learning framework from a generative perspective and propose a simple yet efficient method for the Domain Generalization (DG) task, namely Soft Prompt Generation (SPG). Specifically, SPG consists of a two-stage training phase and an inference phase. During the training phase, we introduce a soft prompt label for each domain, aiming to incorporate domain knowledge into the generative model. During the inference phase, the generator of the generative model is employed to obtain instance-specific soft prompts for the unseen target domain. Extensive experiments on five domain generalization benchmarks across three DG tasks demonstrate that SPG achieves state-of-the-art performance. The code is available at https://github.com/renytek13/Soft-Prompt-Generation-with-CGAN.
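The inference phase described above can be sketched as follows. This is a minimal, hypothetical NumPy illustration, not the paper's official CGAN implementation: the one-layer `generator`, the mean-pooled stand-in for CLIP's text encoder, and all dimensions (`EMBED_DIM`, `N_PROMPT_TOKENS`, `N_CLASSES`) are assumptions chosen for clarity. It shows only the data flow: a trained generator maps an image's features to an instance-specific soft prompt, which is combined with each class embedding to score classes by cosine similarity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper).
EMBED_DIM, N_PROMPT_TOKENS, N_CLASSES = 64, 4, 5

def generator(image_feat, W, b):
    """Toy one-layer generator: image feature -> instance-specific soft prompt tokens."""
    out = np.tanh(image_feat @ W + b)               # (N_PROMPT_TOKENS * EMBED_DIM,)
    return out.reshape(N_PROMPT_TOKENS, EMBED_DIM)  # one vector per prompt token

def classify(image_feat, soft_prompt, class_embeds):
    """Score each class by cosine similarity between the image feature and a
    text-side feature built from [soft prompt tokens; class embedding]."""
    scores = []
    for c in range(N_CLASSES):
        # Mean-pooling is a stand-in for the frozen CLIP text encoder.
        text_feat = np.vstack([soft_prompt, class_embeds[c]]).mean(axis=0)
        cos = image_feat @ text_feat / (
            np.linalg.norm(image_feat) * np.linalg.norm(text_feat)
        )
        scores.append(cos)
    return int(np.argmax(scores))

# Frozen toy parameters standing in for the trained generator and class embeddings.
W = rng.normal(size=(EMBED_DIM, N_PROMPT_TOKENS * EMBED_DIM)) * 0.1
b = np.zeros(N_PROMPT_TOKENS * EMBED_DIM)
class_embeds = rng.normal(size=(N_CLASSES, EMBED_DIM))

image_feat = rng.normal(size=EMBED_DIM)    # feature of an unseen-domain test image
soft_prompt = generator(image_feat, W, b)  # instance-specific soft prompt
pred = classify(image_feat, soft_prompt, class_embeds)
print(pred)
```

The key point the sketch preserves is that the prompt is not fixed: it is regenerated per test image, so unseen-domain inputs get prompts conditioned on their own features rather than prompts memorized from the source domains.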

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Domain Adaptation | PACS | Average Accuracy | 97 | SPG (CLIP, ViT-B/16) |
| Domain Adaptation | PACS | Average Accuracy | 92.8 | SPG (CLIP, ResNet-50) |
| Domain Adaptation | Office-Home | Average Accuracy | 83.6 | SPG (CLIP, ViT-B/16) |
| Domain Adaptation | Office-Home | Average Accuracy | 73.8 | SPG (CLIP, ResNet-50) |
| Domain Adaptation | DomainNet | Average Accuracy | 60.1 | SPG (CLIP, ViT-B/16) |
| Domain Adaptation | DomainNet | Average Accuracy | 50.1 | SPG (CLIP, ResNet-50) |
| Domain Adaptation | VLCS | Average Accuracy | 84 | SPG (CLIP, ResNet-50) |
| Domain Adaptation | VLCS | Average Accuracy | 82.4 | SPG (CLIP, ViT-B/16) |
| Domain Adaptation | TerraIncognita | Average Accuracy | 50.2 | SPG (CLIP, ViT-B/16) |
| Domain Generalization | PACS | Average Accuracy | 97 | SPG (CLIP, ViT-B/16) |
| Domain Generalization | PACS | Average Accuracy | 92.8 | SPG (CLIP, ResNet-50) |
| Domain Generalization | Office-Home | Average Accuracy | 83.6 | SPG (CLIP, ViT-B/16) |
| Domain Generalization | Office-Home | Average Accuracy | 73.8 | SPG (CLIP, ResNet-50) |
| Domain Generalization | DomainNet | Average Accuracy | 60.1 | SPG (CLIP, ViT-B/16) |
| Domain Generalization | DomainNet | Average Accuracy | 50.1 | SPG (CLIP, ResNet-50) |
| Domain Generalization | VLCS | Average Accuracy | 84 | SPG (CLIP, ResNet-50) |
| Domain Generalization | VLCS | Average Accuracy | 82.4 | SPG (CLIP, ViT-B/16) |
| Domain Generalization | TerraIncognita | Average Accuracy | 50.2 | SPG (CLIP, ViT-B/16) |

Related Papers

- Simulate, Refocus and Ensemble: An Attention-Refocusing Scheme for Domain Generalization (2025-07-17)
- GLAD: Generalizable Tuning for Vision-Language Models (2025-07-17)
- MoTM: Towards a Foundation Model for Time Series Imputation based on Continuous Modeling (2025-07-17)
- InstructFLIP: Exploring Unified Vision-Language Model for Face Anti-spoofing (2025-07-16)
- From Physics to Foundation Models: A Review of AI-Driven Quantitative Remote Sensing Inversion (2025-07-11)
- Feed-Forward SceneDINO for Unsupervised Semantic Scene Completion (2025-07-08)
- Prompt-Free Conditional Diffusion for Multi-object Image Augmentation (2025-07-08)
- Integrated Structural Prompt Learning for Vision-Language Models (2025-07-08)