Dongjun Lee, Seokwon Song, Jihee Suh, Joonmyung Choi, Sanghyeok Lee, Hyunwoo J. Kim
In recent years, prompt tuning has proven effective in adapting pre-trained vision-language models to downstream tasks. These methods adapt the pre-trained models by introducing learnable prompts while keeping the pre-trained weights frozen. However, learnable prompts can shift the internal representations within the self-attention module, which may increase performance variance and hurt generalization, especially in data-deficient settings. To address these issues, we propose a novel approach, Read-only Prompt Optimization (RPO). RPO leverages masked attention to prevent the internal representation shift in the pre-trained model. Furthermore, to facilitate optimization, the read-only prompts are initialized from the special tokens of the pre-trained model. Our extensive experiments demonstrate that RPO outperforms CLIP and CoCoOp in base-to-new generalization and domain generalization while displaying better robustness. The proposed method also generalizes better in extremely data-deficient settings, while improving parameter efficiency and reducing computational overhead. Code is available at https://github.com/mlvlab/RPO.
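The read-only mechanism can be illustrated with a minimal NumPy sketch (single head, identity query/key/value projections; the names and shapes are illustrative assumptions, not the paper's implementation). Appended prompts attend to, and thus read from, the frozen token features, while an attention mask prevents the original tokens from attending to the prompts, so their representations are unchanged:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def read_only_attention(tokens, prompts):
    """Single-head self-attention over [tokens; prompts].

    Original tokens may attend only to original tokens, while the appended
    (read-only) prompts may attend to everything: the prompts read the
    frozen features without shifting them.
    """
    n, d = tokens.shape
    p = prompts.shape[0]
    x = np.concatenate([tokens, prompts], axis=0)   # (n + p, d)
    scores = x @ x.T / np.sqrt(d)                   # (n + p, n + p)
    mask = np.zeros((n + p, n + p), dtype=bool)
    mask[:n, n:] = True                             # tokens cannot see prompts
    scores = np.where(mask, -np.inf, scores)
    attn = softmax(scores, axis=-1)
    return attn @ x

rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))   # hypothetical frozen token features
prompts = rng.normal(size=(2, 8))  # learnable read-only prompts

out = read_only_attention(tokens, prompts)
no_prompt = read_only_attention(tokens, np.zeros((0, 8)))
# The original tokens' outputs are identical with or without the prompts.
assert np.allclose(out[:4], no_prompt)
```

Only the prompt rows of the output depend on the learnable prompts, which is what keeps the pre-trained model's internal representations intact during tuning.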
| Task | Dataset | Metric | Value (%) | Model |
|---|---|---|---|---|
| Prompt Engineering | Stanford Cars | Harmonic mean | 74.69 | RPO |
| Prompt Engineering | Oxford 102 Flower | Harmonic mean | 84.5 | RPO |
| Prompt Engineering | EuroSAT | Harmonic mean | 76.79 | RPO |
| Prompt Engineering | Oxford-IIIT Pet Dataset | Harmonic mean | 96.05 | RPO |
| Prompt Engineering | DTD | Harmonic mean | 68.61 | RPO |
| Prompt Engineering | UCF101 | Harmonic mean | 79.34 | RPO |
| Prompt Engineering | Food-101 | Harmonic mean | 90.58 | RPO |
| Prompt Engineering | Caltech-101 | Harmonic mean | 96.03 | RPO |
| Prompt Engineering | ImageNet | Harmonic mean | 74 | RPO |
| Prompt Engineering | FGVC-Aircraft | Harmonic mean | 35.7 | RPO |
| Prompt Engineering | SUN397 | Harmonic mean | 79.18 | RPO |
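The harmonic mean reported above is the standard base-to-new generalization metric, H = 2·base·new / (base + new), which balances accuracy on base (seen) classes against accuracy on new (unseen) classes. A one-line sketch, with hypothetical accuracies for illustration:

```python
def harmonic_mean(base_acc, new_acc):
    """Base-to-new harmonic mean of two accuracies (in %)."""
    return 2 * base_acc * new_acc / (base_acc + new_acc)

# Hypothetical base/new accuracies, for illustration only:
h = harmonic_mean(80.0, 70.0)
```

Because the harmonic mean is dominated by the smaller of the two values, a method cannot score well by overfitting base classes at the expense of new ones.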