Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Consistency-guided Prompt Learning for Vision-Language Models

Shuvendu Roy, Ali Etemad

2023-06-01 · Few-Shot Learning · Prompt Engineering · Domain Generalization

Abstract

We propose Consistency-guided Prompt learning (CoPrompt), a new fine-tuning method for vision-language models. Our approach improves the generalization of large foundation models when fine-tuned on downstream tasks in a few-shot setting. The basic idea of CoPrompt is to enforce a consistency constraint between the predictions of the trainable and pre-trained models to prevent overfitting on the downstream task. Additionally, we introduce two components into our consistency constraint to further boost performance: enforcing consistency on two perturbed inputs, and combining two dominant tuning paradigms, prompting and adapters. Enforcing consistency on perturbed inputs further regularizes the consistency constraint, thereby improving generalization. Moreover, the integration of adapters and prompts not only enhances performance on downstream tasks but also offers increased tuning flexibility in both the input and output spaces. This facilitates more effective adaptation to downstream tasks in a few-shot learning setting. Experiments show that CoPrompt outperforms existing methods on a range of evaluation suites, including base-to-novel generalization, domain generalization, and cross-dataset evaluation. On generalization, CoPrompt improves the state of the art on zero-shot tasks and the overall harmonic mean over 11 datasets. Detailed ablation studies show the effectiveness of each component of CoPrompt. We make our code available at https://github.com/ShuvenduRoy/CoPrompt.
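The consistency constraint described in the abstract can be illustrated with a minimal sketch: keep the trainable (prompted/adapted) encoders' embeddings close to those of the frozen pre-trained encoders, with the two branches optionally fed differently perturbed views of the input. The function names and the choice of cosine distance here are illustrative assumptions, not the authors' exact implementation (which is in the linked repository).

```python
import math

def cosine_distance(u, v):
    """1 minus cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def consistency_loss(frozen_img_emb, tuned_img_emb,
                     frozen_txt_emb, tuned_txt_emb, lam=1.0):
    """Hypothetical sketch of a CoPrompt-style consistency term.

    frozen_*_emb: embeddings from the frozen pre-trained encoders
                  (given one perturbed view of the input).
    tuned_*_emb:  embeddings from the trainable encoders with prompts
                  and adapters (given a second perturbed view).
    lam:          weight against the standard cross-entropy loss,
                  which is not shown here.
    """
    return lam * (cosine_distance(frozen_img_emb, tuned_img_emb)
                  + cosine_distance(frozen_txt_emb, tuned_txt_emb))
```

When the trainable branch drifts away from the pre-trained branch, the term grows; identical embeddings give a loss of zero, so the constraint penalizes overfitting without freezing the model outright.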

Results

Task               | Dataset                 | Metric             | Value | Model
-------------------|-------------------------|--------------------|-------|---------
Prompt Engineering | ImageNet-R              | Top-1 accuracy (%) | 77.51 | CoPrompt
Prompt Engineering | Stanford Cars           | Harmonic mean      | 75.66 | CoPrompt
Prompt Engineering | Oxford 102 Flower       | Harmonic mean      | 85.71 | CoPrompt
Prompt Engineering | EuroSAT                 | Harmonic mean      | 85.84 | CoPrompt
Prompt Engineering | Oxford-IIIT Pet Dataset | Harmonic mean      | 96.87 | CoPrompt
Prompt Engineering | ImageNet-S              | Top-1 accuracy (%) | 49.43 | CoPrompt
Prompt Engineering | DTD                     | Harmonic mean      | 72.79 | CoPrompt
Prompt Engineering | UCF101                  | Harmonic mean      | 83.07 | CoPrompt
Prompt Engineering | Food-101                | Harmonic mean      | 91.4  | CoPrompt
Prompt Engineering | Caltech-101             | Harmonic mean      | 96.55 | CoPrompt
Prompt Engineering | ImageNet                | Harmonic mean      | 74.33 | CoPrompt
Prompt Engineering | FGVC-Aircraft           | Harmonic mean      | 39.76 | CoPrompt
Prompt Engineering | SUN397                  | Harmonic mean      | 81.31 | CoPrompt
Prompt Engineering | ImageNet-A              | Top-1 accuracy (%) | 50.5  | CoPrompt

Related Papers

GLAD: Generalizable Tuning for Vision-Language Models (2025-07-17)
Leveraging Language Prior for Infrared Small Target Detection (2025-07-17)
Emotional Support with LLM-based Empathetic Dialogue Generation (2025-07-17)
Simulate, Refocus and Ensemble: An Attention-Refocusing Scheme for Domain Generalization (2025-07-17)
MoTM: Towards a Foundation Model for Time Series Imputation based on Continuous Modeling (2025-07-17)
InstructFLIP: Exploring Unified Vision-Language Model for Face Anti-spoofing (2025-07-16)
Prompt Engineering in Segment Anything Model: Methodologies, Applications, and Emerging Challenges (2025-07-13)
From Physics to Foundation Models: A Review of AI-Driven Quantitative Remote Sensing Inversion (2025-07-11)