Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Exploring CLIP for Assessing the Look and Feel of Images

Jianyi Wang, Kelvin C. K. Chan, Chen Change Loy

2022-07-25 · Video Quality Assessment · Image Quality Assessment · No-Reference Image Quality Assessment
Paper · PDF · Code (official)

Abstract

Measuring the perception of visual content is a long-standing problem in computer vision. Many mathematical models have been developed to evaluate the look or quality of an image. Despite the effectiveness of such tools in quantifying degradations such as noise and blurriness levels, such quantification is loosely coupled with human language. When it comes to more abstract perception about the feel of visual content, existing methods can only rely on supervised models that are explicitly trained with labeled data collected via laborious user studies. In this paper, we go beyond the conventional paradigms by exploring the rich visual language prior encapsulated in Contrastive Language-Image Pre-training (CLIP) models for assessing both the quality perception (look) and abstract perception (feel) of images in a zero-shot manner. In particular, we discuss effective prompt designs and show an effective prompt pairing strategy to harness the prior. We also provide extensive experiments on controlled datasets and Image Quality Assessment (IQA) benchmarks. Our results show that CLIP captures meaningful priors that generalize well to different perceptual assessments. Code is available at https://github.com/IceClear/CLIP-IQA.
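The prompt-pairing idea from the abstract can be sketched numerically: the image embedding is compared against a pair of antonym prompts (e.g. "Good photo." vs. "Bad photo."), and the quality score is the softmax probability assigned to the positive prompt. The sketch below is a minimal NumPy illustration, not the paper's implementation; it assumes cosine similarities have already been computed by CLIP's encoders, and the logit scale of 100 follows CLIP's usual convention.

```python
import numpy as np

def prompt_pair_score(sim_pos, sim_neg, logit_scale=100.0):
    """Softmax over a positive/negative prompt pair -> score in (0, 1).

    sim_pos / sim_neg are cosine similarities between the image
    embedding and the paired prompts (e.g. "Good photo." / "Bad photo.").
    """
    logits = logit_scale * np.array([sim_pos, sim_neg])
    logits -= logits.max()  # subtract max for numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return float(probs[0])  # probability of the positive prompt

# An image that aligns more with the positive prompt scores near 1:
print(prompt_pair_score(0.28, 0.22))
# ...and one that aligns more with the negative prompt scores near 0:
print(prompt_pair_score(0.20, 0.27))
```

Pairing opposing prompts and taking a softmax removes the need to calibrate a raw similarity threshold: only the relative alignment between the two prompts matters.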

Results

Task                                 | Dataset           | Metric | Value   | Model
-------------------------------------|-------------------|--------|---------|-------------------
Video Understanding                  | MSU SR-QA Dataset | KLCC   | 0.52628 | ClipIQA+ ResNet50
Video Understanding                  | MSU SR-QA Dataset | PLCC   | 0.65154 | ClipIQA+ ResNet50
Video Understanding                  | MSU SR-QA Dataset | SROCC  | 0.65713 | ClipIQA+ ResNet50
Video Understanding                  | MSU SR-QA Dataset | KLCC   | 0.49417 | ClipIQA
Video Understanding                  | MSU SR-QA Dataset | PLCC   | 0.58944 | ClipIQA
Video Understanding                  | MSU SR-QA Dataset | SROCC  | 0.60808 | ClipIQA
Video Understanding                  | MSU SR-QA Dataset | KLCC   | 0.69774 | ClipIQA+
Video Understanding                  | MSU SR-QA Dataset | PLCC   | 0.71808 | ClipIQA+
Video Understanding                  | MSU SR-QA Dataset | SROCC  | 0.56875 | ClipIQA+
Video Understanding                  | MSU SR-QA Dataset | KLCC   | 0.38794 | ClipIQA+ ViT-L-14
Video Understanding                  | MSU SR-QA Dataset | PLCC   | 0.50379 | ClipIQA+ ViT-L-14
Video Understanding                  | MSU SR-QA Dataset | SROCC  | 0.49881 | ClipIQA+ ViT-L-14
Video Quality Assessment             | MSU SR-QA Dataset | KLCC   | 0.52628 | ClipIQA+ ResNet50
Video Quality Assessment             | MSU SR-QA Dataset | PLCC   | 0.65154 | ClipIQA+ ResNet50
Video Quality Assessment             | MSU SR-QA Dataset | SROCC  | 0.65713 | ClipIQA+ ResNet50
Video Quality Assessment             | MSU SR-QA Dataset | KLCC   | 0.49417 | ClipIQA
Video Quality Assessment             | MSU SR-QA Dataset | PLCC   | 0.58944 | ClipIQA
Video Quality Assessment             | MSU SR-QA Dataset | SROCC  | 0.60808 | ClipIQA
Video Quality Assessment             | MSU SR-QA Dataset | KLCC   | 0.69774 | ClipIQA+
Video Quality Assessment             | MSU SR-QA Dataset | PLCC   | 0.71808 | ClipIQA+
Video Quality Assessment             | MSU SR-QA Dataset | SROCC  | 0.56875 | ClipIQA+
Video Quality Assessment             | MSU SR-QA Dataset | KLCC   | 0.38794 | ClipIQA+ ViT-L-14
Video Quality Assessment             | MSU SR-QA Dataset | PLCC   | 0.50379 | ClipIQA+ ViT-L-14
Video Quality Assessment             | MSU SR-QA Dataset | SROCC  | 0.49881 | ClipIQA+ ViT-L-14
Image Quality Assessment             | UHD-IQA           | PLCC   | 0.709   | CLIP-IQA+
Image Quality Assessment             | UHD-IQA           | SRCC   | 0.747   | CLIP-IQA+
Video                                | MSU SR-QA Dataset | KLCC   | 0.52628 | ClipIQA+ ResNet50
Video                                | MSU SR-QA Dataset | PLCC   | 0.65154 | ClipIQA+ ResNet50
Video                                | MSU SR-QA Dataset | SROCC  | 0.65713 | ClipIQA+ ResNet50
Video                                | MSU SR-QA Dataset | KLCC   | 0.49417 | ClipIQA
Video                                | MSU SR-QA Dataset | PLCC   | 0.58944 | ClipIQA
Video                                | MSU SR-QA Dataset | SROCC  | 0.60808 | ClipIQA
Video                                | MSU SR-QA Dataset | KLCC   | 0.69774 | ClipIQA+
Video                                | MSU SR-QA Dataset | PLCC   | 0.71808 | ClipIQA+
Video                                | MSU SR-QA Dataset | SROCC  | 0.56875 | ClipIQA+
Video                                | MSU SR-QA Dataset | KLCC   | 0.38794 | ClipIQA+ ViT-L-14
Video                                | MSU SR-QA Dataset | PLCC   | 0.50379 | ClipIQA+ ViT-L-14
Video                                | MSU SR-QA Dataset | SROCC  | 0.49881 | ClipIQA+ ViT-L-14
No-Reference Image Quality Assessment | UHD-IQA          | PLCC   | 0.709   | CLIP-IQA+
No-Reference Image Quality Assessment | UHD-IQA          | SRCC   | 0.747   | CLIP-IQA+

Related Papers

Visual-Language Model Knowledge Distillation Method for Image Quality Assessment (2025-07-21)
Language Integration in Fine-Tuning Multimodal Large Language Models for Image-Based Regression (2025-07-20)
DeQA-Doc: Adapting DeQA-Score to Document Image Quality Assessment (2025-07-17)
Text-Visual Semantic Constrained AI-Generated Image Quality Assessment (2025-07-14)
4KAgent: Agentic Any Image to 4K Super-Resolution (2025-07-09)
Bridging Video Quality Scoring and Justification via Large Multimodal Models (2025-06-26)
FundaQ-8: A Clinically-Inspired Scoring Framework for Automated Fundus Image Quality Assessment (2025-06-25)
MS-IQA: A Multi-Scale Feature Fusion Network for PET/CT Image Quality Assessment (2025-06-25)