Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

TOPIQ: A Top-down Approach from Semantics to Distortions for Image Quality Assessment

Chaofeng Chen, Jiadi Mo, Jingwen Hou, Haoning Wu, Liang Liao, Wenxiu Sun, Qiong Yan, Weisi Lin

2023-08-06 · Local Distortion · Video Quality Assessment · Image Quality Assessment · No-Reference Image Quality Assessment
Paper · PDF · Code (official)
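
The official code is distributed through the IQA-PyTorch package (pyiqa). A minimal scoring call, assuming pyiqa's create_metric entry point and the 'topiq_nr' registry name used by recent releases, would look roughly like this sketch:

```python
import torch
import pyiqa  # pip install pyiqa

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# 'topiq_nr' is the no-reference TOPIQ variant; 'topiq_fr' is the
# full-reference one. Both names assume the pyiqa model registry.
metric = pyiqa.create_metric('topiq_nr', device=device)

# Input: a 4D float tensor in [0, 1] with shape (N, 3, H, W);
# recent pyiqa versions also accept image file paths.
img = torch.rand(1, 3, 256, 256, device=device)
score = metric(img)
print(float(score))  # higher means better predicted quality
```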

Abstract

Image Quality Assessment (IQA) is a fundamental task in computer vision that has witnessed remarkable progress with deep neural networks. Inspired by the characteristics of the human visual system, existing methods typically use a combination of global and local representations (i.e., multi-scale features) to achieve superior performance. However, most of them adopt simple linear fusion of multi-scale features and neglect their possibly complex relationships and interactions. In contrast, humans typically first form a global impression to locate important regions and then focus on local details in those regions. We therefore propose a top-down approach, named TOPIQ, that uses high-level semantics to guide the IQA network to focus on semantically important local distortion regions. Our approach involves the design of a heuristic coarse-to-fine network (CFANet) that leverages multi-scale features and progressively propagates multi-level semantic information to low-level representations in a top-down manner. A key component of our approach is the proposed cross-scale attention mechanism, which calculates attention maps for lower-level features guided by higher-level features. This mechanism emphasizes active semantic regions for low-level distortions, thereby improving performance. CFANet can be used for both Full-Reference (FR) and No-Reference (NR) IQA. We use ResNet50 as its backbone and demonstrate that CFANet achieves better or competitive performance on most public FR and NR benchmarks compared with state-of-the-art methods based on vision transformers, while being much more efficient (with only ~13% of the FLOPs of the current best FR method). Codes are released at https://github.com/chaofengc/IQA-PyTorch.
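
The abstract's core idea, attention maps for lower-level features computed under the guidance of higher-level features, can be illustrated with a minimal PyTorch sketch. This is not the paper's CFANet: the module and parameter names (CrossScaleAttention, dim, num_heads) are illustrative, and the single high/low feature pair stands in for the full feature pyramid.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossScaleAttention(nn.Module):
    """Minimal sketch of cross-scale attention: queries come from
    high-level (semantic) features, keys/values from low-level
    (distortion-sensitive) features, so semantics decide which
    local regions the low-level representation should emphasize."""

    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_q = nn.LayerNorm(dim)
        self.norm_kv = nn.LayerNorm(dim)

    def forward(self, low_feat, high_feat):
        # low_feat:  (B, C, H, W)   fine-grained features
        # high_feat: (B, C, h, w)   coarser semantic features, h <= H
        B, C, H, W = low_feat.shape
        # Upsample the semantic map to the fine resolution so every
        # low-level location gets a semantic query.
        q = F.interpolate(high_feat, size=(H, W), mode='bilinear',
                          align_corners=False)
        q = q.flatten(2).transpose(1, 2)          # (B, H*W, C)
        kv = low_feat.flatten(2).transpose(1, 2)  # (B, H*W, C)
        out, attn_map = self.attn(self.norm_q(q),
                                  self.norm_kv(kv),
                                  self.norm_kv(kv))
        # Residual connection keeps the original low-level signal.
        out = out + kv
        return out.transpose(1, 2).reshape(B, C, H, W), attn_map

# Toy usage: 28x28 low-level features guided by 7x7 semantics.
low = torch.randn(2, 64, 28, 28)
high = torch.randn(2, 64, 7, 7)
block = CrossScaleAttention(dim=64)
fused, attn = block(low, high)
print(fused.shape)  # torch.Size([2, 64, 28, 28])
```

In the paper itself the guidance flows top-down through successive scales of a ResNet50 feature pyramid rather than a single pair of scales, but the guided-attention step is the piece this sketch isolates.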

Results

All results below are on the MSU SR-QA Dataset. The archive lists the same numbers under three task pages (Video Understanding, Video Quality Assessment, and Video), so the table gives each model's scores once.

Model | KLCC | PLCC | SROCC
TOPIQ trained on SPAQ (NR) | 0.5314 | 0.60905 | 0.64923
TOPIQ | 0.5067 | 0.57674 | 0.62715
TOPIQ FACE | 0.48428 | 0.58949 | 0.59564
TOPIQ | 0.46217 | 0.57955 | 0.57341
TOPIQ trained on PIPAL | 0.42811 | 0.57564 | 0.55568
TOPIQ (IAA) | 0.40663 | 0.51061 | 0.51687
TOPIQ + Res50 (IAA) | 0.28473 | 0.34 | 0.36204
TOPIQ trained on FLIVE | 0.26774 | 0.3394 | 0.34092
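
The three metrics in the table are standard correlations between predicted scores and subjective ratings: PLCC is the Pearson linear correlation coefficient, SROCC the Spearman rank-order correlation coefficient, and KLCC the Kendall rank correlation coefficient. A minimal sketch of computing them with scipy follows; the arrays are made-up placeholders, not numbers from the paper.

```python
import numpy as np
from scipy import stats

# Hypothetical predicted quality scores and subjective MOS labels.
pred = np.array([0.71, 0.42, 0.88, 0.55, 0.63])
mos = np.array([3.9, 2.1, 4.6, 3.0, 3.4])

plcc, _ = stats.pearsonr(pred, mos)    # linear correlation
srocc, _ = stats.spearmanr(pred, mos)  # rank-order correlation
klcc, _ = stats.kendalltau(pred, mos)  # pairwise rank agreement

print(f"PLCC={plcc:.4f}  SROCC={srocc:.4f}  KLCC={klcc:.4f}")
```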

Related Papers

Visual-Language Model Knowledge Distillation Method for Image Quality Assessment (2025-07-21)
Language Integration in Fine-Tuning Multimodal Large Language Models for Image-Based Regression (2025-07-20)
DeQA-Doc: Adapting DeQA-Score to Document Image Quality Assessment (2025-07-17)
Text-Visual Semantic Constrained AI-Generated Image Quality Assessment (2025-07-14)
4KAgent: Agentic Any Image to 4K Super-Resolution (2025-07-09)
Bridging Video Quality Scoring and Justification via Large Multimodal Models (2025-06-26)
FundaQ-8: A Clinically-Inspired Scoring Framework for Automated Fundus Image Quality Assessment (2025-06-25)
MS-IQA: A Multi-Scale Feature Fusion Network for PET/CT Image Quality Assessment (2025-06-25)