Weixia Zhang, Kede Ma, Jia Yan, Dexiang Deng, Zhou Wang
We propose a deep bilinear model for blind image quality assessment (BIQA) that handles both synthetic and authentic distortions. Our model consists of two convolutional neural networks (CNNs), each of which specializes in one distortion scenario. For synthetic distortions, we pre-train a CNN to classify image distortion type and level, a task for which large-scale training data are available. For authentic distortions, we adopt a CNN pre-trained for image classification. The features from the two CNNs are pooled bilinearly into a unified representation for final quality prediction. We then fine-tune the entire model on target subject-rated databases using a variant of stochastic gradient descent. Extensive experiments demonstrate that the proposed model achieves superior performance on both synthetic and authentic databases. Furthermore, we verify the generalizability of our method on the Waterloo Exploration Database using the group maximum differentiation competition.
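The bilinear pooling step mentioned in the abstract can be illustrated with a minimal NumPy sketch. It assumes the two CNNs produce feature maps with the same spatial size, and it averages the outer product of the two feature vectors over all spatial locations. The function name `bilinear_pool`, the averaging over locations, and the signed square root plus L2 normalization are illustrative assumptions borrowed from common bilinear-CNN practice, not details confirmed by the abstract.

```python
import numpy as np


def bilinear_pool(feat_a, feat_b):
    """Pool two feature maps bilinearly into one vector.

    feat_a: array of shape (C1, H, W) from the first CNN (hypothetical input).
    feat_b: array of shape (C2, H, W) from the second CNN (hypothetical input).
    Returns a vector of length C1 * C2.
    """
    c1, h, w = feat_a.shape
    c2, h2, w2 = feat_b.shape
    if (h, w) != (h2, w2):
        raise ValueError("feature maps must share the same spatial size")

    # Flatten spatial dimensions: each column is one spatial location.
    a = feat_a.reshape(c1, h * w)
    b = feat_b.reshape(c2, h * w)

    # Average over locations of the outer product a_l b_l^T, shape (C1, C2).
    pooled = a @ b.T / (h * w)
    vec = pooled.reshape(-1)

    # Assumed post-processing: signed square root, then L2 normalization.
    vec = np.sign(vec) * np.sqrt(np.abs(vec))
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fa = rng.standard_normal((4, 3, 3))
    fb = rng.standard_normal((5, 3, 3))
    print(bilinear_pool(fa, fb).shape)
```

In a full model, a vector like this would feed a final regression layer that outputs the quality score; that layer is omitted here.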
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Video Understanding | MSU NR VQA Database | KLCC | 0.775 | DBCNN |
| Video Understanding | MSU NR VQA Database | PLCC | 0.9222 | DBCNN |
| Video Understanding | MSU NR VQA Database | SRCC | 0.922 | DBCNN |
| Video Understanding | MSU SR-QA Dataset | KLCC | 0.55139 | DBCNN |
| Video Understanding | MSU SR-QA Dataset | PLCC | 0.63971 | DBCNN |
| Video Understanding | MSU SR-QA Dataset | SROCC | 0.68621 | DBCNN |
| Video Quality Assessment | MSU NR VQA Database | KLCC | 0.775 | DBCNN |
| Video Quality Assessment | MSU NR VQA Database | PLCC | 0.9222 | DBCNN |
| Video Quality Assessment | MSU NR VQA Database | SRCC | 0.922 | DBCNN |
| Video Quality Assessment | MSU SR-QA Dataset | KLCC | 0.55139 | DBCNN |
| Video Quality Assessment | MSU SR-QA Dataset | PLCC | 0.63971 | DBCNN |
| Video Quality Assessment | MSU SR-QA Dataset | SROCC | 0.68621 | DBCNN |
| Image Quality Assessment | KADID-10k | PLCC | 0.856 | DB-CNN |
| Image Quality Assessment | KADID-10k | SRCC | 0.851 | DB-CNN |
| Image Quality Assessment | TID2013 | PLCC | 0.865 | DB-CNN |
| Image Quality Assessment | TID2013 | SRCC | 0.816 | DB-CNN |
| Image Quality Assessment | CSIQ | PLCC | 0.959 | DB-CNN |
| Image Quality Assessment | CSIQ | SRCC | 0.946 | DB-CNN |
| Video | MSU NR VQA Database | KLCC | 0.775 | DBCNN |
| Video | MSU NR VQA Database | PLCC | 0.9222 | DBCNN |
| Video | MSU NR VQA Database | SRCC | 0.922 | DBCNN |
| Video | MSU SR-QA Dataset | KLCC | 0.55139 | DBCNN |
| Video | MSU SR-QA Dataset | PLCC | 0.63971 | DBCNN |
| Video | MSU SR-QA Dataset | SROCC | 0.68621 | DBCNN |
| No-Reference Image Quality Assessment | KADID-10k | PLCC | 0.856 | DB-CNN |
| No-Reference Image Quality Assessment | KADID-10k | SRCC | 0.851 | DB-CNN |
| No-Reference Image Quality Assessment | TID2013 | PLCC | 0.865 | DB-CNN |
| No-Reference Image Quality Assessment | TID2013 | SRCC | 0.816 | DB-CNN |
| No-Reference Image Quality Assessment | CSIQ | PLCC | 0.959 | DB-CNN |
| No-Reference Image Quality Assessment | CSIQ | SRCC | 0.946 | DB-CNN |
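The metrics in the table are correlation coefficients between predicted quality scores and subjective ratings: PLCC (Pearson linear correlation), SRCC or SROCC (Spearman rank-order correlation), and KLCC (Kendall rank correlation). The sketch below shows one way to compute them with NumPy alone. The function names are hypothetical, the rank helper assumes no tied values, and the Kendall variant is tau-a; the benchmarks above may use different tie handling or a score mapping before computing PLCC.

```python
import numpy as np


def plcc(pred, mos):
    """Pearson linear correlation between predictions and subjective scores."""
    return float(np.corrcoef(pred, mos)[0, 1])


def _ranks(x):
    """Ordinal ranks (0-based); assumes there are no tied values."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x)
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(len(x))
    return ranks


def srcc(pred, mos):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return plcc(_ranks(pred), _ranks(mos))


def klcc(pred, mos):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs."""
    pred = np.asarray(pred, dtype=float)
    mos = np.asarray(mos, dtype=float)
    n = len(pred)
    # Sign of every pairwise difference; the diagonal contributes zero.
    dp = np.sign(pred[:, None] - pred[None, :])
    dm = np.sign(mos[:, None] - mos[None, :])
    # Each unordered pair is counted twice, hence n * (n - 1) in the divisor.
    return float(np.sum(dp * dm) / (n * (n - 1)))


if __name__ == "__main__":
    predicted = [1.0, 2.0, 3.0, 4.0]
    subjective = [1.0, 3.0, 2.0, 4.0]
    print(plcc(predicted, subjective), srcc(predicted, subjective), klcc(predicted, subjective))
```

Rank-based metrics (SRCC, KLCC) measure only monotonic agreement, while PLCC is also sensitive to how linear the relationship is, which is why the three values for one model can differ noticeably.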