Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


UGC-VQA: Benchmarking Blind Video Quality Assessment for User Generated Content

Zhengzhong Tu, Yilin Wang, Neil Birkbeck, Balu Adsumilli, Alan C. Bovik

2020-05-29 · Benchmarking · feature selection · Video Quality Assessment · Visual Question Answering (VQA)
Paper · PDF · Code (official)

Abstract

Recent years have witnessed an explosion of user-generated content (UGC) videos shared and streamed over the Internet, thanks to the evolution of affordable and reliable consumer capture devices, and the tremendous popularity of social media platforms. Accordingly, there is a great need for accurate video quality assessment (VQA) models for UGC/consumer videos to monitor, control, and optimize this vast content. Blind quality prediction of in-the-wild videos is quite challenging, since the quality degradations of UGC content are unpredictable, complicated, and often commingled. Here we contribute to advancing the UGC-VQA problem by conducting a comprehensive evaluation of leading no-reference/blind VQA (BVQA) features and models on a fixed evaluation architecture, yielding new empirical insights on both subjective video quality studies and VQA model design. By employing a feature selection strategy on top of leading VQA model features, we are able to extract 60 of the 763 statistical features used by the leading models to create a new fusion-based BVQA model, which we dub the VIDeo quality EVALuator (VIDEVAL), that effectively balances the trade-off between VQA performance and efficiency. Our experimental results show that VIDEVAL achieves state-of-the-art performance at considerably lower computational cost than other leading models. Our study protocol also defines a reliable benchmark for the UGC-VQA problem, which we believe will facilitate further research on deep learning-based VQA modeling, as well as perceptually-optimized efficient UGC video processing, transcoding, and streaming. To promote reproducible research and public evaluation, an implementation of VIDEVAL has been made available online: https://github.com/tu184044109/VIDEVAL_release
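The fusion approach the abstract describes — selecting a compact subset (60 of 763) of pooled BVQA statistical features and regressing them onto quality scores — can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the random data, the univariate `SelectKBest` selector, and the SVR regressor are all assumptions standing in for the paper's actual feature-selection strategy and learner.

```python
# Illustrative sketch (NOT the VIDEVAL code): pick a compact feature subset
# from a large pool of BVQA statistical features, then fuse them with a
# regressor, mirroring the "60 of 763 features" idea from the abstract.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 763))   # stand-in for 763 pooled BVQA features
y = rng.uniform(1, 5, size=200)   # stand-in for subjective quality scores (MOS)

# Keep the 60 features most predictive of quality, then regress with an SVR.
model = make_pipeline(
    StandardScaler(),
    SelectKBest(f_regression, k=60),
    SVR(kernel="rbf"),
)
model.fit(X, y)
pred = model.predict(X)
print(pred.shape)
```

The pipeline keeps selection inside the fitted estimator, so the same 60-feature mask learned at training time is applied at prediction time.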

Results

Task | Dataset | Metric | Value | Model
Video Understanding | MSU NR VQA Database | KLCC | 0.5414 | VIDEVAL
Video Understanding | MSU NR VQA Database | PLCC | 0.7717 | VIDEVAL
Video Understanding | MSU NR VQA Database | SRCC | 0.7286 | VIDEVAL
Video Understanding | LIVE-VQC | PLCC | 0.7514 | VIDEVAL
Video Understanding | YouTube-UGC | PLCC | 0.7733 | VIDEVAL
Video Understanding | KoNViD-1k | PLCC | 0.7803 | VIDEVAL
Video Understanding | LIVE-FB LSVQ | PLCC | 0.783 | VIDEVAL
Video Quality Assessment | MSU NR VQA Database | KLCC | 0.5414 | VIDEVAL
Video Quality Assessment | MSU NR VQA Database | PLCC | 0.7717 | VIDEVAL
Video Quality Assessment | MSU NR VQA Database | SRCC | 0.7286 | VIDEVAL
Video Quality Assessment | LIVE-VQC | PLCC | 0.7514 | VIDEVAL
Video Quality Assessment | YouTube-UGC | PLCC | 0.7733 | VIDEVAL
Video Quality Assessment | KoNViD-1k | PLCC | 0.7803 | VIDEVAL
Video Quality Assessment | LIVE-FB LSVQ | PLCC | 0.783 | VIDEVAL
Video | MSU NR VQA Database | KLCC | 0.5414 | VIDEVAL
Video | MSU NR VQA Database | PLCC | 0.7717 | VIDEVAL
Video | MSU NR VQA Database | SRCC | 0.7286 | VIDEVAL
Video | LIVE-VQC | PLCC | 0.7514 | VIDEVAL
Video | YouTube-UGC | PLCC | 0.7733 | VIDEVAL
Video | KoNViD-1k | PLCC | 0.7803 | VIDEVAL
Video | LIVE-FB LSVQ | PLCC | 0.783 | VIDEVAL
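The metrics reported above are standard rank and linear correlations between model predictions and subjective scores: PLCC (Pearson linear correlation), SRCC (Spearman rank correlation), and KLCC (Kendall rank correlation). A minimal sketch of how they are computed with SciPy, using made-up illustrative scores rather than any dataset from the paper:

```python
# Minimal sketch of the correlation metrics in the results table.
# The score lists below are illustrative, not data from the paper.
from scipy.stats import kendalltau, pearsonr, spearmanr

mos       = [2.1, 3.4, 1.8, 4.2, 3.9, 2.7]  # subjective mean opinion scores
predicted = [2.0, 3.1, 2.2, 4.0, 3.5, 2.9]  # model quality predictions

plcc, _ = pearsonr(mos, predicted)    # linear agreement
srcc, _ = spearmanr(mos, predicted)   # monotonic (rank) agreement
klcc, _ = kendalltau(mos, predicted)  # pairwise ordering agreement
print(f"PLCC={plcc:.3f} SRCC={srcc:.3f} KLCC={klcc:.3f}")
```

SRCC and KLCC depend only on the ranking of predictions, which is why they are often preferred for comparing BVQA models whose raw outputs live on different scales; PLCC additionally rewards a linear fit to the subjective scores.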

Related Papers

- Visual Place Recognition for Large-Scale UAV Applications (2025-07-20)
- Training Transformers with Enforced Lipschitz Constants (2025-07-17)
- Disentangling coincident cell events using deep transfer learning and compressive sensing (2025-07-17)
- MUPAX: Multidimensional Problem Agnostic eXplainable AI (2025-07-17)
- mNARX+: A surrogate model for complex dynamical systems using manifold-NARX and automatic feature selection (2025-07-17)
- VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning (2025-07-17)
- DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition (2025-07-16)
- MGFFD-VLM: Multi-Granularity Prompt Learning for Face Forgery Detection with VLM (2025-07-16)