Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


ST-GREED: Space-Time Generalized Entropic Differences for Frame Rate Dependent Video Quality Prediction

Pavan C. Madhusudana, Neil Birkbeck, Yilin Wang, Balu Adsumilli, Alan C. Bovik

2020-10-26 · Video Quality Assessment · Visual Question Answering (VQA)

Paper · PDF · Code (official)

Abstract

We consider the problem of conducting frame rate dependent video quality assessment (VQA) on videos of diverse frame rates, including high frame rate (HFR) videos. More generally, we study how perceptual quality is affected by frame rate, and how frame rate and compression combine to affect perceived quality. We devise an objective VQA model called Space-Time GeneRalized Entropic Difference (GREED) which analyzes the statistics of spatial and temporal band-pass video coefficients. A generalized Gaussian distribution (GGD) is used to model band-pass responses, while entropy variations between reference and distorted videos under the GGD model are used to capture video quality variations arising from frame rate changes. The entropic differences are calculated across multiple temporal and spatial subbands, and merged using a learned regressor. We show through extensive experiments that GREED achieves state-of-the-art performance on the LIVE-YT-HFR Database when compared with existing VQA models. The features used in GREED are highly generalizable and obtain competitive performance even on standard, non-HFR VQA databases. The implementation of GREED has been made available online: https://github.com/pavancm/GREED
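The core quantity described in the abstract is a differential entropy computed under a generalized Gaussian distribution (GGD) fit to band-pass video coefficients, with entropy differences between reference and distorted subbands serving as quality features. The sketch below, a simplified illustration and not the authors' official implementation (see the linked GitHub repository for that), uses the closed-form GGD entropy; the function names and the per-subband parameter pairs are hypothetical:

```python
import math

def ggd_entropy(alpha: float, beta: float) -> float:
    """Differential entropy of a generalized Gaussian distribution with
    scale alpha and shape beta, whose density is
        f(x) = beta / (2 * alpha * Gamma(1/beta)) * exp(-(|x|/alpha)**beta).
    Closed form: h = 1/beta - log(beta / (2 * alpha * Gamma(1/beta)))."""
    return 1.0 / beta - math.log(beta / (2.0 * alpha * math.gamma(1.0 / beta)))

def entropic_differences(ref_params, dist_params):
    """Per-subband absolute entropy differences between reference and
    distorted band-pass coefficients, each summarized by a fitted
    (alpha, beta) pair. A learned regressor would then map these
    features to a quality score (regressor omitted here)."""
    return [abs(ggd_entropy(a_r, b_r) - ggd_entropy(a_d, b_d))
            for (a_r, b_r), (a_d, b_d) in zip(ref_params, dist_params)]

# Example: beta = 2 reduces the GGD to a Gaussian; with alpha = sqrt(2)
# (i.e. sigma = 1) the entropy equals 0.5 * log(2 * pi * e).
print(ggd_entropy(math.sqrt(2.0), 2.0))
print(entropic_differences([(1.0, 1.5), (0.8, 2.0)],
                           [(1.2, 1.5), (0.8, 2.0)]))
```

In the actual model these differences are computed across multiple spatial and temporal subbands (capturing frame-rate-dependent statistics) before being merged by the trained regressor.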

Results

Task                     | Dataset              | Metric | Value  | Model
-------------------------|----------------------|--------|--------|---------
Video Understanding      | LIVE-YT-HFR          | SRCC   | 0.8822 | ST-GREED
Video Understanding      | MSU FR VQA Database  | KLCC   | 0.5905 | ST-GREED
Video Understanding      | MSU FR VQA Database  | PLCC   | 0.8116 | ST-GREED
Video Understanding      | MSU FR VQA Database  | SRCC   | 0.7547 | ST-GREED
Video Quality Assessment | LIVE-YT-HFR          | SRCC   | 0.8822 | ST-GREED
Video Quality Assessment | MSU FR VQA Database  | KLCC   | 0.5905 | ST-GREED
Video Quality Assessment | MSU FR VQA Database  | PLCC   | 0.8116 | ST-GREED
Video Quality Assessment | MSU FR VQA Database  | SRCC   | 0.7547 | ST-GREED
Video                    | LIVE-YT-HFR          | SRCC   | 0.8822 | ST-GREED
Video                    | MSU FR VQA Database  | KLCC   | 0.5905 | ST-GREED
Video                    | MSU FR VQA Database  | PLCC   | 0.8116 | ST-GREED
Video                    | MSU FR VQA Database  | SRCC   | 0.7547 | ST-GREED

Related Papers

VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning (2025-07-17)
MGFFD-VLM: Multi-Granularity Prompt Learning for Face Forgery Detection with VLM (2025-07-16)
Describe Anything Model for Visual Question Answering on Text-rich Images (2025-07-16)
Evaluating Attribute Confusion in Fashion Text-to-Image Generation (2025-07-09)
LinguaMark: Do Multimodal Models Speak Fairly? A Benchmark-Based Evaluation (2025-07-09)
Decoupled Seg Tokens Make Stronger Reasoning Video Segmenter and Grounder (2025-06-28)
Bridging Video Quality Scoring and Justification via Large Multimodal Models (2025-06-26)
DrishtiKon: Multi-Granular Visual Grounding for Text-Rich Document Images (2025-06-26)