TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/No-Reference Video Quality Assessment Using Space-Time Chips

No-Reference Video Quality Assessment Using Space-Time Chips

2020-08-23Video Quality AssessmentVisual Question Answering (VQA)
PaperPDFCode(official)

Abstract

We propose a new prototype model for no-reference video quality assessment (VQA) based on the natural statistics of space-time chips of videos. Space-time chips (ST-chips) are a new, quality-aware feature space which we define as space-time localized cuts of video data in directions that are determined by the local motion flow. We use parametrized distribution fits to the bandpass histograms of space-time chips to characterize quality, and show that the parameters from these models are affected by distortion and can hence be used to objectively predict the quality of videos. Our prototype method, which we call ChipQA-0, is agnostic to the types of distortion affecting the video, and is based on identifying and quantifying deviations from the expected statistics of natural, undistorted ST-chips in order to predict video quality. We train and test our resulting model on several large VQA databases and show that our model achieves high correlation against human judgments of video quality and is competitive with state-of-the-art models.

Results

TaskDatasetMetricValueModel
Video UnderstandingLIVE-ETRISRCC0.4028ChipQA-0
Video UnderstandingLIVE LivestreamSRCC0.7513ChipQA-0
Video Quality AssessmentLIVE-ETRISRCC0.4028ChipQA-0
Video Quality AssessmentLIVE LivestreamSRCC0.7513ChipQA-0
VideoLIVE-ETRISRCC0.4028ChipQA-0
VideoLIVE LivestreamSRCC0.7513ChipQA-0

Related Papers

VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning2025-07-17MGFFD-VLM: Multi-Granularity Prompt Learning for Face Forgery Detection with VLM2025-07-16Describe Anything Model for Visual Question Answering on Text-rich Images2025-07-16Evaluating Attribute Confusion in Fashion Text-to-Image Generation2025-07-09LinguaMark: Do Multimodal Models Speak Fairly? A Benchmark-Based Evaluation2025-07-09Decoupled Seg Tokens Make Stronger Reasoning Video Segmenter and Grounder2025-06-28Bridging Video Quality Scoring and Justification via Large Multimodal Models2025-06-26DrishtiKon: Multi-Granular Visual Grounding for Text-Rich Document Images2025-06-26