Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

ChipQA: No-Reference Video Quality Prediction via Space-Time Chips

Joshua P. Ebenezer, Zaixi Shang, Yongjun Wu, Hai Wei, Sriram Sethuraman, Alan C. Bovik

2021-09-17 · Video Quality Assessment · Visual Question Answering (VQA)
Paper · PDF · Code (official)

Abstract

We propose a new model for no-reference video quality assessment (VQA). Our approach uses a new idea of highly-localized space-time (ST) slices called Space-Time Chips (ST Chips). ST Chips are localized cuts of video data along directions that implicitly capture motion. We use perceptually-motivated bandpass and normalization models to first process the video data, and then select oriented ST Chips based on how closely they fit parametric models of natural video statistics. We show that the parameters that describe these statistics can be used to reliably predict the quality of videos, without the need for a reference video. The proposed method implicitly models ST video naturalness, and deviations from naturalness. We train and test our model on several large VQA databases, and show that our model achieves state-of-the-art performance at reduced cost, without requiring motion computation.
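The abstract follows a classic natural-scene-statistics recipe: bandpass/normalize the video data, fit a parametric model to the resulting coefficients, and use the fitted parameters as quality-aware features. The sketch below illustrates that general recipe on a single grayscale frame using MSCN normalization and moment-matched generalized-Gaussian fitting; it is not the authors' ST-Chip pipeline, and the function names, window size, and shape-parameter grid are assumptions made only for illustration.

# A minimal sketch of an NSS-style feature pipeline of the kind the abstract
# describes: normalize a frame, then fit a parametric (generalized Gaussian)
# model whose parameters serve as quality-aware features. Generic illustration,
# not the authors' ST-Chip construction.
import numpy as np
from scipy.ndimage import gaussian_filter
from scipy.special import gamma


def mscn(frame, sigma=7.0 / 6.0, eps=1.0):
    """Mean-subtracted contrast-normalized (MSCN) coefficients of a grayscale frame."""
    frame = frame.astype(np.float64)
    mu = gaussian_filter(frame, sigma)                      # local mean
    var = gaussian_filter(frame * frame, sigma) - mu * mu   # local variance
    sd = np.sqrt(np.maximum(var, 0.0))                      # local standard deviation
    return (frame - mu) / (sd + eps)


def fit_ggd(coeffs):
    """Estimate generalized Gaussian (shape, scale) by standard moment matching."""
    coeffs = coeffs.ravel()
    sigma_sq = np.mean(coeffs ** 2)
    e_abs = np.mean(np.abs(coeffs))
    rho = sigma_sq / (e_abs ** 2 + 1e-12)
    shapes = np.arange(0.2, 10.0, 0.001)
    # r(beta) = Gamma(1/beta) * Gamma(3/beta) / Gamma(2/beta)^2, matched to the sample rho
    r = gamma(1.0 / shapes) * gamma(3.0 / shapes) / gamma(2.0 / shapes) ** 2
    beta = shapes[np.argmin(np.abs(r - rho))]
    return beta, np.sqrt(sigma_sq)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.integers(0, 256, size=(128, 128)).astype(np.float64)
    beta, scale = fit_ggd(mscn(frame))
    print(f"GGD shape={beta:.3f}, scale={scale:.3f}")  # features a regressor could map to quality

In the full method, analogous statistics would be computed over oriented space-time chips rather than whole frames, and a regressor (e.g., an SVR) would map the resulting feature vector to a quality score.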

Results

Task                       Dataset           Metric  Value   Model
Video Understanding        LIVE-VQC          PLCC    0.7299  ChipQA
Video Understanding        YouTube-UGC       PLCC    0.6911  ChipQA
Video Understanding        LIVE-ETRI         SRCC    0.6323  ChipQA
Video Understanding        LIVE Livestream   SRCC    0.7575  ChipQA
Video Understanding        KoNViD-1k         PLCC    0.7625  ChipQA
Video Quality Assessment   LIVE-VQC          PLCC    0.7299  ChipQA
Video Quality Assessment   YouTube-UGC       PLCC    0.6911  ChipQA
Video Quality Assessment   LIVE-ETRI         SRCC    0.6323  ChipQA
Video Quality Assessment   LIVE Livestream   SRCC    0.7575  ChipQA
Video Quality Assessment   KoNViD-1k         PLCC    0.7625  ChipQA
Video                      LIVE-VQC          PLCC    0.7299  ChipQA
Video                      YouTube-UGC       PLCC    0.6911  ChipQA
Video                      LIVE-ETRI         SRCC    0.6323  ChipQA
Video                      LIVE Livestream   SRCC    0.7575  ChipQA
Video                      KoNViD-1k         PLCC    0.7625  ChipQA
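
The PLCC and SRCC values above are correlations between the model's predicted quality scores and subjective mean opinion scores (MOS) on each database. A minimal sketch of how these two metrics are typically computed follows; the score arrays are hypothetical placeholders, and in practice PLCC is often reported after a nonlinear logistic mapping of the predictions.

# PLCC = Pearson linear correlation, SRCC = Spearman rank-order correlation,
# both between predicted scores and MOS. The arrays here are hypothetical.
from scipy.stats import pearsonr, spearmanr

predicted = [41.2, 55.7, 62.3, 30.8, 73.1]   # model outputs (hypothetical)
mos       = [38.5, 58.0, 60.2, 33.4, 75.6]   # subjective ratings (hypothetical)

plcc, _ = pearsonr(predicted, mos)
srcc, _ = spearmanr(predicted, mos)
print(f"PLCC={plcc:.4f}, SRCC={srcc:.4f}")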

Related Papers

VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning (2025-07-17)
MGFFD-VLM: Multi-Granularity Prompt Learning for Face Forgery Detection with VLM (2025-07-16)
Describe Anything Model for Visual Question Answering on Text-rich Images (2025-07-16)
Evaluating Attribute Confusion in Fashion Text-to-Image Generation (2025-07-09)
LinguaMark: Do Multimodal Models Speak Fairly? A Benchmark-Based Evaluation (2025-07-09)
Decoupled Seg Tokens Make Stronger Reasoning Video Segmenter and Grounder (2025-06-28)
Bridging Video Quality Scoring and Justification via Large Multimodal Models (2025-06-26)
DrishtiKon: Multi-Granular Visual Grounding for Text-Rich Document Images (2025-06-26)