Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


ReLaX-VQA: Residual Fragment and Layer Stack Extraction for Enhancing Video Quality Assessment

Xinyi Wang, Angeliki Katsenou, David Bull

2024-07-16 · Video Compression · Optical Flow Estimation · Video Quality Assessment · Visual Question Answering (VQA)
Paper · PDF · Code (official)

Abstract

With the rapid growth of User-Generated Content (UGC) exchanged between users and sharing platforms, the need for video quality assessment in the wild is increasingly evident. UGC is typically acquired using consumer devices and undergoes multiple rounds of compression (transcoding) before reaching the end user. Therefore, traditional quality metrics that employ the original content as a reference are not suitable. In this paper, we propose ReLaX-VQA, a novel No-Reference Video Quality Assessment (NR-VQA) model that aims to address the challenges of evaluating the quality of diverse video content without reference to the original uncompressed videos. ReLaX-VQA uses frame differences to select spatio-temporal fragments intelligently together with different expressions of spatial features associated with the sampled frames. These are then used to better capture spatial and temporal variabilities in the quality of neighbouring frames. Furthermore, the model enhances abstraction by employing layer-stacking techniques in deep neural network features from Residual Networks and Vision Transformers. Extensive testing across four UGC datasets demonstrates that ReLaX-VQA consistently outperforms existing NR-VQA methods, achieving an average SRCC of 0.8658 and PLCC of 0.8873. Open-source code and trained models that will facilitate further research and applications of NR-VQA can be found at https://github.com/xinyiW915/ReLaX-VQA.
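The abstract describes selecting spatio-temporal fragments using frame differences: patches where consecutive frames differ most carry the strongest temporal quality signal. The sketch below is one plausible reading of that step, assuming grayscale frames and a simple residual-energy ranking; the function name, patch size, and scoring are illustrative, not the paper's actual implementation (see the official repository for that).

```python
import numpy as np

def sample_fragments(prev_frame, curr_frame, patch=32, k=4):
    """Illustrative sketch: return the k patches of curr_frame whose
    inter-frame residual (|curr - prev|) has the largest energy.
    A hypothetical stand-in for ReLaX-VQA's frame-difference-based
    fragment selection, not the official code."""
    diff = np.abs(curr_frame.astype(np.float32) - prev_frame.astype(np.float32))
    h, w = diff.shape
    scored = []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            # residual energy of this patch = sum of absolute frame differences
            scored.append((float(diff[y:y + patch, x:x + patch].sum()), y, x))
    scored.sort(reverse=True)  # highest-motion patches first
    return [curr_frame[y:y + patch, x:x + patch] for _, y, x in scored[:k]]
```

In the full model these sampled fragments would then be passed through the ResNet/ViT feature extractors described in the abstract.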

Results

Task                     | Dataset     | Metric | Value  | Model
Video Quality Assessment | LIVE-VQC    | PLCC   | 0.8876 | ReLaX-VQA (finetuned on LIVE-VQC)
Video Quality Assessment | LIVE-VQC    | PLCC   | 0.8242 | ReLaX-VQA (trained on LSVQ only)
Video Quality Assessment | LIVE-VQC    | PLCC   | 0.8079 | ReLaX-VQA
Video Quality Assessment | YouTube-UGC | PLCC   | 0.8652 | ReLaX-VQA (finetuned on YouTube-UGC)
Video Quality Assessment | YouTube-UGC | PLCC   | 0.8354 | ReLaX-VQA (trained on LSVQ only)
Video Quality Assessment | YouTube-UGC | PLCC   | 0.8204 | ReLaX-VQA
Video Quality Assessment | KoNViD-1k   | PLCC   | 0.8668 | ReLaX-VQA (finetuned on KoNViD-1k)
Video Quality Assessment | KoNViD-1k   | PLCC   | 0.8473 | ReLaX-VQA
Video Quality Assessment | KoNViD-1k   | PLCC   | 0.8427 | ReLaX-VQA (trained on LSVQ only)
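The table reports PLCC, and the abstract also cites SRCC; these are the standard agreement measures between predicted quality scores and mean opinion scores (MOS). A minimal numpy-only sketch of both (Spearman computed without tie correction, which is adequate for continuous scores):

```python
import numpy as np

def plcc(pred, mos):
    """Pearson Linear Correlation Coefficient between predictions and MOS."""
    pred, mos = np.asarray(pred, float), np.asarray(mos, float)
    return float(np.corrcoef(pred, mos)[0, 1])

def srcc(pred, mos):
    """Spearman Rank Correlation Coefficient: Pearson correlation of the
    rank orders. No tie correction (assumes distinct scores)."""
    rank = lambda a: np.argsort(np.argsort(np.asarray(a))).astype(float)
    return plcc(rank(pred), rank(mos))
```

PLCC rewards linear agreement with MOS, while SRCC rewards correct ranking of videos by quality even when the mapping is nonlinear; VQA papers typically report both, as the abstract's 0.8658 SRCC / 0.8873 PLCC averages do.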

Related Papers

Channel-wise Motion Features for Efficient Motion Segmentation (2025-07-17)
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning (2025-07-17)
MGFFD-VLM: Multi-Granularity Prompt Learning for Face Forgery Detection with VLM (2025-07-16)
Describe Anything Model for Visual Question Answering on Text-rich Images (2025-07-16)
An Efficient Approach for Muscle Segmentation and 3D Reconstruction Using Keypoint Tracking in MRI Scan (2025-07-11)
Evaluating Attribute Confusion in Fashion Text-to-Image Generation (2025-07-09)
LinguaMark: Do Multimodal Models Speak Fairly? A Benchmark-Based Evaluation (2025-07-09)
GSVR: 2D Gaussian-based Video Representation for 800+ FPS with Hybrid Deformation Field (2025-07-08)