Stereoscopic video quality measurement with fine-tuning 3D ResNets

MULTIMEDIA TOOLS AND APPLICATIONS (2022)

Abstract
Recently, Convolutional Neural Networks with 3D kernels (3D CNNs) have clearly outperformed 2D CNNs in video processing applications. In the field of Stereoscopic Video Quality Assessment (SVQA), 3D CNNs are used to extract spatio-temporal features from stereoscopic video. In addition, the emergence of large video datasets such as Kinetics has made it possible to use pre-trained 3D CNNs in other video-related fields. In this paper, we fine-tune 3D Residual Networks (3D ResNets) pre-trained on the Kinetics dataset to measure the quality of stereoscopic videos, and we propose a no-reference SVQA method. Our aim is twofold. First, we investigate whether 3D CNNs can serve as quality-aware feature extractors for stereoscopic videos. Second, we explore which ResNet architecture is most appropriate for SVQA. Experimental results on two publicly available SVQA datasets, LFOVIAS3DPh2 and NAMA3DS1-COSPAD1, show the effectiveness of the proposed transfer learning-based method, which achieves an RMSE of 0.332 on the LFOVIAS3DPh2 dataset. The results also show that deeper 3D ResNet models extract more effective quality-aware features.
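The RMSE figure quoted in the abstract is the root-mean-square error between predicted quality scores and subjective scores. As a minimal sketch of that metric, assuming plain Python lists of scores (the function name `rmse` and the sample values are hypothetical and illustrative only, not taken from the paper or its datasets):

```python
import math


def rmse(predicted, subjective):
    """Root-mean-square error between predicted and subjective quality scores."""
    if len(predicted) != len(subjective) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    # Average the squared per-video errors, then take the square root.
    squared_errors = [(p - s) ** 2 for p, s in zip(predicted, subjective)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative values only (assumption): four videos with made-up scores.
predicted_scores = [3.0, 4.0, 2.0, 5.0]
subjective_scores = [3.5, 4.0, 2.5, 4.0]
print(rmse(predicted_scores, subjective_scores))
```

A lower value means the predicted scores track the subjective scores more closely; the value depends on the score scale of the dataset being evaluated.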
Keywords
3D convolutional neural networks, Fine-tuning, Objective quality assessment, Pre-training, Stereoscopic video, Transfer learning