Enhancing Perceptual Quality in Video Super-Resolution through Temporally-Consistent Detail Synthesis using Diffusion Models
arXiv (2023)
Abstract
In this paper, we address the problem of enhancing perceptual quality in
video super-resolution (VSR) using Diffusion Models (DMs) while ensuring
temporal consistency among frames. We present StableVSR, a VSR method based on
DMs that can significantly enhance the perceptual quality of upscaled videos by
synthesizing realistic and temporally-consistent details. We introduce the
Temporal Conditioning Module (TCM) into a pre-trained DM for single image
super-resolution to turn it into a VSR method. TCM uses the novel Temporal
Texture Guidance, which provides it with spatially-aligned and detail-rich
texture information synthesized in adjacent frames. This guides the generative
process of the current frame toward high-quality and temporally-consistent
results. In addition, we introduce the novel Frame-wise Bidirectional Sampling
strategy to encourage the use of information from past to future and
vice versa. This strategy improves the perceptual quality of the results and
the temporal consistency across frames. We demonstrate the effectiveness of
StableVSR in enhancing the perceptual quality of upscaled videos while
achieving better temporal consistency compared to existing state-of-the-art
methods for VSR. The project page is available at
https://github.com/claudiom4sir/StableVSR.
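The Frame-wise Bidirectional Sampling strategy described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`bidirectional_sampling`, `denoise_step`) and the exact alternation scheme are assumptions made for illustration. The idea shown is that, at each diffusion timestep, frames are denoised in alternating temporal order (past to future, then future to past), so each frame's update can be conditioned on an already-updated neighboring frame.

```python
# Hypothetical sketch of Frame-wise Bidirectional Sampling (not the authors' code).
# frames: list of per-frame latents; denoise_step(frame, neighbor, t) performs one
# reverse-diffusion step on a frame, optionally conditioned on a neighboring frame.

def bidirectional_sampling(frames, num_timesteps, denoise_step):
    n = len(frames)
    for t in reversed(range(num_timesteps)):  # diffusion timesteps T-1 .. 0
        # Alternate the traversal direction at every timestep.
        if (num_timesteps - 1 - t) % 2 == 0:
            order = range(n)                  # forward pass: past -> future
        else:
            order = range(n - 1, -1, -1)      # backward pass: future -> past
        updated = list(frames)
        prev = None
        for i in order:
            # Condition on the neighbor already updated at this timestep, if any.
            neighbor = updated[prev] if prev is not None else None
            updated[i] = denoise_step(frames[i], neighbor, t)
            prev = i
        frames = updated
    return frames
```

With a toy `denoise_step` that simply increments a scalar "latent", three frames run through two timesteps each receive two updates, one per pass direction.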