Exploring and Exploiting High-Order Spatial-Temporal Dynamics for Long-Term Frame Prediction.

IEEE Trans. Circuits Syst. Video Technol. (2024)

Long-term spatial-temporal frame prediction aims to predict future image frames precisely and has numerous real-world applications. Existing deep learning prediction models mainly rely on advanced neural network architectures to model complicated spatial-temporal features, but make little effort to explore high-order correlations that would better capture long-term dynamics. As a result, their long-term predictions suffer from inaccurate visual and motion details. In this article, we propose a high-order prediction model for long-term frame prediction, which improves appearance and motion details through dedicated high-order correlation modules in two aspects. First, to enhance the appearance details of predicted frames, we propose a high-order appearance encoder module, in which high-order appearance features are captured with a carefully designed Non-local ConvLSTM. Second, to guarantee the motion accuracy of predicted sequences, we design a high-order motion encoder module, which captures and preserves high-order motion patterns with adaptive motion extractors and progressive memory banks, respectively. Comprehensive experiments on six challenging real-world datasets demonstrate the effectiveness and superiority of the proposed method over state-of-the-art methods.
Keywords: Long-Term, Spatial-Temporal Frame Prediction, High-Order Appearance Features, High-Order Motion Patterns
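
The abstract does not specify how the Non-local ConvLSTM is built. As a rough illustration of the general idea behind a non-local operation (every spatial position attends to every other position, so pairwise correlations across the whole frame are captured), a minimal NumPy sketch follows. It is a generic embedded-Gaussian non-local block and not the authors' module: the function name `non_local_block`, the projection matrices `w_theta`, `w_phi`, `w_g`, `w_out`, and the tensor sizes are all assumptions made for illustration, and the recurrent ConvLSTM part is omitted entirely.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def non_local_block(feat, w_theta, w_phi, w_g, w_out):
    """Generic non-local operation on one feature map (hypothetical sketch).

    feat: array of shape (C, H, W)
    w_theta, w_phi, w_g: projection matrices of shape (C, D)
    w_out: projection matrix of shape (D, C)
    Returns the refined feature map (C, H, W) and the attention matrix (N, N),
    where N = H * W.
    """
    c, h, w = feat.shape
    x = feat.reshape(c, h * w).T             # (N, C): one row per position
    theta = x @ w_theta                      # (N, D) query embedding
    phi = x @ w_phi                          # (N, D) key embedding
    g = x @ w_g                              # (N, D) value embedding
    attn = softmax(theta @ phi.T, axis=-1)   # (N, N) pairwise affinities
    y = attn @ g                             # (N, D) aggregated context
    out = x + y @ w_out                      # residual connection, (N, C)
    return out.T.reshape(c, h, w), attn


# Illustrative sizes only; none of these values come from the paper.
rng = np.random.default_rng(0)
c, h, w, d = 8, 4, 4, 4
feat = rng.standard_normal((c, h, w))
w_theta, w_phi, w_g = (rng.standard_normal((c, d)) * 0.1 for _ in range(3))
w_out = rng.standard_normal((d, c)) * 0.1

out, attn = non_local_block(feat, w_theta, w_phi, w_g, w_out)
```

In this sketch the output keeps the input shape, and each row of `attn` sums to 1, so every position receives a weighted average of features from all positions in the frame. How such a block is combined with the ConvLSTM gates in the proposed model is not stated in the abstract.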