An Intelligent Framework for Oversubscription Management in CPU-GPU Unified Memory

J. Grid Comput. (2023)

Abstract
Unified virtual memory (UVM) improves GPU programmability by enabling on-demand data movement between CPU memory and GPU memory. However, because GPU device memory capacity is limited, oversubscription overhead becomes a major performance bottleneck for data-intensive workloads running on GPUs with UVM. This paper proposes a novel framework for UVM oversubscription management in discrete CPU-GPU systems. It consists of an access pattern classifier followed by a pattern-specific transformer-based model that uses a novel loss function aimed at reducing page thrashing. A policy engine uses the model's output to perform accurate page prefetching and eviction. On a set of 11 memory-intensive benchmarks, the proposed framework significantly outperforms the state-of-the-art (SOTA) method: under 125% memory oversubscription it reduces the number of pages thrashed by 64.4% compared to the baseline, whereas the SOTA method reduces it by 17.3%. Compared to the SOTA method, our solution achieves average IPC improvements of 1.52X and 3.66X under 125% and 150% memory oversubscription, respectively.
Keywords
Discrete CPU-GPU system, Unified virtual memory, Oversubscription, Deep learning
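
To make the quantities in the abstract concrete, the following is a minimal sketch of how page thrashing can be counted under a given oversubscription ratio. It is not the paper's framework: it assumes plain LRU eviction with demand paging and no prefetching, it assumes "125% oversubscription" means the working set is 1.25 times the device memory capacity, and it counts a thrash whenever a previously evicted page faults back in. The names `count_thrashed_pages` and `capacity_for` are hypothetical and exist only for illustration.

```python
from collections import OrderedDict


def capacity_for(working_set_pages, oversubscription):
    """Device capacity in pages, assuming oversubscription = working set / capacity."""
    return int(working_set_pages / oversubscription)


def count_thrashed_pages(accesses, capacity):
    """Replay a page-access trace under LRU eviction (an assumed policy).

    Returns the number of faults on pages that had already been evicted,
    which is the thrashing measure assumed in this sketch.
    """
    resident = OrderedDict()  # resident pages, least recently used first
    evicted = set()           # pages evicted at least once
    thrashed = 0
    for page in accesses:
        if page in resident:
            resident.move_to_end(page)  # hit: mark as most recently used
            continue
        if page in evicted:
            thrashed += 1               # fault on a page we threw out earlier
        if len(resident) >= capacity:
            victim, _ = resident.popitem(last=False)  # evict the LRU page
            evicted.add(victim)
        resident[page] = None
    return thrashed


if __name__ == "__main__":
    working_set = 100
    trace = list(range(working_set)) * 3  # a streaming pattern repeated three times
    for ratio in (1.0, 1.25, 1.5):
        cap = capacity_for(working_set, ratio)
        print(f"ratio={ratio} capacity={cap} thrashed={count_thrashed_pages(trace, cap)}")
```

With a repeated streaming trace, LRU evicts each page shortly before it is needed again, so every access after the first pass thrashes once the working set exceeds capacity. This is the kind of pattern-dependent behaviour that motivates classifying access patterns before choosing prefetch and eviction decisions.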