NeuroCool: Dynamic Thermal Management of 3D DRAM for Deep Neural Networks through Customized Prefetching

ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS(2024)

引用 0|浏览4
暂无评分
摘要
Deep neural network (DNN) implementations are typically characterized by huge datasets and concurrent computation, resulting in a demand for high memory bandwidth due to intensive data movement between processors and off-chip memory. Performing DNN inference on general-purpose cores/edge is gaining attraction to enhance user experience and reduce latency. The mismatch in the CPU and conventional DRAM speed leads to under-utilization of the compute capabilities, causing increased inference time. 3D DRAM is a promising solution to effectively fulfill the bandwidth requirement of high-throughput DNNs. However, due to high power density in stacked architectures, 3D DRAMs need dynamic thermal management (DTM), resulting in performance overhead due to memory-induced CPU throttling. We study the thermal impact of DNN applications running on a 3D DRAM system, and make a case for a memory temperature-aware customized prefetch mechanism to reduce DTM overheads and significantly improve performance. In our proposed NeuroCool DTM policy, we intelligently place either DRAM ranks or tiers in low power state, using the DNN layer characteristics and access rate. We establish the generalization of our approach through training and test datasets comprising diverse data points from widely used DNN applications. Experimental results on popular DNNs show that NeuroCool results in a average performance gain of 44% (as high as 52%) and memory energy improvement of 43% (as high as 69%) over general-purpose DTM policies.
更多
查看译文
关键词
3D DRAM,dynamic thermal management,customized prefetching
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要