A DNN Inference Latency-aware GPU Power Management Scheme
2021 IEEE 3rd Eurasia Conference on IOT, Communication and Engineering (ECICE)
Abstract
Graphics Processing Units (GPUs) are widely used for deep learning training as well as inference due to their high processing speed and programmability. Modern GPUs dynamically adjust the clock frequency according to their power management scheme. However, under the default scheme, the clock frequency of a GPU is only determined by utilization rate while being blind to target latency SLO, leading ...
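The gap the abstract points out, that a utilization-driven clock policy is blind to a latency SLO, can be illustrated with a toy simulation. The sketch below is not the paper's scheme; it is a minimal, hypothetical comparison between a utilization-only governor and a simple SLO-feedback governor, with all clock steps, thresholds, and names invented for illustration.

```python
# Hypothetical sketch contrasting a utilization-only clock policy
# (the default-style scheme the abstract critiques) with a
# latency-SLO-aware feedback policy. All values are illustrative.

CLOCKS_MHZ = [600, 900, 1200, 1500]  # assumed supported GPU clock steps


def utilization_governor(util_pct):
    """Default-style policy: picks a clock from utilization alone,
    regardless of whether the workload is meeting its latency SLO."""
    if util_pct > 75:
        return CLOCKS_MHZ[-1]
    if util_pct > 50:
        return CLOCKS_MHZ[2]
    if util_pct > 25:
        return CLOCKS_MHZ[1]
    return CLOCKS_MHZ[0]


def slo_aware_governor(current_clock, observed_latency_ms, slo_ms, margin=0.9):
    """Feedback policy: raises the clock on an SLO violation and
    lowers it only while observed latency has ample slack."""
    i = CLOCKS_MHZ.index(current_clock)
    if observed_latency_ms > slo_ms and i < len(CLOCKS_MHZ) - 1:
        return CLOCKS_MHZ[i + 1]   # SLO violated: speed up
    if observed_latency_ms < margin * slo_ms and i > 0:
        return CLOCKS_MHZ[i - 1]   # comfortable slack: save power
    return current_clock           # near the SLO boundary: hold


# A busy GPU that is still well inside its SLO: the utilization-only
# policy runs at the top clock, while the SLO-aware policy steps down.
print(utilization_governor(80))                    # top clock
print(slo_aware_governor(1500, 8.0, slo_ms=20.0))  # one step down
```

The key design choice, echoing the abstract's argument, is that the second governor's input is the quantity the user actually cares about (inference latency versus its SLO), so it can trade away unneeded performance for power savings that a utilization-only policy cannot see.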
Keywords
GPU energy efficiency, DNN inference, Latency SLO