A DNN Inference Latency-aware GPU Power Management Scheme

2021 IEEE 3rd Eurasia Conference on IoT, Communication and Engineering (ECICE)

Abstract
Graphics Processing Units (GPUs) are widely used for deep learning training as well as inference due to their high processing speed and programmability. Modern GPUs dynamically adjust the clock frequency according to their power management scheme. However, under the default scheme, the clock frequency of a GPU is only determined by utilization rate while being blind to target latency SLO, leading ...
Keywords
GPU energy efficiency,DNN inference,Latency SLO
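The abstract contrasts the default utilization-driven clock policy with one that also considers a target latency SLO. As a rough illustration of that idea only, here is a minimal sketch of an SLO-aware frequency governor; the function names, the candidate frequency list, and the toy inverse-frequency latency model are all illustrative assumptions, not the paper's actual scheme.

```python
# Hypothetical sketch: an SLO-aware frequency governor, in contrast to a
# utilization-only policy. All names and the latency model are illustrative
# assumptions, not the scheme proposed in the paper.

def slo_aware_governor(freqs_mhz, slo_ms, measure_latency):
    """Pick the lowest clock frequency whose latency still meets the SLO."""
    for f in sorted(freqs_mhz):        # try low (power-saving) clocks first
        if measure_latency(f) <= slo_ms:
            return f
    return max(freqs_mhz)              # SLO unreachable: fall back to max clock

# Toy latency model: inference time scales inversely with clock frequency.
WORK_MS_AT_1GHZ = 30.0
model = lambda f_mhz: WORK_MS_AT_1GHZ * 1000.0 / f_mhz

chosen = slo_aware_governor([600, 900, 1200, 1500], slo_ms=25.0,
                            measure_latency=model)
print(chosen)  # lowest clock meeting the 25 ms SLO
```

A utilization-only policy would instead pick the clock from GPU busy percentage alone, which can over-clock (wasting power) when the SLO is slack, or under-clock when the SLO is tight, which is the mismatch the abstract points out.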