KunlunTVM: A Compilation Framework for Kunlun Chip Supporting Both Training and Inference

ACM Great Lakes Symposium on VLSI (GLSVLSI)(2022)

Abstract
With the rapid development of deep learning, training large neural network models demands huge amounts of computing power. Therefore, many accelerators have been designed to meet these performance requirements. Recently, a series of Kunlun chips has been released, which claim performance comparable to GPUs. However, there is no end-to-end compiler supporting both training and inference on Kunlun chips, leaving a large performance optimization space unexplored. This paper presents KunlunTVM, the first end-to-end compiler based on TVM that supports both training and inference tasks on the Kunlun chip. Experimental results show that KunlunTVM achieves up to 5x training performance improvement over PaddlePaddle, the existing framework supporting Kunlun chips. Notably, the proposed methods are general and extensible within the TVM framework for targeting different backends.