HIPU: A Hybrid Intelligent Processing Unit With Fine-Grained ISA for Real-Time Deep Neural Network Inference Applications

IEEE Transactions on Very Large Scale Integration (VLSI) Systems (2023)

Abstract
Neural network algorithms have shown superior performance over conventional algorithms, leading to the design and deployment of dedicated accelerators in practical scenarios. Coarse-grained accelerators achieve high performance but support only a limited number of predesigned operators, which cannot cover the flexible operators emerging in modern neural network algorithms. Fine-grained accelerators, such as instruction set architecture (ISA)-based accelerators, have therefore become a hot research topic due to their flexibility to cover operators that were not predefined. The main challenges for fine-grained accelerators include the undesirably long delay of single-image inference when performing multibatch inference, as well as the difficulty of meeting real-time constraints when processing multiple tasks simultaneously. This article proposes a hybrid intelligent processing unit (HIPU) to address these problems. Specifically, we design a novel conversion-free data format, expand the single-instruction multiple-data (SIMD) instruction set, and optimize the microarchitecture to improve performance. We also arrange the inference schedule to guarantee scalability on multicores. The experimental results show that the proposed accelerator maintains high multiply–accumulate (MAC) utilization for all common operators and achieves a 4–7× speedup over an NVIDIA RTX 2080 Ti GPU. Finally, the proposed accelerator is manufactured in TSMC 28-nm technology, achieving 1 GHz for each core, with a peak performance of 13 TOPS.
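The headline figures in the abstract (1 GHz per core, 13 TOPS peak) imply a fixed ops-per-cycle budget across the chip. A minimal sketch of that arithmetic, noting that the core count and MAC-lane width are not given in the abstract (only the total peak throughput and clock are):

```python
# Illustrative back-of-the-envelope check, not from the paper:
# relate the reported 13 TOPS peak at 1 GHz to ops per cycle.

def peak_tops(ops_per_cycle: float, freq_ghz: float) -> float:
    """Peak throughput in TOPS = (ops/cycle) * (cycles/s) / 1e12."""
    return ops_per_cycle * freq_ghz * 1e9 / 1e12

# Working backward: 13 TOPS at 1 GHz implies 13,000 ops/cycle
# summed over all cores; counting one MAC as 2 ops (multiply + add),
# that corresponds to 6,500 MACs per cycle chip-wide.
implied_ops_per_cycle = 13e12 / (1.0 * 1e9)
print(implied_ops_per_cycle)        # → 13000.0
print(peak_tops(13000, 1.0))        # → 13.0
```

High MAC utilization, as claimed for all common operators, means the sustained ops/cycle stays close to this peak budget rather than stalling on data movement.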
Keywords
Network-on-chip (NoC), neural network (NN) inference accelerating, out-of-order (OoO) superscalar processor, reduced instruction set architecture