XMG-GPPIC: Efficient and Robust General-Purpose Processing-in-Cache with XOR-Majority-Graph

Chen Nie, Xianjue Cai, Chenyang Lv, Chen Huang, Weikang Qian, Zhezhi He

GLSVLSI '23: Proceedings of the Great Lakes Symposium on VLSI 2023 (2023)

Abstract
Recent advances in processing-in-cache (PIC) have enabled general-purpose, high-performance computation with bit-serial computing techniques. This performance relies on efficient hardware design and on the software stack (i.e., the Logic Compiler, LC) that converts a high-level function into compact PIC instructions to be executed. Since the XOR-Majority-Graph (XMG) is one of the most efficient forms for representing a Boolean function, designing the PIC around XMG can further improve performance. Thus, we propose an efficient and robust General-Purpose PIC using XMG, a.k.a. XMG-GPPIC, with designs in both hardware and software. For the hardware part, we propose a micro-architecture of XMG-GPPIC that supports XMG operations. To improve computing efficiency and robustness against non-ideal effects, we highlight our novel designs of inversion fusion and temperature compensation. For the software part, we develop XMG-LC for optimized compilation for GPPIC, which includes two main steps: synthesis and scheduling. In synthesis, we propose a multi-line reinforcement learning agent that searches for the optimal synthesis flow for the best end-to-end GPPIC performance. In scheduling, we minimize the memory footprint occupied by the computation and support inversion fusion for instruction reduction. Our design reduces the number of operations by 67.7% on average relative to a majority-inverter-graph-based prior work, and the average end-to-end energy-delay product is 50.2% and 13.1% lower than that of our XMG-based naïve and heuristically optimized baselines, respectively. At the system level, our design outperforms compute cache with an and-inverter-graph-based LC by 77% and 64% in terms of throughput and efficiency, respectively.
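As a rough illustration of why XOR and majority nodes suit bit-serial arithmetic, the sketch below adds many operands in parallel using only a 3-input XOR (the sum bit) and a 3-input majority (the carry bit) per bit position. It is not the XMG-GPPIC implementation: the function names, the bit-plane layout modeled with Python integers, and the lane count are all assumptions made for this example.

```python
# Hypothetical sketch: bit-serial addition built from XOR3 and MAJ only.
# Each "bit plane" is a Python int whose bit k belongs to lane k, which
# loosely mimics one memory row holding the same bit position of many operands.

def maj(a, b, c):
    """Bitwise 3-input majority: 1 where at least two inputs are 1."""
    return (a & b) | (a & c) | (b & c)

def xor3(a, b, c):
    """Bitwise 3-input XOR."""
    return a ^ b ^ c

def to_bit_planes(values, width):
    """Transpose a list of integers into `width` bit planes (LSB first)."""
    planes = []
    for i in range(width):
        plane = 0
        for lane, v in enumerate(values):
            plane |= ((v >> i) & 1) << lane
        planes.append(plane)
    return planes

def from_bit_planes(planes, lanes):
    """Inverse of to_bit_planes: rebuild one integer per lane."""
    return [
        sum(((p >> lane) & 1) << i for i, p in enumerate(planes))
        for lane in range(lanes)
    ]

def bit_serial_add(xs, ys, width):
    """Add xs[k] + ys[k] for every lane k, one bit position per step."""
    a = to_bit_planes(xs, width)
    b = to_bit_planes(ys, width)
    carry = 0
    out = []
    for i in range(width):
        out.append(xor3(a[i], b[i], carry))   # sum bit: one XOR3 node
        carry = maj(a[i], b[i], carry)        # carry bit: one MAJ node
    out.append(carry)                         # final carry-out plane
    return from_bit_planes(out, len(xs))

result = bit_serial_add([3, 5, 255], [4, 7, 1], width=8)
```

In this sketch a `width`-bit addition costs two graph nodes per bit position regardless of how many lanes are processed, which is the kind of operation count a logic compiler for such hardware would try to minimize.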
Keywords
In-Memory Computing, bit-serial, logic synthesis, SRAM