SDST: Accelerating GEMM-based Convolution through Smart Data Stream Transformation

2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid), 2022

Abstract
The development of flexible Convolutional Neural Network (CNN) accelerators is critical for large-scale inference and training. Accelerators based on the General Matrix Multiplication (GEMM) kernel have gained popularity due to their ability to accelerate the most prevalent convolutional and fully connected layers in CNNs. However, the convolution inputs must first be reshaped and packed into redundant matrices by the im2col (image to column) algorithm. As the performance of the GEMM kernel improves, this explicit data transformation accounts for a growing share of the latency and gradually becomes a bottleneck. To address this issue, we propose Smart Data Stream Transformation (SDST), a technique that eliminates explicit data transformation through data stream manipulation. SDST divides the input data into conflict-free streams based on the locality of data redundancy. Additionally, we design a continuity-friendly data layout to unify the transformations across data streams. Our design is evaluated by running the YoloV3-tiny model on an FPGA-based prototype system. Experimental results show that SDST improves the performance of convolution acceleration by a factor of 1.12 to 5.69 compared to explicit im2col performed on the CPU.
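For context, the sketch below is a minimal NumPy illustration of the explicit im2col transformation the abstract describes, not the paper's SDST technique or its FPGA implementation: each sliding window is copied into a column of a matrix, so overlapping windows duplicate input elements, and the convolution then reduces to a single GEMM. The function and variable names are illustrative only.

```python
import numpy as np

def im2col(x, kh, kw, stride=1):
    """Unpack sliding kh x kw windows of input x (C, H, W) into columns.

    Each column holds one receptive field, so overlapping windows duplicate
    input elements -- the redundant matrix that im2col materializes explicitly.
    """
    c, h, w = x.shape
    out_h = (h - kh) // stride + 1
    out_w = (w - kw) // stride + 1
    cols = np.empty((c * kh * kw, out_h * out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            patch = x[:, i * stride:i * stride + kh, j * stride:j * stride + kw]
            cols[:, i * out_w + j] = patch.ravel()
    return cols

def conv2d_as_gemm(x, weights, stride=1):
    """Convolution expressed as one GEMM: (M, C*kh*kw) @ (C*kh*kw, out_h*out_w)."""
    m, c, kh, kw = weights.shape
    cols = im2col(x, kh, kw, stride)
    out = weights.reshape(m, -1) @ cols          # the GEMM kernel
    out_h = (x.shape[1] - kh) // stride + 1
    out_w = (x.shape[2] - kw) // stride + 1
    return out.reshape(m, out_h, out_w)

# Example: a 3x3 convolution over a 16-channel 8x8 input with 32 filters
x = np.random.randn(16, 8, 8).astype(np.float32)
w = np.random.randn(32, 16, 3, 3).astype(np.float32)
y = conv2d_as_gemm(x, w)   # shape (32, 6, 6)
```

The cost this sketch makes visible is the motivation stated in the abstract: the im2col matrix inflates the input by roughly a factor of kh*kw, and building it becomes the bottleneck once the GEMM itself is fast, which is what SDST avoids by transforming the data streams on the fly instead of materializing the redundant matrix.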
Keywords
Convolutional Neural Network, Accelerator, Data Stream