Sparsity Exploration for Structured and Unstructured Weight Formations in CNN Architecture

2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2023)

Abstract
The Convolutional Neural Network (CNN) is a widely used technique for processing pixel data and visual images. However, large network sizes pose significant obstacles to the throughput and energy efficiency of the underlying hardware. Various methods have been devised to improve the efficiency and computing speed of CNN acceleration while preserving the accuracy of the CNN process. On the algorithmic side, compression techniques reduce the number of parameters in each layer, making the detection process more efficient. Pruning is a compression technique that removes neurons and synapses that do not contribute substantially to the detection result, minimizing their impact on the network layers. Compression yields sparse data, in which a significant fraction of the values are zero. Different pruning techniques produce different kinds of sparsity, notably structured and unstructured sparse data, and each resulting sparse-data structure has its own advantages and drawbacks. Based on an analysis of existing sparse data structures, a novel accelerator has been developed to perform convolution operations on the various types of sparse data obtained through these distinct pruning procedures. In the architecture described in this study, all zero-valued weights are removed by pruning after training. The accelerator then processes only the non-zero weights, performing the convolution directly on the streamed "weight stream" data. Design tests show that using pruned sparse data yields faster output during the convolution process. The experiments used diverse types of sparse weight data, in both structured and unstructured forms.
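The abstract distinguishes structured from unstructured sparsity but does not give the pruning criteria used. As a minimal illustration of the difference, the sketch below applies magnitude-based pruning two ways: zeroing individual small weights (unstructured) versus zeroing whole output filters by L1 norm (structured). The function names and the filter-level L1 criterion are my assumptions, not the paper's method.

```python
import numpy as np

def prune_unstructured(weights, sparsity):
    """Zero out the smallest-magnitude individual weights anywhere in the
    tensor (unstructured sparsity: zeros scattered at arbitrary positions)."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    pruned = weights.copy()
    if k > 0:
        threshold = np.partition(flat, k - 1)[k - 1]
        pruned[np.abs(pruned) <= threshold] = 0.0  # ties may prune slightly more
    return pruned

def prune_structured(weights, sparsity):
    """Zero out entire output filters with the smallest L1 norm
    (structured sparsity: whole rows of the weight tensor become zero)."""
    norms = np.abs(weights).reshape(weights.shape[0], -1).sum(axis=1)
    k = int(sparsity * weights.shape[0])
    pruned = weights.copy()
    if k > 0:
        pruned[np.argsort(norms)[:k]] = 0.0
    return pruned
```

Unstructured pruning reaches a given sparsity with less accuracy loss but leaves zeros at irregular positions, which is exactly why the accelerator must handle arbitrary sparse patterns; structured pruning keeps a regular shape that simpler hardware can exploit.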
These experiments demonstrated that the hardware accelerator can perform convolutional computations on different variations of sparse data. The convolution procedure using the developed accelerator is projected to reduce the computation across all layers of YoloV3-Tiny by 56%. For data with a sparsity of up to 30%, the convolution workload is reduced by 25% relative to the full dense convolution.
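The computation savings reported above come from skipping multiply-accumulate (MAC) operations whose weight operand is zero. The hardware details are not given in the abstract; as a rough software analogue under that assumption, the sketch below runs a naive 2-D valid convolution that iterates only over non-zero weights and counts the MACs actually performed.

```python
import numpy as np

def conv2d_skip_zeros(x, w):
    """Naive single-channel 2-D convolution (valid padding, no kernel flip)
    that iterates only over non-zero weights; returns (output, MAC count)."""
    kH, kW = w.shape
    oH, oW = x.shape[0] - kH + 1, x.shape[1] - kW + 1
    # Gather the non-zero weights once, mirroring a pruned "weight stream".
    nz = [(i, j, w[i, j]) for i in range(kH) for j in range(kW) if w[i, j] != 0]
    out = np.zeros((oH, oW))
    macs = 0
    for oi in range(oH):
        for oj in range(oW):
            acc = 0.0
            for i, j, wv in nz:
                acc += x[oi + i, oj + j] * wv
                macs += 1
            out[oi, oj] = acc
    return out, macs
```

With a 3x3 kernel containing only two non-zero weights, the MAC count drops to 2/9 of the dense count while the output stays identical to the dense result, which is the effect the paper's percentages quantify at layer scale.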