A Scalable RISC-V Vector Processor Enabling Efficient Multi-Precision DNN Inference
CoRR (2024)
Abstract
RISC-V processors encounter substantial challenges in deploying
multi-precision deep neural networks (DNNs) due to their restricted precision
support, constrained throughput, and suboptimal dataflow design. To tackle
these challenges, a scalable RISC-V vector (RVV) processor, namely SPEED, is
proposed to enable efficient multi-precision DNN inference through innovations
in customized instructions, hardware architecture, and dataflow mapping. Firstly,
dedicated customized RISC-V instructions are proposed based on RVV extensions,
providing SPEED with fine-grained control over processing precision ranging
from 4 to 16 bits. Secondly, a parameterized multi-precision systolic array
unit is incorporated within the scalable module to enhance parallel processing
capability and data reuse opportunities. Finally, a mixed multi-precision
dataflow strategy, compatible with different convolution kernels and data
precision, is proposed to effectively improve data utilization and
computational efficiency. SPEED is synthesized in TSMC 28 nm technology. The
experimental results demonstrate that SPEED achieves a peak throughput of
287.41 GOPS and an energy efficiency of 1335.79 GOPS/W at 4-bit precision.
Moreover, compared with the pioneering open-source vector processor Ara, SPEED
improves area efficiency by 2.04× and 1.63× under 16-bit and 8-bit precision,
respectively, demonstrating its significant potential for efficient
multi-precision DNN inference.
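The throughput gain at lower precision follows from packing more operand pairs into a fixed-width datapath lane. The sketch below is illustrative only and does not reflect SPEED's actual microarchitecture: it models a 16-bit-wide MAC lane that can hold one 16-bit, two 8-bit, or four 4-bit operand pairs per cycle, so the lane-cycle count for a dot product shrinks proportionally as precision drops.

```c
#include <stdint.h>

/* Illustrative model, not SPEED's real datapath: a 16-bit-wide lane
 * packs 16/bits operand pairs per cycle (1 at 16-bit, 2 at 8-bit,
 * 4 at 4-bit). */
static int elems_per_lane(int bits) { return 16 / bits; }

/* Dot product of n operand pairs at the given precision, also
 * reporting how many lane-cycles the fixed-width lane would need:
 * ceil(n / elems_per_lane). */
static int64_t dot(const int16_t *a, const int16_t *b, int n,
                   int bits, int *lane_cycles) {
    int64_t acc = 0;
    for (int i = 0; i < n; i++)
        acc += (int64_t)a[i] * b[i];          /* MAC accumulation */
    int per = elems_per_lane(bits);
    *lane_cycles = (n + per - 1) / per;       /* ceil division */
    return acc;
}
```

For an 8-element dot product, this model needs 8 lane-cycles at 16-bit but only 2 at 4-bit, mirroring the kind of precision-proportional throughput scaling the abstract reports.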