Quark: An Integer RISC-V Vector Processor for Sub-Byte Quantized DNN Inference

MohammadHossein AskariHemmat, Theo Dupuis, Yoan Fournier, Nizar El Zarif, Matheus Cavalcante, Matteo Perotti, Frank Gurkaynak, Luca Benini, Francois Leduc-Primeau, Yvon Savaria, Jean-Pierre David

arXiv (2023)

Abstract

In this paper, we present Quark, an integer RISC-V vector processor specifically tailored for sub-byte DNN inference. Quark is implemented in GlobalFoundries' 22FDX FD-SOI technology. It is designed on top of Ara, an open-source 64-bit RISC-V vector processor. To accommodate sub-byte DNN inference, Quark extends Ara by adding specialized vector instructions to perform sub-byte quantized operations. We also remove the floating-point unit from Quark's lanes and use the CVA6 RISC-V scalar core for the re-scaling operations that are required in quantized neural network inference. This makes each lane of Quark 2 times smaller and 1.9 times more power efficient than the lanes of Ara. We show that Quark can run quantized models at sub-byte precision. Notably, for 1-bit and 2-bit quantized models, Quark can accelerate the computation of Conv2d across a range of input and kernel sizes.
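
To make the idea of sub-byte quantized operations concrete, the following is a minimal Python sketch of one common way to compute a dot product on 1-bit or 2-bit operands: a bit-serial scheme built from AND, population count, and shift-accumulate. The abstract does not say which instructions Quark adds, so this scheme is an illustrative assumption and not a description of Quark's ISA. The function names (`pack_bitplanes`, `bitserial_dot`, `rescale`), the unsigned encoding, and the per-tensor scale factors are all hypothetical. The final floating-point step stands in for the re-scaling that the abstract assigns to the scalar core.

```python
def pack_bitplanes(values, bits):
    """Split unsigned `bits`-bit integers into bit planes.

    Plane b is a Python int whose i-th bit is bit b of values[i].
    """
    for v in values:
        if not 0 <= v < (1 << bits):
            raise ValueError(f"value {v} does not fit in {bits} bits")
    planes = []
    for b in range(bits):
        plane = 0
        for i, v in enumerate(values):
            plane |= ((v >> b) & 1) << i
        planes.append(plane)
    return planes


def bitserial_dot(weights, acts, w_bits, a_bits):
    """Integer dot product using only AND, popcount, and shift-accumulate."""
    if len(weights) != len(acts):
        raise ValueError("operands must have the same length")
    w_planes = pack_bitplanes(weights, w_bits)
    a_planes = pack_bitplanes(acts, a_bits)
    acc = 0
    for i, wp in enumerate(w_planes):
        for j, ap in enumerate(a_planes):
            # popcount(wp & ap) counts positions where both bits are 1;
            # the shift weights that count by 2**(i + j).
            acc += bin(wp & ap).count("1") << (i + j)
    return acc


def rescale(acc, scale_w, scale_a):
    """Map the integer accumulator back to a real value (assumed per-tensor scales)."""
    return acc * scale_w * scale_a
```

For example, `bitserial_dot([3, 1, 0, 2], [1, 2, 3, 0], 2, 2)` gives the same integer as the ordinary dot product of the two lists. A Conv2d output element is a dot product of this kind over one kernel window, so the same loop structure would apply per output pixel.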

Keywords

RISC-V, Vector ISA, Quantization, Machine Learning, Efficiency
