Acceleration of Neural Network Training on Hardware via HLS for an Edge-AI Device

2020 Innovations in Intelligent Systems and Applications Conference (ASYU)

Abstract
A neural network has two computational phases: inference and training. In recent years, high-performance inference has been accelerated by AI-specific hardware accelerators. However, the training phase, which accounts for most of a neural network's computational load, remains difficult to support on low-power edge devices. From a design-productivity perspective, designing an efficient hardware accelerator for the training phase at the RTL level is time-consuming because low-level hardware description languages are cumbersome. Therefore, we propose a high-level C description of the training algorithm (gradient descent) and use high-level synthesis (Vivado HLS) to generate a computationally efficient accelerator. Empirical results show that resource-constrained edge devices can perform training while also performing inference. In this study, we propose a system-on-chip solution to accelerate neural network training workloads for an edge-AI device.
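The abstract does not include the C kernel itself; the following is a minimal, hypothetical sketch of what a synthesizable gradient-descent step might look like in Vivado HLS C. The function name sgd_step, the layer width N, the learning rate LR, and the single-neuron squared-error model are illustrative assumptions, not the paper's implementation.

/*
 * Hypothetical sketch: one stochastic-gradient-descent step for a single
 * linear neuron, written in synthesizable C for Vivado HLS. N, LR, and
 * sgd_step are assumed names, not taken from the paper.
 */
#define N   64      /* assumed number of weights in the layer */
#define LR  0.01f   /* assumed learning rate                  */

/* Model: y = w.x + b, loss = (y - target)^2 / 2,
 * so dL/dw_i = (y - target) * x_i and dL/db = (y - target). */
void sgd_step(float w[N], float *b, const float x[N], float target)
{
    float y = *b;

dot:
    for (int i = 0; i < N; i++) {
#pragma HLS PIPELINE II=1
        y += w[i] * x[i];          /* forward pass: dot product */
    }

    float err = y - target;        /* dL/dy for squared-error loss */

update:
    for (int i = 0; i < N; i++) {
#pragma HLS PIPELINE II=1
        w[i] -= LR * err * x[i];   /* gradient-descent weight update */
    }
    *b -= LR * err;                /* bias update */
}

The PIPELINE II=1 pragma asks the HLS tool to initiate a new loop iteration every clock cycle; loop-level directives of this kind are what let a C description compete with hand-written RTL in throughput.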
Keywords
High-level synthesis,system-on-chip (SoC),FPGA,Edge-AI device