Agile Autotuning of a Transprecision Tensor Accelerator Overlay for TVM Compiler Stack

2020 30th International Conference on Field-Programmable Logic and Applications (FPL)

Citations: 6 | Views: 15
Abstract
Specialized accelerators for tensor operations, such as blocked matrix operations and multi-dimensional convolutions, have emerged as powerful architecture choices for high-performance deep-learning computing. The rapid evolution of frameworks, models, and precision options challenges the adaptability of such tensor accelerators, since adapting to new requirements incurs significant engineering costs. Programmable tensor accelerators offer a promising alternative by allowing reconfiguration of a virtual architecture overlaid on the physical FPGA configurable fabric. We propose an overlay (τ-VTA) and an optimization method guided by agile-inspired auto-tuning techniques. We achieve up to 2.5x higher performance and up to 8.1x faster convergence.
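Since τ-VTA targets the TVM compiler stack, the tuning flow the abstract refers to follows TVM's AutoTVM pattern: a templated schedule exposes tunable knobs, and a search strategy measures candidate configurations to converge on a good one. The sketch below shows that generic AutoTVM loop for a small matrix multiply on a CPU target; the template, knobs, trial count, and tuner choice are illustrative assumptions, not the authors' actual τ-VTA configuration.

```python
# Minimal AutoTVM-style tuning loop (illustrative; not the paper's τ-VTA setup).
import tvm
from tvm import autotvm, te

@autotvm.template("example/matmul")
def matmul(N, L, M, dtype):
    # Declare the computation: C = A @ B.
    A = te.placeholder((N, L), name="A", dtype=dtype)
    B = te.placeholder((L, M), name="B", dtype=dtype)
    k = te.reduce_axis((0, L), name="k")
    C = te.compute((N, M), lambda i, j: te.sum(A[i, k] * B[k, j], axis=k), name="C")

    # Expose tiling factors as tunable knobs in the schedule.
    s = te.create_schedule(C.op)
    y, x = s[C].op.axis
    cfg = autotvm.get_config()
    cfg.define_split("tile_y", y, num_outputs=2)
    cfg.define_split("tile_x", x, num_outputs=2)
    yo, yi = cfg["tile_y"].apply(s, C, y)
    xo, xi = cfg["tile_x"].apply(s, C, x)
    s[C].reorder(yo, xo, yi, xi)
    return s, [A, B, C]

# Create the tuning task and search the knob space, logging measured results.
task = autotvm.task.create("example/matmul", args=(512, 512, 512, "float32"), target="llvm")
measure_option = autotvm.measure_option(
    builder=autotvm.LocalBuilder(),
    runner=autotvm.LocalRunner(number=5))
tuner = autotvm.tuner.XGBTuner(task)
tuner.tune(n_trial=20,
           measure_option=measure_option,
           callbacks=[autotvm.callback.log_to_file("matmul.log")])
```

The convergence-speed claim in the abstract concerns this search loop: an agile-inspired strategy reaches a good configuration in fewer measured trials than exhaustive or naive search.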
Keywords
Neural Networks, Machine Learning, Autotuning, FPGA, Transprecision Computing, Tensor Accelerator