TRAC: Compilation-Based Design of Transformer Accelerators for FPGAs

2022 32nd International Conference on Field-Programmable Logic and Applications (FPL)

Abstract
Transformer-type Neural Networks (NNs) have achieved impressive accuracy in Natural Language Processing (NLP) applications, surpassing the Recurrent Neural Networks (RNNs) previously used there. However, because transformers differ considerably from common NN types, existing accelerator designs, particularly for Field-Programmable Gate Arrays (FPGAs), cannot be used to implement them. Previous research has shown FPGAs to be superior to CPUs and even GPUs in energy efficiency when accelerating NNs. Yet while automated compiler-based design flows exist for other NN types, such an approach is still lacking for transformers on FPGA targets. In this realm, this paper presents a novel compiler called TRAC together with a library of operators and modules for implementing transformer accelerators on FPGAs. The compiler's optimization and code-generation settings, which combine weight-compression techniques with matching adaptations of the accelerator modules in an integrated approach, define a design space of accelerators that is then explored. For each design, a system-level data path and control unit architecture is generated that integrates the module-level designs using hierarchical High-Level Synthesis (HLS). We evaluate our implementation on the BERT network and report results on the trade-off between execution time, accuracy, and FPGA resource usage.
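The design-space exploration the abstract describes — pairing each weight-compression setting with matching accelerator-module parameters and evaluating the resulting trade-offs — can be illustrated with a toy sketch. The bit widths, parallelism factors, and cost model below are purely illustrative assumptions, not values from the paper:

```python
# Hypothetical sketch of compiler-driven design-space exploration:
# each design point pairs a weight-compression setting (bit width)
# with a matching accelerator-module configuration (parallelism),
# and the Pareto frontier over (latency, accuracy, resources) is kept.
# All names and cost numbers are illustrative, not from TRAC itself.
from itertools import product

BIT_WIDTHS = [16, 8, 4]    # assumed compression settings
PARALLELISM = [1, 2, 4]    # assumed processing-element counts

def evaluate(bits, pe):
    """Toy cost model: fewer bits shrink resources but cost accuracy;
    more processing elements cut latency but consume more resources."""
    latency = 100.0 / (pe * (16 / bits))   # relative cycles
    accuracy = 1.0 - 0.02 * (16 - bits)    # relative score
    resources = pe * bits * 10             # relative LUT count
    return latency, accuracy, resources

def pareto(points):
    """Keep designs not dominated in all three objectives
    (minimize latency and resources, maximize accuracy)."""
    front = []
    for cfg, (l, a, r) in points:
        dominated = any(
            l2 <= l and a2 >= a and r2 <= r and (l2, a2, r2) != (l, a, r)
            for _, (l2, a2, r2) in points
        )
        if not dominated:
            front.append((cfg, (l, a, r)))
    return front

designs = [((b, p), evaluate(b, p)) for b, p in product(BIT_WIDTHS, PARALLELISM)]
frontier = pareto(designs)
```

In an actual flow, `evaluate` would be replaced by HLS synthesis estimates and measured model accuracy; the enumeration structure stays the same.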
Keywords
n/a