Design Space Description Language for Automated and Comprehensive Exploration of Next-Gen Hardware Accelerators

semanticscholar(2022)

Abstract
Exploration of accelerators typically involves an architectural template specified in an architecture description language (ADL). This can limit the design space that can be explored, the reusability and automation of the system stack, explainability, and exploration efficiency. We envision a Design Space Description Language (DSDL) for comprehensive, reusable, explainable, and agile DSE. We describe how its flow graph abstraction enables comprehensive DSE of modular designs, with architectural components organized in various hierarchies and groups. We discuss automation of characterizing, simulating, and programming new architectures. Lastly, we describe how DSDL flow graphs facilitate bottleneck analysis, yielding explainability of costs and selected designs and super-fast exploration.

1 NEED FOR DESIGN SPACE DESCRIPTION

Design space exploration (DSE) of accelerators, especially for machine learning [5, 10], requires efficient HW/SW codesigns that meet strict execution constraints [20, 21, 23]. The need for a single accelerator for multiple workloads necessitates bottom-up exploration.

ADL-based design approach: Recent frameworks explore designs of a certain architecture (e.g., systolic arrays, PEs sharing a unified buffer that is filled by DMAs) [12, 19, 33, 35]. They describe the architectural template in an ADL [4, 15, 17]. The design process therefore focuses on a specific architectural organization (i.e., specific types of computational and memory units interconnected in certain ways and in a certain hierarchy), and the hardware design space is limited to the values of the architecture's hyperparameters [16, 37]. Execution costs are provided either by expert-crafted analytical models for the architecture [7, 34] or by synthesizing each design (which is time-consuming). The space of algorithm-to-accelerator mappings is also formulated based on the template [7, 19].
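To make the template-bound design space concrete, here is a minimal sketch of how an ADL-style flow ends up enumerating only hyperparameter values of one fixed organization. The hyperparameter names and values (`num_pes`, `unified_buffer_kb`, `dma_bandwidth_gbps`) and the helper `enumerate_template_designs` are hypothetical illustrations, not taken from the paper or from any specific framework.

```python
from itertools import product

# Hypothetical hyperparameters of one fixed architectural template
# (a PE array sharing a single unified buffer filled by DMAs).
# Names and values are illustrative assumptions, not from the paper.
TEMPLATE_HYPERPARAMETERS = {
    "num_pes": [64, 256, 1024],
    "unified_buffer_kb": [128, 512],
    "dma_bandwidth_gbps": [8, 16],
}


def enumerate_template_designs(hyperparameters):
    """Return every design point of the template as a dict.

    The design space is the Cartesian product of the hyperparameter
    values; the architectural organization itself never changes.
    """
    names = list(hyperparameters)
    return [
        dict(zip(names, values))
        for values in product(*(hyperparameters[name] for name in names))
    ]


designs = enumerate_template_designs(TEMPLATE_HYPERPARAMETERS)
print(len(designs))  # 3 * 2 * 2 = 12 design points
```

Every one of these design points shares the same organization. A variant with, say, multi-level buffers cannot be reached by changing a value in this dictionary; it would require a different template and, typically, different cost models and mappers.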
Thus, DSE frameworks lack the following capabilities:

• Exploring efficient solutions from a broad design space: Since the design space is restricted to the template architecture (e.g., a one-level shared buffer as the memory), a vast space of architectures is left unexplored (multi-level buffers, unified buffers, DMA ports instead of buffers), even if some can be more effective.
• Reusability of the design flow for a wide range of novel architectures: Since design tools are developed for a single template, they can be incompatible with architectures from a broad space, which impacts their reusability. Because, when the design space is broadened, such as by integrating new functionality or a novel implementation of

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s). LATTE '22, March 1, 2022, Virtual, Earth. © 2022 Copyright held by the owner/author(s).

[Figure: accelerator block diagram. Labels: off-chip memory controller, input neuron buffer, output neuron buffer, PE1, PE2, PE3, PE4, PE.FU.Adder_tree, PE.FU.Multiplier_array, PE.synapse_buffer]