
Lightening-Transformer: A Dynamically-operated Optically-interconnected Photonic Transformer Accelerator

2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA 2024)

Abstract
The wide adoption and significant computing resource cost of attention-based transformers, e.g., Vision Transformers and large language models, have driven the demand for efficient hardware accelerators. While electronic accelerators have been commonly used, there is growing interest in exploring photonics as an alternative technology due to its high energy efficiency and ultra-fast processing speed. Photonic accelerators have demonstrated promising results for convolutional neural network (CNN) workloads, which predominantly rely on weight-static linear operations. However, they encounter challenges in efficiently supporting attention-based Transformer architectures, raising questions about the applicability of photonics to advanced machine-learning tasks. The primary hurdle lies in their inefficiency at handling the workloads unique to Transformers, i.e., dynamic and full-range tensor multiplication. In this work, we propose Lightening-Transformer, the first light-empowered, high-performance, and energy-efficient photonic Transformer accelerator. To overcome the fundamental limitation of existing photonic tensor core designs, we introduce a novel dynamically-operated photonic tensor core, DPTC, consisting of a crossbar array of interference-based optical vector dot-product engines that supports highly parallel, dynamic, and full-range matrix multiplication. Furthermore, we design a dedicated accelerator that integrates our photonic computing cores with photonic interconnects for inter-core data broadcast, fully unleashing the power of optics. A comprehensive evaluation demonstrates that Lightening-Transformer achieves >2.6x energy and >12x latency reductions over prior photonic accelerators and delivers the lowest energy cost and a 2 to 3 orders-of-magnitude lower energy-delay product than electronic Transformer accelerators, all while maintaining digital-comparable accuracy. Our work highlights the immense potential of photonics for efficient hardware acceleration, particularly for advanced machine-learning workloads such as Transformer-backboned large language models (LLMs). Our implementation is available at https://github.com/zhuhanqing/Lightening-Transformer.
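
For intuition, below is a minimal behavioral sketch of the computing pattern the abstract describes: a crossbar of interference-based dot-product engines through which both operands are streamed at run time, enabling dynamic, full-range matrix multiplication such as the attention product Q x K^T. The engine counts, tiling scheme, and NumPy modeling are illustrative assumptions, not the paper's design parameters.

import numpy as np

# Behavioral sketch of a dynamically-operated photonic tensor core (DPTC):
# an R x C crossbar of dot-product engines into which BOTH operands are
# streamed at run time (no weight-static mapping), as the attention
# products Q @ K^T and A @ V require. R, C, and all names here are
# illustrative assumptions, not the paper's parameters.
R, C = 8, 8  # engines per core: R rows x C columns

def dot_engine(x, y):
    # One interference-based engine: coherently interfering real-valued
    # amplitudes a and b yields photocurrents |a+b|^2 and |a-b|^2, whose
    # difference is 4ab, so signed (full-range) products fall out of
    # balanced photodetection; summing per-element currents gives x . y.
    return float(np.sum((x + y) ** 2 - (x - y) ** 2) / 4.0)

def dptc_matmul(A, B):
    # Tile A (M x K) @ B (K x N) across the engine array. Each step,
    # R rows of A and C columns of B are broadcast to the crossbar and
    # all R * C engines compute their dot products in parallel.
    M, K = A.shape
    K2, N = B.shape
    assert K == K2, "inner dimensions must match"
    out = np.zeros((M, N))
    for i in range(0, M, R):        # stream row tiles of A
        for j in range(0, N, C):    # stream column tiles of B
            for r in range(min(R, M - i)):
                for c in range(min(C, N - j)):
                    out[i + r, j + c] = dot_engine(A[i + r, :], B[:, j + c])
    return out

# Dynamic operands, e.g., attention scores Q @ K^T:
rng = np.random.default_rng(0)
Q = rng.standard_normal((16, 64))
K = rng.standard_normal((16, 64))
assert np.allclose(dptc_matmul(Q, K.T), Q @ K.T)

In the actual hardware the multiply-accumulate happens in the analog optical domain; the sketch only captures the dataflow that distinguishes this design, namely that both matrices change every step, unlike weight-static CNN mappings.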
Key words
photonic tensor core, transformer accelerator, dynamically-operated, energy-efficient