
Most Significant One-Driven Shifting Dynamic Efficient Multipliers for Large Language Models.

Bin Gong, Ke Chen, Pengfei Huang, Bi Wu, Weiqiang Liu

IEEE International Symposium on Circuits and Systems (2024)

Abstract
Large Language Models (LLMs) have demonstrated exceptional performance but demand significantly more computational power and memory than Deep Neural Networks (DNNs), which necessitates more energy-efficient hardware designs. This paper introduces a novel weight approximation strategy for quantized LLMs, yielding a highly efficient approximate multiplier based on Most Significant One (MSO) shifting. Compared with energy-efficient approximate logarithmic multipliers and precision-demanding approximate non-logarithmic multipliers, the proposed design strikes an optimal balance between accuracy and hardware cost: it maintains superior accuracy while incurring hardware costs comparable to, and in some cases lower than, those of logarithmic multipliers. In particular, compared to an exact multiplier, the proposed design achieves reductions of up to 28.31% in area, 57.84% in power consumption, and 11.86% in delay. Experiments demonstrate that the proposed multiplier can save approximately 60% of energy in DNNs without compromising task accuracy, and experiments on a Transformer accelerator indicate similarly substantial energy savings for LLMs.
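The core idea described in the abstract, approximating a quantized weight by the position of its most significant one so that multiplication collapses into a shift, can be illustrated with a short sketch. The function below (`mso_shift_multiply`, a hypothetical name) is not the authors' actual circuit; it is only a minimal Python illustration of MSO-based shifting, with an assumed round-to-the-nearer-power-of-two step.

```python
def mso_shift_multiply(x: int, w: int) -> int:
    """Hypothetical sketch of an MSO-shifting approximate multiply.

    The weight w is replaced by the power of two selected by its most
    significant one (MSO), with a simple round-up when the next lower
    bit is set, so the product reduces to a single shift of x.
    """
    if x == 0 or w == 0:
        return 0
    sign = -1 if (x < 0) != (w < 0) else 1
    x, w = abs(x), abs(w)
    mso = w.bit_length() - 1              # index of the most significant one
    if mso > 0 and (w >> (mso - 1)) & 1:  # round toward the nearer power of two
        mso += 1
    return sign * (x << mso)


# Example: 100 * 25 = 2500 exactly; this sketch gives 100 << 5 = 3200,
# since 25 (binary 11001) rounds up to 32 under the rule above.
print(mso_shift_multiply(100, 25))
```

In hardware, such a scheme replaces a full multiplier array with a leading-one detector and a barrel shifter, which is the source of the area and power savings the abstract reports; the paper's actual design refines this basic approximation to limit the accuracy loss.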
Key words
approximate computing, large language model, constant coefficient multiplier, logarithmic multiplier