34.1 A 28nm 83.23TFLOPS/W POSIT-Based Compute-in-Memory Macro for High-Accuracy AI Applications

2024 IEEE International Solid-State Circuits Conference (ISSCC), 2024

Abstract
Rapidly expanding artificial intelligence (AI) models for complex AI tasks drive requirements for high energy efficiency and high precision in AI processors [1–6]. Floating-point CIM (FP-CIM) is a promising technique to improve energy efficiency while maintaining accuracy. However, FP-CIM with FP32/FP16/BF16 suffers from a performance bottleneck due to its large storage requirements and considerable MAC power. The emerging POSIT data format, which exploits a dynamic bit width that adapts to varied data distributions, can use a low bit width to achieve nearly the same training and inference accuracy as high-bit-width FP (POSIT8 $\approx$ FP16 and POSIT16 $\approx$ FP32) [6]. POSIT data consists of 4 parts: sign (S), regime (R), exponent (E), and mantissa (M). The format is defined as POSIT($n, es$), where $n$ is the total bit width and $es$ is the bit width of E. The bit widths of R and M vary dynamically, allowing a trade-off between data range and precision at runtime. If R is $r$ bits, then M is $m$ bits, where $m = n - r - es - 1$. R is a unary (thermometer) code made of a run of successive 1s or 0s closed by the opposite bit, e.g., 11110 and 00001. For a run of 1s, the decimal value of R is the number of successive 1s minus 1; for a run of 0s, it is the negated number of successive 0s (R = 3 for "11110", R = −4 for "00001"). POSIT computing requires decoding R to obtain the POSIT value, which is $(-1)^{\mathrm{S}} \times \left(2^{\mathrm{es}}\right)^{\mathrm{R}} \times 2^{\mathrm{E}} \times 1.\mathrm{M}$. Because its dynamic data representation adapts to the data distribution, POSIT8 can achieve accuracy comparable to BF16 for image classification on ViT-B with a $7.12\times$ energy reduction.
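The decoding rule described in the abstract can be illustrated with a short software sketch. This is not the macro's circuit, only a minimal Python model of the S | R | E | M field layout under stated assumptions: the function names `decode_regime` and `decode_posit` and the `standard_useed` flag are hypothetical, the sign is treated as sign-magnitude exactly as the formula $(-1)^{\mathrm{S}}$ suggests, and an all-zero body is mapped to zero (or NaR) following the usual posit convention, which the abstract does not state. By default the regime base is $2^{es}$ as written in the abstract; the commonly used posit definition takes the base as $2^{2^{es}}$, which the flag selects.

```python
def decode_regime(body: str) -> tuple[int, int]:
    """Return (R, bits_consumed) for the regime run at the start of `body`."""
    first = body[0]
    run = len(body) - len(body.lstrip(first))    # length of the leading run
    r_value = run - 1 if first == "1" else -run  # "11110" -> 3, "00001" -> -4
    consumed = min(run + 1, len(body))           # run plus its terminating bit, if present
    return r_value, consumed


def decode_posit(bits: str, es: int, standard_useed: bool = False) -> float:
    """Decode a POSIT(n, es) bit string laid out as S | R | E | M (illustrative only)."""
    sign, body = int(bits[0]), bits[1:]
    if "1" not in body:
        # Assumption (posit convention, not from the abstract): 0...0 is zero, 10...0 is NaR.
        return 0.0 if sign == 0 else float("nan")
    r_value, consumed = decode_regime(body)
    rest = body[consumed:]                       # what remains holds E, then M
    e_bits = rest[:es].ljust(es, "0")            # truncated exponent bits read as 0
    e_value = int(e_bits, 2) if es > 0 else 0
    m_bits = rest[es:]                           # m = n - r - es - 1 bits
    fraction = int(m_bits, 2) / (1 << len(m_bits)) if m_bits else 0.0
    # Regime base: 2^es as written in the abstract, or 2^(2^es) if standard_useed is set.
    base = 2 ** (2 ** es) if standard_useed else 2 ** es
    return (-1) ** sign * float(base) ** r_value * 2 ** e_value * (1 + fraction)
```

For example, with POSIT(8, 1) the word `01111010` splits into S = 0, R = `11110` (value 3), E = `1`, and M = `0`, so $m = 8 - 5 - 1 - 1 = 1$ and the abstract's formula gives $2^3 \times 2^1 \times 1.0 = 16$.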
Keywords
Decoding,Decimal,Different Signs,Energy Reduction,Inference Accuracy,Artificial Intelligence Models,Working Mode,Least Significant Bit,Most Significant Bit,Bit-width,Bitwise Operations