TransLib: A Library to Explore Transprecision Floating-Point Arithmetic on Multi-Core IoT End-Nodes

2023 Design, Automation & Test in Europe Conference & Exhibition (DATE 2023)

Abstract
Reduced-precision floating-point (FP) arithmetic is being widely adopted to reduce memory footprint and execution time on battery-powered Internet of Things (IoT) end-nodes. However, reduced-precision computations must meet end-to-end precision constraints to be acceptable at the application level. This work introduces TransLib, an open-source kernel library based on transprecision computing principles, which provides knobs to exploit different FP data types (i.e., float, float16, and bfloat16), also considering the trade-off between homogeneous and mixed-precision solutions. We demonstrate the capabilities of the proposed library on PULP, a 32-bit microcontroller (MCU) coupled with a parallel, programmable accelerator. On average, TransLib kernels achieve an IPC of 0.94 and a speed-up of 1.64x using 16-bit vectorization. The parallel variants achieve speed-ups of 1.97x, 3.91x, and 7.59x on 2, 4, and 8 cores, respectively. The memory footprint reduction is between 25% and 50%. Finally, we show that the mixed-precision variants increase accuracy by 30x at the cost of 2.09x execution time and 1.35x memory footprint compared to the vectorized float16 variants.
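To illustrate the homogeneous vs. mixed-precision trade-off the abstract refers to, the sketch below contrasts a float16 dot product with a mixed-precision variant that stores operands in 16 bits but accumulates in 32-bit float. This is a minimal, hypothetical example: the functions dot_fp16 and dot_mixed and the use of the _Float16 type (available in recent GCC/Clang) are assumptions for illustration, not TransLib's actual API, and it omits the SIMD vectorization and multi-core parallelization knobs measured in the paper.

```c
/* Illustrative sketch (not the TransLib API): two precision variants of a
 * dot product, assuming a compiler that provides the _Float16 type. */
#include <stdio.h>

/* Homogeneous float16: 16-bit storage and 16-bit accumulation. */
static _Float16 dot_fp16(const _Float16 *a, const _Float16 *b, int n) {
    _Float16 acc = 0;
    for (int i = 0; i < n; i++)
        acc += a[i] * b[i];
    return acc;
}

/* Mixed precision: 16-bit storage, 32-bit float accumulation. */
static float dot_mixed(const _Float16 *a, const _Float16 *b, int n) {
    float acc = 0.0f;
    for (int i = 0; i < n; i++)
        acc += (float)a[i] * (float)b[i];
    return acc;
}

int main(void) {
    enum { N = 1024 };
    _Float16 a[N], b[N];
    for (int i = 0; i < N; i++) {
        a[i] = (_Float16)0.01f;
        b[i] = (_Float16)1.0f;
    }
    /* With a 16-bit accumulator, the ulp grows with the running sum, so
     * rounding error accumulates; the mixed variant stays closer to the
     * exact result (about 10.24). */
    printf("float16 accumulation: %f\n", (double)dot_fp16(a, b, N));
    printf("mixed   accumulation: %f\n", (double)dot_mixed(a, b, N));
    return 0;
}
```

The same pattern generalizes to the library's theme: keeping data in 16-bit formats preserves the memory-footprint and vectorization benefits, while selectively widening intermediate computations recovers accuracy at some cost in execution time.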
Keywords
transprecision computing,IoT end-nodes,parallel programming,SIMD vectorization