ARM/NEON Co-design of Multiplication/Squaring.

WISA (2017)

Abstract
Many modern mobile processors support SIMD extensions (e.g., the NEON engine), and applications originally written for SISD execution (e.g., image processing, cryptography) are accelerated by rewriting them with SIMD instruction sets. In particular, integer multiplication and squaring are the most expensive operations in Public Key Cryptography (PKC), and many works have reduced their execution time using the NEON instruction set. However, ARM–NEON processors also support the powerful ARM instruction set. By using ARM instructions together with the NEON engine, further performance improvements can be achieved. Based on this observation, we introduce a new parallel approach for integer multiplication and squaring on ARM–NEON processors. Unlike previous implementations, we mix ARM and NEON instructions to hide the latency of the ARM computations behind the NEON computations. Since the ARM and NEON modules are separate units, their instructions are issued independently. The integer multiplication and squaring are finely divided into several sub-tasks, which are assigned to ARM and NEON so as to balance the workloads. Finally, the proposed implementations outperform the best-known results on identical ARM–NEON processors by 22.4% and 18.3% for 2048-bit integer multiplication and squaring, respectively.
Keywords
Parallel implementation, ARM, NEON, Co-design, Multiplication, Squaring, Public key cryptography
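The workload split described in the abstract can be illustrated with a small sketch. It is not the paper's implementation: it only shows, in plain Python, how a multi-precision product can be cut into two independent partial products that separate units could compute without waiting on each other. The function names, the 32-bit limb size, and the `neon_share` ratio are all assumptions made for illustration; the abstract does not state how the sub-tasks are actually divided.

```python
def split_limbs(x, limb_bits, n_limbs):
    """Split a non-negative integer into n_limbs little-endian limbs."""
    mask = (1 << limb_bits) - 1
    return [(x >> (limb_bits * i)) & mask for i in range(n_limbs)]


def partial_product(a_limbs, b_limbs, b_offset, limb_bits):
    """Schoolbook product of all limbs of a with a slice of b's limbs.

    b_offset is the position of the first limb of the slice inside b,
    so each limb product lands at the correct bit position.
    """
    acc = 0
    for i, ai in enumerate(a_limbs):
        for j, bj in enumerate(b_limbs):
            acc += (ai * bj) << (limb_bits * (i + j + b_offset))
    return acc


def codesign_multiply(a, b, bits=2048, limb_bits=32, neon_share=0.75):
    """Multiply a and b as two independent sub-tasks.

    neon_share is a hypothetical tuning knob: the fraction of b's limbs
    handed to the "NEON" sub-task, the rest going to the "ARM" sub-task.
    """
    n = bits // limb_bits
    a_limbs = split_limbs(a, limb_bits, n)
    b_limbs = split_limbs(b, limb_bits, n)
    cut = int(n * neon_share)

    # The two partial products share no intermediate state, which is the
    # property that lets separate execution units work on them in parallel.
    neon_part = partial_product(a_limbs, b_limbs[:cut], 0, limb_bits)
    arm_part = partial_product(a_limbs, b_limbs[cut:], cut, limb_bits)

    # Final accumulation of the two sub-results.
    return neon_part + arm_part


def codesign_square(a, bits=2048, limb_bits=32, neon_share=0.75):
    """Squaring expressed through the same split (no squaring-specific tricks)."""
    return codesign_multiply(a, a, bits, limb_bits, neon_share)
```

On real hardware the balance point would depend on the relative throughput of the two units, so a ratio like `neon_share` would have to be tuned by measurement rather than fixed in advance.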