Parallel SHA-256 on SW26010 many-core processor for hashing of multiple messages

JOURNAL OF SUPERCOMPUTING (2022)

Abstract
To explore whether new parallelism techniques can further improve the performance of cryptographic hash functions, we study the SW26010, a special-architecture processor used in Sunway TaihuLight, one of the world’s fastest supercomputers. Secure Hash Algorithms (SHAs) are important for secure transmission, and SHA-256 remains a secure and highly efficient SHA design. We propose SW-SHA-256, a parallel SHA-256 implementation for hashing multiple messages on the SW26010. Our work explores parallel schemes at the instruction and thread levels. At the instruction level, we use vector registers to load multiple messages and hash them simultaneously. Assembly-level optimizations such as dual issue are applied, targeting a pipeline that differs from that of a general-purpose processor. At the thread level, an optimized DMA transfer strategy and a double-buffering technique reduce the cost of moving data from memory to cache. As a result, we obtain 5.87 cycles per byte on a single core, an 8.18x speedup over the C code in OpenSSL v3.0.0. Moreover, our implementation achieves a throughput of 60.21 GB/s on one SW26010 processor and is highly scalable.
Keywords
SW26010, SHA-256, Multiple messages, Instruction level
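
The instruction-level scheme in the abstract (several messages loaded into vector registers and hashed in lockstep) can be illustrated with a minimal Python sketch. This is not the authors' SW-SHA-256 code: Python lists stand in for vector registers, the helper names (`vmap`, `compress_lanes`, `sha256_multi`) are hypothetical, and the sketch assumes all messages have the same length so that every lane processes the same number of blocks. The lane count is arbitrary here; the abstract does not state the vector width used on the SW26010.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF

def _primes(n):
    ps, c = [], 2
    while len(ps) < n:
        if all(c % p for p in ps):
            ps.append(c)
        c += 1
    return ps

def _iroot(n, k):
    # Exact integer k-th root by binary search (avoids float rounding).
    lo, hi = 0, 1 << 40
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** k <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

# FIPS 180-4 constants: first 32 fractional bits of the square roots
# (initial hash) and cube roots (round constants) of the first primes.
_P = _primes(64)
H0 = [_iroot(p << 64, 2) & MASK for p in _P[:8]]
K = [_iroot(p << 96, 3) & MASK for p in _P]

def _rotr(x, n): return ((x >> n) | (x << (32 - n))) & MASK
def _s0(x): return _rotr(x, 7) ^ _rotr(x, 18) ^ (x >> 3)
def _s1(x): return _rotr(x, 17) ^ _rotr(x, 19) ^ (x >> 10)
def _S0(x): return _rotr(x, 2) ^ _rotr(x, 13) ^ _rotr(x, 22)
def _S1(x): return _rotr(x, 6) ^ _rotr(x, 11) ^ _rotr(x, 25)
def _ch(e, f, g): return (e & f) ^ (~e & g)
def _maj(a, b, c): return (a & b) ^ (a & c) ^ (b & c)

def vmap(f, *vs):
    # Apply f lane by lane; stands in for one SIMD operation on all lanes.
    return [f(*xs) & MASK for xs in zip(*vs)]

def _add(x, y): return x + y

def compress_lanes(state, blocks):
    # state: 8 "vector registers", each holding one 32-bit word per lane.
    # blocks: one 64-byte block per lane (i.e. per message).
    # Transpose so that w[t][lane] is word t of that lane's block.
    w = [list(col) for col in zip(*(struct.unpack(">16I", b) for b in blocks))]
    for t in range(16, 64):
        w.append(vmap(lambda p, q, r, s: _s1(p) + q + _s0(r) + s,
                      w[t - 2], w[t - 7], w[t - 15], w[t - 16]))
    a, b, c, d, e, f, g, h = state
    for t in range(64):
        kt = K[t]
        t1 = vmap(lambda hh, ee, ff, gg, ww:
                  hh + _S1(ee) + _ch(ee, ff, gg) + kt + ww, h, e, f, g, w[t])
        t2 = vmap(lambda aa, bb, cc: _S0(aa) + _maj(aa, bb, cc), a, b, c)
        h, g, f = g, f, e
        e = vmap(_add, d, t1)
        d, c, b = c, b, a
        a = vmap(_add, t1, t2)
    return [vmap(_add, s, v) for s, v in zip(state, (a, b, c, d, e, f, g, h))]

def pad(msg):
    # Standard SHA-256 padding: 0x80, zeros, then the 64-bit bit length.
    bitlen = 8 * len(msg)
    msg += b"\x80"
    msg += b"\x00" * (-(len(msg) + 8) % 64)
    return msg + struct.pack(">Q", bitlen)

def sha256_multi(messages):
    # Hash equal-length messages in lockstep, one message per lane.
    if len({len(m) for m in messages}) != 1:
        raise ValueError("this sketch requires equal-length messages")
    padded = [pad(m) for m in messages]
    state = [[h] * len(messages) for h in H0]
    for off in range(0, len(padded[0]), 64):
        state = compress_lanes(state, [p[off:off + 64] for p in padded])
    # Transpose back: collect each lane's eight words into one digest.
    return [struct.pack(">8I", *words) for words in zip(*state)]

if __name__ == "__main__":
    msgs = [bytes([i]) * 100 for i in range(4)]
    assert sha256_multi(msgs) == [hashlib.sha256(m).digest() for m in msgs]
```

The point of the layout is that the 64 rounds of one message form a serial dependency chain, while the same round of different messages is independent, so each `vmap` call corresponds to work that a vector unit could do in one instruction. The dual-issue scheduling, DMA strategy, and double buffering mentioned in the abstract are not modeled here.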