Efficient GPU Implementations of Post-Quantum Signature XMSS

IEEE Transactions on Parallel and Distributed Systems (2023)

Abstract
The National Institute of Standards and Technology (NIST) approved XMSS as part of the post-quantum cryptography (PQC) development effort in 2018. XMSS is currently one of only two standardized PQC algorithms, but its performance limits its use. For example, with some standardized parameters, even the fastest reported implementation takes more than a minute to generate a keypair. In this article, we present the first GPU implementation of XMSS and its variant XMSS$^{\mathsf{MT}}$. The high parallelism of GPUs is especially effective for reducing latency in key generation and improving throughput for signing and verifying. To cover various application scenarios, we provide three parallel XMSS schemes: algorithmic parallelism, multi-keypair data parallelism, and single-keypair data parallelism. For these schemes, we design custom parallel strategies that use more than 10,000 cores for all parameters provided by NIST. In addition, we analyze the applicability of most previous serial optimizations and explore numerous techniques to fully exploit GPU performance. Our evaluations use the XMSSMT-SHA2_20/2_256 parameter set on a GeForce RTX 3090. The results show a key generation latency of 3.20 ms, a speedup of 21,899× over the GPU ported version and 54× over the fastest prior work (174 ms). When 16,384 tasks are executed, the signing/verifying throughput (tasks/s) is 311,424/415,100 in the single-key case and 145,100/419,887 in the multi-key case. Compared with the signing/verifying throughput of the fastest prior work (1,695/4,000), this is a speedup of 184×/104× in the single-key case and 86×/105× in the multi-key case.
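As a quick sanity check on the figures quoted in the abstract, the reported speedups can be recomputed from the raw latency and throughput numbers. The sketch below is only arithmetic on values stated in the abstract; the variable and function names are illustrative labels, not identifiers from the paper or its code.

```python
# Figures quoted in the abstract (XMSSMT-SHA2_20/2_256 on a GeForce RTX 3090).
# All names below are illustrative labels, not taken from the paper's code.
KEYGEN_MS = 3.20               # reported key generation latency
PRIOR_KEYGEN_MS = 174.0        # fastest prior work, key generation
PORTED_SPEEDUP = 21_899        # reported speedup over the GPU ported version
PRIOR_SIGN, PRIOR_VERIFY = 1_695, 4_000   # fastest prior work, tasks/s

# (signing, verifying) throughput in tasks/s with 16,384 tasks.
THROUGHPUT = {
    "single-key": (311_424, 415_100),
    "multi-key": (145_100, 419_887),
}


def speedups():
    """Recompute the rounded speedups over the fastest prior work."""
    result = {"keygen": round(PRIOR_KEYGEN_MS / KEYGEN_MS)}
    for case, (sign, verify) in THROUGHPUT.items():
        result[case] = (round(sign / PRIOR_SIGN), round(verify / PRIOR_VERIFY))
    return result


def implied_baseline_seconds():
    """Key generation latency of the ported baseline implied by the speedup."""
    return KEYGEN_MS * PORTED_SPEEDUP / 1000.0


if __name__ == "__main__":
    print(speedups())
    print(f"implied baseline keygen: {implied_baseline_seconds():.1f} s")
```

The recomputed ratios match the 54×, 184×/104×, and 86×/105× figures in the abstract, and the implied baseline of roughly 70 s is consistent with the statement that key generation can take more than a minute.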
Keywords
Post-quantum cryptography, stateful hash-based signatures, XMSS, XMSS$^{\mathsf{MT}}$, parallel computing, GPU