Fast Hardware Implementation for Extended GCD of Large Numbers in Redundant Representation

IEEE Transactions on Circuits and Systems II: Express Briefs (2023)

Abstract
The extended greatest common divisor (XGCD) of large numbers is a commonly used and computationally intensive operation in cryptography. The growing demand for high-speed cryptographic applications calls for fast XGCD implementations. However, designing fast architectures for XGCD is challenging because of its complex operations and tight data dependencies. In this brief, we propose a low-latency, high-efficiency architecture for XGCD that uses a modified extended k-ary algorithm in a redundant data representation. We modify the original k-ary algorithm by adopting a parameter d to reduce computing delay and by applying a hardware-friendly method to the Bézout coefficients. The redundant signed digit (RSD) representation is selected to avoid carry propagation and achieve a high clock frequency. We devise highly parallel, low-latency architectures for the proposed algorithms with k = 2, 4, and 8. The designs are coded in SystemVerilog and synthesized using TSMC 28-nm CMOS technology. The implementation results show that, for a data bit width of 1024, the design with k = 8 achieves the shortest latency among the values of k considered: it needs only 472 ns, which is about 1.5x faster than the prior art.
Keywords
Extended GCD, low-latency, blockchain, ASIC
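
For readers unfamiliar with the operation named in the abstract, the following is a minimal software sketch of what an XGCD computes: the greatest common divisor g of two integers a and b, together with Bézout coefficients x and y such that a*x + b*y = g. It uses the classical iterative extended Euclidean algorithm, not the modified extended k-ary algorithm or the RSD representation proposed in the brief; the function name `xgcd` and the sample operands are illustrative assumptions.

```python
from math import gcd


def xgcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y == g, for a, b >= 0.

    Classical extended Euclidean algorithm (an assumption for illustration;
    the brief itself uses a modified extended k-ary algorithm).
    """
    old_r, r = a, b   # remainders
    old_x, x = 1, 0   # Bezout coefficients of a
    old_y, y = 0, 1   # Bezout coefficients of b
    while r != 0:
        q = old_r // r
        # Invariant: old_r == a*old_x + b*old_y and r == a*x + b*y
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


# Small usage check with arbitrary sample operands.
g, x, y = xgcd(240, 46)
assert g == gcd(240, 46) and 240 * x + 46 * y == g
```

When g equals 1, `x % b` is the modular inverse of a modulo b, which is one reason the operation appears so often in cryptography. Each iteration depends on the result of the previous one, which illustrates the tight data dependency the abstract mentions.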