Parallel CRC optimisations on the x64 architecture: a per-partes method

Jiri Kaspar, Ivan Simecek

INTERNATIONAL JOURNAL OF PARALLEL EMERGENT AND DISTRIBUTED SYSTEMS (2024)

Abstract
Each CPU generation provides more resources and new features, which make it possible to run algorithms faster and with a higher degree of parallelism. The article discusses methods for optimising CRC generation algorithms for long data blocks, taking into account the capabilities of contemporary systems. We analysed known software CRC algorithms and combined all known principles into a solution that scales across multiple CPU cores on single- and multi-socket systems. Various algorithms were evaluated on contemporary multicore systems with 1 x 4, 1 x 64, 2 x 12, and 4 x 26 cores. The results show how performance is affected by the architecture of the memory subsystem. Compared to the original sequential Sarwate algorithm, our algorithms are 48.0, 51.1, 38.0, and 28.8 times faster on these systems, respectively.
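For readers unfamiliar with the baseline, the following is a minimal sketch of a byte-wise table-driven (Sarwate-style) CRC and of how a long block can be split into independently processed parts whose CRCs are merged afterwards. It is not the authors' implementation: the CRC-32 polynomial 0xEDB88320, the helper names (`crc32_sarwate`, `crc32_combine`, `crc32_chunked`), and the thread pool are all assumptions made for illustration. The merge step here feeds zero bytes one at a time, which is simple but linear in the part length; practical implementations typically replace it with a precomputed or logarithmic-time shift.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

# Assumption: the common reflected CRC-32 polynomial (IEEE 802.3).
POLY = 0xEDB88320


def make_table():
    """Build the 256-entry lookup table used by the byte-wise algorithm."""
    table = []
    for byte in range(256):
        reg = byte
        for _ in range(8):
            reg = (reg >> 1) ^ POLY if reg & 1 else reg >> 1
        table.append(reg)
    return table


TABLE = make_table()


def crc32_sarwate(data):
    """Sequential table-driven CRC-32: one table lookup per input byte."""
    reg = 0xFFFFFFFF
    for b in data:
        reg = (reg >> 8) ^ TABLE[(reg ^ b) & 0xFF]
    return reg ^ 0xFFFFFFFF


def crc32_combine(crc_a, crc_b, len_b):
    """CRC of A||B from crc(A), crc(B) and len(B), using CRC linearity.

    crc_a is advanced over len_b zero bytes, then XORed with crc_b.
    This naive shift is O(len_b); it is kept simple on purpose.
    """
    reg = crc_a
    for _ in range(len_b):
        reg = (reg >> 8) ^ TABLE[reg & 0xFF]
    return reg ^ crc_b


def crc32_chunked(data, parts=4):
    """Split data into parts, compute each CRC independently, then merge."""
    size = max(1, -(-len(data) // parts))  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    # Hypothetical parallel step: each part's CRC depends only on that part.
    with ThreadPoolExecutor() as pool:
        crcs = list(pool.map(crc32_sarwate, chunks))
    total = 0  # CRC of the empty message
    for chunk, crc in zip(chunks, crcs):
        total = crc32_combine(total, crc, len(chunk))
    return total


if __name__ == "__main__":
    payload = bytes(range(256)) * 40
    print(hex(crc32_sarwate(payload)), hex(crc32_chunked(payload)))
    print(crc32_chunked(payload) == zlib.crc32(payload))
```

In CPython the thread pool only illustrates the structure, since the interpreter lock prevents a real speed-up for this pure-Python loop; the point is that the per-part CRCs are independent and the merge step depends only on each part's CRC and length.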
Keywords
CRC, table-driven CRC calculation, parallel CRC calculation, multi-core programming, multithreading