谷歌浏览器插件
订阅小程序
在清言上使用

Parallel Hardware Merge Sorter

2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)(2016)

引用 62|浏览53
暂无评分
摘要
Sorting has tremendous usage in the applications that handle massive amount of data. Existing techniques accelerate sorting using multiprocessors or GPGPUs where a data set is partitioned into disjunctive subsets to allow multiple sorting threads working in parallel. Hardware sorters implemented in FPGAs have the potential of providing high-speed and low-energy solutions but the partition algorithms used in software systems are so data dependent that they cannot be easily adopted. The speed of most current sequential sorters still hangs around 1 number/cycle. Recently a new hardware merge sorter broke this speed limit by merging a large number of sorted sequences at a speed proportional to the number of sequences. This paper significantly improves its area and speed scalability by allowing stalls and variable sorting rate. A 32-port parallel merge-tree that merges 32 sequences is implemented in a Virtex-7 FPGA. It merges sequences at an average rate of 31.05 number/cycle and reduces the total sorting time by 160 times compared with traditional sequential sorters.
更多
查看译文
关键词
Merge sort,sorting network,parallel sorting,FPGA
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要