Parallel Suffix Sorting for Large String Analytics

Parallel Processing and Applied Mathematics (2023)

Abstract
The suffix array is a fundamental data structure for efficient string analysis. It took about 26 years for sequential suffix array construction to reach $\mathcal{O}(n)$ time complexity with in-place sorting. In this paper, we develop the D-Limited Parallel Induce (DLPI) algorithm, the first $\mathcal{O}(\frac{n}{p})$-time parallel suffix array construction algorithm. DLPI rests on two ideas: (1) dividing the $\mathcal{O}(n)$-size problem into $p$ reduced sub-problems of size $\mathcal{O}(\frac{n}{p})$, so that they can be handled on $p$ processors in parallel; and (2) an efficient parallel induced-sorting method that establishes the correct order across all the reduced sub-problems. We give a complete description of the algorithm to show how the idea is implemented, together with a time and space complexity analysis and proofs of the algorithm's correctness and efficiency. The proposed DLPI algorithm can handle large strings with scalable performance.
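
For readers unfamiliar with the data structure, the following minimal Python sketch shows what a suffix array is and, very loosely, what "split into $p$ sub-problems of size about $n/p$, then combine" can look like. It is not the DLPI algorithm: it uses naive comparison sorting and a sequential merge rather than induced sorting, so it does not achieve the complexity stated in the abstract. The function names `naive_suffix_array` and `chunked_suffix_array` are hypothetical and chosen only for illustration.

```python
from heapq import merge


def naive_suffix_array(text):
    """Return the start positions of all suffixes of `text` in lexicographic order.

    Naive baseline: sorts by comparing whole suffixes, which is far slower
    than the linear-time algorithms the abstract refers to.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def chunked_suffix_array(text, p):
    """Illustrative only (NOT DLPI): split the n suffix start positions into
    at most p contiguous chunks of about n/p positions, sort each chunk
    independently, then merge the sorted chunks into one suffix array.
    """
    n = len(text)
    size = max(1, -(-n // p))  # ceil(n / p), at least 1

    def suffix_key(i):
        return text[i:]

    # Each chunk is an independent sub-problem; in a real parallel setting
    # these sorts could run on separate processors.
    chunks = [
        sorted(range(start, min(start + size, n)), key=suffix_key)
        for start in range(0, n, size)
    ]
    # k-way merge of the sorted chunks (sequential in this sketch).
    return list(merge(*chunks, key=suffix_key))


if __name__ == "__main__":
    print(naive_suffix_array("banana"))       # [5, 3, 1, 0, 4, 2]
    print(chunked_suffix_array("banana", 3))  # [5, 3, 1, 0, 4, 2]
```

The merge step here is the sequential bottleneck; according to the abstract, DLPI instead relies on a parallel induced-sorting method to obtain the correct global order across sub-problems.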
Keywords
Suffix Array, String Algorithm, Parallel Sorting, String Analysis, Optimal Algorithm