Iterated Straight-Line Programs

Gonzalo Navarro, Cristian Urbina

CoRR (2024)

Abstract
We explore an extension to straight-line programs (SLPs) that outperforms, for some text families, the measure $\delta$ based on substring complexity, a lower bound for most measures and compressors exploiting repetitiveness (which are crucial in areas like Bioinformatics). The extension, called iterated SLPs (ISLPs), allows rules of the form $A \rightarrow \prod_{i=k_1}^{k_2} B_1^{i^{c_1}} \cdots B_t^{i^{c_t}}$, for which we show how to extract any substring of length $\lambda$, from the represented text $T[1..n]$, in time $O(\lambda + \log^2 n \log\log n)$. This is the first compressed representation for repetitive texts breaking $\delta$ while, at the same time, supporting direct access to arbitrary text symbols in polylogarithmic time. As a byproduct, we extend Ganardi et al.'s technique to balance any SLP (so it has a derivation tree of logarithmic height) to a wide generalization of SLPs, including ISLPs.
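To make the rule form concrete, here is a minimal sketch of how an iterated rule could be expanded naively into the text it represents. It is only an illustration of the definition in the abstract, not the paper's extraction algorithm (which avoids full expansion). The rule encoding (`"concat"` and `"iter"` tuples), the function name `expand`, and the restriction to $k_1 \le k_2$ are assumptions made for this example.

```python
def expand(symbol, rules):
    """Naively expand `symbol` into the string it generates.

    Hypothetical encoding (an assumption for this sketch):
      - a symbol with no entry in `rules` is a terminal;
      - ("concat", B, C) is an ordinary SLP rule A -> B C;
      - ("iter", k1, k2, [(B_1, c_1), ..., (B_t, c_t)]) is an iterated rule
        A -> prod_{i=k1}^{k2} B_1^(i^c_1) ... B_t^(i^c_t), with k1 <= k2.
    """
    rule = rules.get(symbol)
    if rule is None:
        return symbol  # terminal symbol

    if rule[0] == "concat":
        _, left, right = rule
        return expand(left, rules) + expand(right, rules)

    # Iterated rule: for each i, emit every B_j repeated i**c_j times.
    _, k1, k2, factors = rule
    parts = []
    for i in range(k1, k2 + 1):
        for b, c in factors:
            parts.append(expand(b, rules) * (i ** c))
    return "".join(parts)


# Example grammar: A -> prod_{i=1}^{3} a^(i^1) b^(i^0), and S -> A A.
rules = {
    "A": ("iter", 1, 3, [("a", 1), ("b", 0)]),
    "S": ("concat", "A", "A"),
}

print(expand("A", rules))  # abaabaaab
```

In this example a single rule of constant size generates the blocks `ab`, `aab`, `aaab`, whose lengths grow with `i`; raising `k2` lengthens the text without enlarging the grammar beyond the bits needed to store `k2`.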