Iterated Straight-Line Programs
CoRR (2024)
Abstract
We explore an extension to straight-line programs (SLPs) that outperforms,
for some text families, the measure δ based on substring complexity, a
lower bound for most measures and compressors exploiting repetitiveness (which
are crucial in areas like Bioinformatics). The extension, called iterated SLPs
(ISLPs), allows rules of the form
$A \rightarrow \prod_{i=k_1}^{k_2} B_1^{i^{c_1}} \cdots B_t^{i^{c_t}}$, for which we show how to extract any
substring of length λ, from the represented text $T[1..n]$, in time
$O(\lambda + \log^2 n \log\log n)$. This is the first compressed representation
for repetitive texts breaking δ while, at the same time, supporting
direct access to arbitrary text symbols in polylogarithmic time. As a
byproduct, we extend Ganardi et al.'s technique to balance any SLP (so it has a
derivation tree of logarithmic height) to a wide generalization of SLPs,
including ISLPs.
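
To make the rule format concrete, the following is a minimal sketch of how an iterated rule $A \rightarrow \prod_{i=k_1}^{k_2} B_1^{i^{c_1}} \cdots B_t^{i^{c_t}}$ expands into text. It is a naive full expansion for illustration only, not the access algorithm of the paper. The function name `expand`, the tuple encoding of rules, and the restriction to $k_1 \le k_2$ are assumptions made for this example.

```python
def expand(symbol, rules):
    """Naively expand a symbol of a (hypothetical) ISLP encoding into its text.

    rules maps a nonterminal to one of:
      ("concat", B, C)               classic SLP rule A -> B C
      ("iter", k1, k2, [(B, c), ...]) iterated rule, assuming k1 <= k2
    Any symbol not present in `rules` is treated as a terminal.
    """
    rule = rules.get(symbol)
    if rule is None:
        return symbol  # terminal symbol: represents itself
    if rule[0] == "concat":
        _, left, right = rule
        return expand(left, rules) + expand(right, rules)
    # Iterated rule: for each i in k1..k2, emit every B_j repeated i**c_j times.
    _, k1, k2, factors = rule
    pieces = []
    for i in range(k1, k2 + 1):
        for b, c in factors:
            pieces.append(expand(b, rules) * (i ** c))
    return "".join(pieces)


# A -> prod_{i=1}^{3} a^{i^1} b^{i^0}, i.e. "ab" + "aab" + "aaab"
rules = {"A": ("iter", 1, 3, [("a", 1), ("b", 0)])}
text = expand("A", rules)
print(text)
```

In this toy instance a single rule of constant size yields a text whose length grows quadratically with $k_2$, which gives a feel for why iteration can be more succinct than plain concatenation rules.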