Dynamic Entropy-Compressed Sequences and Full-Text Indexes.
Veli Mäkinen and Gonzalo Navarro
Given a sequence of n bits with binary zero-order entropy H_0, we present
a dynamic data structure that requires nH_0 + o(n) bits of space, which is
able of performing rank and select, as well as inserting and deleting bits
at arbitrary positions, in O(log n) worst-case time. This extends previous
results by Hon et al. [ISAAC 2003] achieving O(log n/ loglog n) time for
rank and select but O(polylog(n)) amortized time for inserting and
deleting bits, and requiring n+o(n) bits of space; and by Raman et
al. [SODA 2002] which have constant query time but a static structure.
In particular, our result becomes the first entropy-bound dynamic data
structure for rank and select over bit sequences.
We then show how the above result can be used to build a dynamic full-text
self-index for a collection of texts over an alphabet of size c, of
overall length n and zero-order entropy H_0. The index requires
nH_0 + o(n log c) bits of space, and can count the number of
occurrences of a pattern of length m in time O(m log n log c).
Reporting the occ occurrences can be supported in O(occ log^2 n log
c) time, paying O(n) extra space. Insertion of text to the collection
takes O(log n log c) time per symbol, which becomes
O(log^2 n log c) for deletions.
This improves a previous result by Chan et al. [CPM 2004].
As a consequence, we obtain an O(n log n log c) time construction
algorithm for a compressed self-index requiring nH_0 + o(n log c) bits
working space during construction.