vmakinen@cs.Helsinki.FI
Department of Computer Science, P.O Box 26 (Teollisuuskatu 23)
FIN-00014 University of Helsinki, Finland.
Abstract. Suffix array is a widely used full-text index that allows fast searches
on the text. It is constructed by sorting all suffixes of the text in the
lexicographic order and storing pointers to the suffixes in this order.
Binary search is used for fast searches on the suffix array.
Compact suffix array is a compressed form of the suffix array that
still allows binary searches, but the search times are also dependent on
the compression. In this paper, we answer some open questions concerning
the compact suffix array, and study practical issues, such as the trade off
between compression and search times, and show how to reduce the space requirement
of the construction. Experimental results are provided in comparison with other
search methods. The results show that usually the size of a compact suffix array is less than twice
the size of the text, while the search times are still comparable to those of suffix
arrays.