Finding Missing Patterns
Shunsuke Inenage, Teemu Kivioja, and Veli Mäkinen
Abstract.
Consider the following problem: Find the shortest pattern that
does not occur in a given text. To make the problem
non-trivial, the pattern is required to consist only of characters
that occur in the text. This problem can be easily solved in linear
time using the suffix tree of the text. In this paper, we study an
extension of this problem, namely the missing patterns problem:
Find the shortest pair of patterns that do not occur close to
each others in a given text, i.e., the distance between their
occurrences is always greater than a given threshold R. We
show that the missing patterns problem can be solved in O(\min(R
n log n,n^2)) time, where n is the size of the text. For the
special case where both pairs are required to have the same length, we
give sub-quadratic solutions. The problem is motivated by optimization
of multiplexed nested-PCR.