TKTL / Guest Lecture

U N I V E R S I T Y O F H E L S I N K I
D E P A R T M E N T O F C O M P U T E R S C I E N C E

Guest lectures

Prof. Amihood Amir

Bar-Ilan University

Pattern Matching with Swaps

and

Prof. Gad Landau

University of Haifa

Matching for Run-Length Encoded Strings

Date Thursday, September 11th

Place
Department of Computer Science,
Teollisuuskatu 23 room A414

Time 14:15 - 16

Abstracts

A. Amir: Pattern Matching with Swaps
Let a text string $T$ of $n$ symbols and a pattern string $P$ of $m$ symbols from alphabet $\Sigma$ be given. A {\em swapped version} $T'$ of $T$ is a length $n$ string derived from $T$ by a series of {\em local swaps}, (i.e.\ $t'_{\ell} \leftarrow t_{\ell +1}$ and $t'_{\ell +1} \leftarrow t_{\ell}$) where each element can participate in {\em no more than one swap}.
The {\em Pattern Matching with Swaps} problem is that of finding all locations $i$ for which there exists a swapped version $T'$ of $T$ where there is an exact matching of $P$ in location $i$ of $T'$. It has been an open problem whether swapped matching can be done in less than $O(mn)$ time.
We show the first algorithm that solves the pattern matching with swaps problem in time $o(mn)$. We present an algorithm whose time complexity is $O(n m^{1/3} \log m \log^2 \sigma)$ for a general alphabet $\Sigma$, where $\sigma = min(m,|\Sigma|)$.

G. Landau: Matching for Run-Length Encoded Strings
Measuring the similarity between two strings, through such standard measures as Hamming distance, edit distance, and longest common subsequence, is one of the fundamental problems in pattern matching. We consider the problem of finding the longest common subsequence of two strings. A well-known dynamic programming algorithm computes the longest common subsequence of strings $X$ and $Y$ in $O(|X| \cdot |Y|)$ time. In this paper, we develop significantly faster algorithms for a special class of strings which emerge frequently in pattern matching problems.
A string $S$ is run-length encoded if it is described as an ordered sequence of pairs $(\sigma,i)$, each consisting of an alphabet symbol $\sigma$ and an integer $i$. Each pair corresponds to a run in $S$ consisting of $i$ consecutive occurrences of $\sigma$. Such a run-length encoded string can be significantly shorter than the expanded string representation.
In this paper, we present the first algorithm which finds the longest common subsequence of strings $X$ and $Y$ in time polynomial in the size of the compressed strings. Our final algorithm runs in $O(k l \log ( k l ) )$ time, where $k$ and $l$ are the compressed lengths of strings $X$ and $Y$.
(Joint work with Alberto Apostolico and Steven Skiena)

Date	Thursday, September 11th
Place	Department of Computer Science, Teollisuuskatu 23 room A414
Time	14:15 - 16

Guest lectures

Prof. Amihood Amir

Pattern Matching with Swaps

Prof. Gad Landau

Matching for Run-Length Encoded Strings

Abstracts

Tervetuloa!