AbstractFinite-state transducers and finite-state automata are efficient and natural representations for a large variety of problems. We describe a new algorithm for turning a finite-state transducer into the composition of two deterministic finite-state transducers such that the combined size of the derived transducers can be exponentially smaller than other known deterministic constructions. As a consequence, this can also be used to build deterministic representations of finite-state automata exponentially smaller than the deterministic minimal finite-state automata computed by the classic determinization and minimization algorithms. We also report experimental results on large-scale dictionaries and rule-based systems.
Categories and Subject Descriptors: E.1 [Data Structures]; E.2 [Data Storage Representations]; I.2.7 [Artificial Intelligence]: Natural Language Processing
Additional Key Words and Phrases: finite-state automaton, finite-state transducer, bimachine, regular expression
Selected references
- Christophe Reutenauer and Marcel-Paul Schutzenberger. Minimization of rational word functions. SIAM Journal on Computing, 20(4):669-685, August 1991.
- M. P. Schützenberger. A remark on finite transducers. Information and Control, 4(2-3):185-196, September 1961.