Reconstructing Words from Subwords in Linear Time

AndreasW.M. Dress^{1} and Péter L. Erdős^{2}

dress@mis.mpg.de

elp@renyi.hu

Annals of Combinatorics 8 (4) p.457-462 December, 2004

Abstract:

Almost 30 years ago, M. Schützenberger and L. Simon established that two *n*-words
with letters drawn from a finite alphabet having identical sets of subwords of length up to are identical. In the context of coding theory, V.I. Levenshtein elaborated this result in a series
of papers. And further elaborations dealing with alphabets and sequences with reverse complementation
have been recently developed by P.L. Erdős, P. Ligeti, P. Sziklai, and D.C. Torney.
However, the algorithmic complexity of actually (re)constructing a word from its subwords has
apparently not yet explicitly been studied. This paper augments the work of M. Schützenberger
and L. Simon by showing that their approach can be reworked so as to provide a linear-time
solution of this reconstruction problem in the original setting studied in their work.

References:

1. P.L. Erdős, P. Ligeti, P. Sziklai, and D.C. Torney, Subwords in reverse-complement order, preprint, 2004.

2. V.I. Levenshtein, On perfect codes in deletion and insertion metric, Discrete Math. Appl. 2 (1992) 241--258.

3. V.I. Levenshtein, Efficient reconstruction of sequences from their subsequences or supersequences, J. Combin. Theory Ser. A 93 (2001) 310--332.

4. M. Lothaire, Combinatorics on Words, Addison-Wesley, Chapter 6, 1983, pp. 119--120.

5. I. Simon, Piecewise testable events, In: Automata Theory and Formal Languages, H. Brakhage ed., LNCS. 33 Springer Verlag, 1975, pp. 214--222.