Subwords in Reverse-Complement Order

Péter L. Erdős^{1}, Péter Ligeti^{2}, Péter Sziklai^{2}, and David C. Torney^{3}

elp@renyi.hu

{turul, sziklai}@cs.elte.hu

dtorney@earthlink.net

Annals of Combinatorics 10 (4) p.415-430 December, 2006

Abstract:

We examine finite words over an alphabet of pairs of letters, where
each word *w*_{1}*w*_{2}…*w*_{t} is identified with
its *reverse complement*
(where ). We seek the smallest *k* such that every word of length *n*, composed from
Γ, is uiquely determined by the set of its subwords of length up to *k*. Our almost sharp result is an analogue of
a classical result for “normal” words. This problem has its roots in bioinformatics.

References:

1. A.W.M. Dress and P.L. Erdõs, Reconstructing words from subwords in linear time, Ann. Combin. 8 (4) (2004) 457-462.

2. A.G. D'yachkov, P.L. Erdõs, A.J. Macula, V.V. Rykov, D.C. Torney, C.-S. Tung, P.A. Vilenkin, and P.S. White, Exordium for DNA Codes, J. Comb. Optim. 7 (4) (2003) 369- 379.

3. V.I. Levenshtein, On perfect codes in deletion and insertion metric, Discrete Math. 3 (1) (1991) 3-20; Translation in Discrete Math. Appl. 2 (1992) 241-258.

4. V.I. Levenshtein, Efficient reconstruction of sequences from their subsequences or supersequences, J. Combin. Theory Ser. A 93 (2001) 310-332.

5. V.I. Levenshtein, Efficient reconstruction of sequences, IEEE Trans. Inform. Theory 47 (1) (2001) 2-22.

6. M. Lothaire, Combinatorics on Words, Encyclopedia of Mathematics and its Applications 17, Addison-Wesley, Reading, Mass., 1983.

7. J. Manuch, Characterization of a word by its subwords, In: Developments in Language Theory, G. Rozenberg, et al. Ed., World Scientific Publ. Co., Singapore, (2000) pp. 210- 219.

8. I. Simon, Piecewise testable events, Lecture Notes in Comput. Sci. 33 (1975) 214-222.