Annals of Combinatorics 4 (2000) 247-255
Avoided Strings in Bacterial Complete Genomes and a Related Combinatorial Problem
Bailin Hao, Huimin Xie, Zuguo Yu, and Guoyi Chen
Institute of Theoretical Physics, Academia Sinica, P. O. Box 2735, Beijing
Received December 12, 1998
AMS Subject Classification: 05A15, 92C40
Abstract. The visualization of avoided and under-represented strings in some bacterial complete genomes raises a combinatorial problem which may be solved either by using the Goulden-Jackson cluster method or by construction of the minimal finite automaton defined by the set of forbidden words of the corresponding language.
Keywords: complete genomes, avoided strings, language, enumeration, fractal
1. G. Deckert et al., The complete genome of the hyperthermophilic bacterium Aquifex aeolicus, Nature 392 353–358.
2. I. Goulden and D.M. Jackson, An inversion theorem for cluster decomposition of sequences with distinguished subsequences, J. London Math. Soc. 20 (1979) 567–576.
3. L.J. Guibas and A.M. Odlyzko, Periods in strings, J. Combin. Theory A 30 (1981) 19–42.
4. B.-L. Hao, H.-C. Lee, and S.-Y. Zhang, Fractals related to long DNA sequences and complete genomes, Chaos, Solitons and Fractals, 11 (2000) 825–836.
5. B.-L. Hao and W.-M. Zheng, Applied Symbolic Dynamics and Chaos, World Scientific, Singapore, 1998.
6. J. Noonan and D. Zeilberger, The Goulden–Jackson cluster method: extensions, applications and implementations; http://www.math.temple.edu/ ˜zeilberg.
7. N.J.A. Sloane and S. Plouffe, The Encyclopedia of Integer Sequences, Academic Press, 1995; http://akpublic.research.att.com/~njas/sequences.
8. S. Wolfram, Computation theory of cellular automata, Commun. Math. Phys. 96 (1984) 15–57.
9. Huimin Xie, Grammatical Complexity and One-Dimensional Dynamical Systems, World Scientific, Singapore, 1996.