Annals of Combinatorics 3 (1999) 1-12
Patching Up X-Trees
Sebastian Böcker1, Andreas W.M. Dress1, and Mike A. Steel 2
1GK Struktursprozesse, FSP Mathematisierung, Universität Bielefeld, PF 100 131, 33501 Bielefeld, Germany
2Biomathematics Research Centre, University of Canterbury, Private Bag 4800, Christchurch, New Zealand
Received January 8, 1999
AMS Subject Classification: 05C05, 92D15
Abstract.A fundamental problem in many areas of classification, and particularly in biology, is the reconstruction of a leaf-labeled tree from just a subset of its induced subtrees. Without loss of generality, we may assume that these induced subtrees all have precisely four leaves. Of particular interest is determining whether a collection of quartet subtrees uniquely defines a parent tree. Here, we solve this problem in the case where the collection of quartet trees is of minimal size, by studying encodings of binary trees by such quartet trees. We obtain a characterization of minimal encodings that exploits an underlying ``patchwork'' structure. As we will show elsewhere, this allows one to obtain a polynomial time algorithm for certain instances of the problem of reconstructing trees from subtrees.
Keywords: trees, supertrees, tree amalgamation, quartet encodings of trees, hierarchies, patchworks
1. H.-J. Bandelt and A. Dress, Reconstructing the shape of a tree from observed dissimilarity data, Adv. in Appl. Math. 7 (1986) 309–343.
2. S. Böcker, D. Bryant, A.W. Dress, and M. Steel, Algorithmic aspects of tree amalgamation, in preparation.
3. S. Böcker and A.W. Dress, Patchworks, 1999, to appear in Adv. in Math.
4. S. Böcker, A.W. Dress, and M. Steel, Most parsimonious quartet encodings of binary trees, in preparation.
5. P. Buneman, The recovery of trees from measures of dissimilarity, In: Mathematics in the Archaeological and Historical Sciences, F. Hodson, D. Kendall, and P. Tautu, Eds., Edinburgh University Press, Edinburgh, 1971, pp. 387–395.
6. R.L. Cann, M. Stoneking, and A.C. Wilson, Mitochondrial DNA and human evolution, Nature 325 (1987) 31–36.
7. H. Colonius and H.-H. Schulze, Tree structures for proximity data, British J. Math. Statist. Psych. 34 (1981) 167–180.
8. H. Colonius and H.-H. Schulze, Repräsentation nichtnumerischer Ähnlichkeitsdaten durch Baumstrukturen, Psych. Beitr. 21 (1979) 98–111.
9. P.L. Erdös, L.A. Székely, M. Steel, and T. Warnow, A few logs suffice to build (almost) all trees, Random Structures Algorithms 14 (1999) 153–184.
10. D. Huson, S. Nettles, L. Parida, T. Warnow, and S. Yooseph, The disk-covering method for tree reconstruction, In: Proceedings of “Algorithms and Experiments” (ALEX98), Trento, Italy, Feb. 9–11, 1998, R. Battiti and A. Bertossi, Eds., 1998, pp. 62–75. Available from WWWsite http://rtm.science.unitn.it/alex98/proceedings.html.
11. D. Huson and T.J. Warnow, Obtaining highly accurate topology and evolutionary estimates of evolutionary trees from very short sequences, accepted for RECOMB 99.
12. P.M. Robinson, Computer-assisted stemmatic analysis and ‘best-text’ historical editing, In: Studies in Stemmatology, P. Reenen and M. van Mulken, Eds., John Benjamins Publishing, Amsterdam, 1996, pp. 71–103.
13. E. Schröder, Vier Kombinatorische Probleme., Z. Math. Phys. 15 (1870) 361–376.
14. M. Steel, The complexity of reconstructing trees from qualitative characters and subtrees, J. Classification 9 (1992) 91–116.
15. M. Steel, A.W. Dress, and S. Böcker, Some simple but fundamental limitations on supertree and consensus tree methods, 1999, submitted.
16. K. Strimmer and A. von Haeseler, Quartet puzzling: A quartet maximum likelihood method for reconstructing tree topologies, Mol. Biol. Evol. 13 (1996) 964–969.
17. D.L. Swofford, G.J. Olsen, P.J.Waddell, and D.M. Hillis, Phylogenetic inference, Molecular Systematics, D. Hillis, C.Moritz, and B.Marble, Eds., Sinauer Associates, second ed., 1996, pp. 407–514.
18. T. Warnow, Mathematical approaches to comparative linguistics, Proc. Nat. Acad. Sci. U.S.A. 94 (1997) 6585–6590.
19. S.J. Willson, Measuring inconsistency in phylogenetic trees, J. Theor. Biol. 190 (1998) 15– 36.