The Mean and Variance of the Numbers of r-Pronged Nodes and r-Caterpillars in Yule-Generated Genealogical Trees
Noah A. Rosenberg
Department of Human Genetics and Bioinformatics Program, University of Michigan, 2017 Palmer Commons, 100 Washtenaw Ave, Ann Arbor, MI 48109-2218, USA
noahr@usc.edu
Annals of Combinatorics 10 (1) p.129-146 March, 2006
AMS Subject Classification: 05C05, 92D15
Abstract:
The Yule model is a frequently-used evolutionary model that can be utilized to generate random genealogical trees. Under this model, using a backwards counting method differing from the approach previously employed by Heard (Evolution 46: 1818-1826), for a genealogical tree of n lineages, the mean number of nodes with exactly r descendants is computed (2 ≤ r ≤ n-1). The variance of the number of r-pronged nodes is also obtained, as are the mean and variance of the number of r-caterpillars. These results generalize computations of McKenzie and Steel for the case of r = 2 (Math. Biosci. 164: 81-92, 2000). For a given n, the two means are largest at r = 2, equaling 2n/3 for n ≥ 5. However, for n ≥ 9, the variances are largest at r = 3, equaling 23n/420 for n ≥ 7. As n → ∞, the fraction of internal nodes that are r-caterpillars for some r approaches (e^2 - 5)/4 ≈ 0.59726.
Keywords: binary search tree, cherries, coalescent, genealogy, labeled topology, pectinate
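
As a quick numerical check of the limiting constant quoted in the abstract, the following minimal Python sketch evaluates (e^2 - 5)/4 directly and compares it with the tail of the exponential series for e^2 starting at k = 3, which equals the same quantity by elementary algebra. The function names are hypothetical, and the series form is only an illustrative identity, not necessarily the derivation used in the paper.

```python
import math


def limiting_fraction_closed_form() -> float:
    """Evaluate the constant (e^2 - 5)/4 stated in the abstract."""
    return (math.exp(2) - 5) / 4


def limiting_fraction_series(terms: int = 30) -> float:
    """Evaluate (1/4) * sum_{k>=3} 2^k / k!, truncated after `terms` terms.

    Since e^2 = sum_{k>=0} 2^k / k! and the k = 0, 1, 2 terms add up to 5,
    the tail from k = 3 equals e^2 - 5. (Illustrative identity only.)
    """
    return sum(2**k / math.factorial(k) for k in range(3, 3 + terms)) / 4


if __name__ == "__main__":
    print(f"closed form: {limiting_fraction_closed_form():.5f}")
    print(f"series tail: {limiting_fraction_series():.5f}")
```

Both evaluations agree with the rounded value 0.59726 given in the abstract.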

References:

1. D. Aldous, Probability distributions on cladograms, In: Random Discrete Structures, D. Aldous and R. Pemantle, Eds., Springer-Verlag, New York, (1996) pp. 1--18.

2. D.J. Aldous, Stochastic models and descriptive statistics for phylogenetic trees, from Yule to today, Statist. Sci. 16 (2001) 23--34.

3. M.G.B. Blum and O. François, On statistical tests of phylogenetic tree imbalance: the Sackin and other indices revisited, Math. Biosci. 195 (2005) 141--153.

4. J.K.M. Brown, Probabilities of evolutionary trees, Syst. Biol. 43 (1994) 78--91.

5. J.H. Degnan and L.A. Salter, Gene tree distributions under the coalescent process, Evolution 59 (2005) 24--37.

6. L. Devroye, Limit laws for local counters in random binary search trees, Random Structures Algorithms 2 (1991) 303--316.

7. L. Devroye, Limit laws for sums of functions of subtrees of random binary search trees, SIAM J. Comput. 32 (2003) 152--171.

8. A.W.F. Edwards, Estimation of the branch points of a branching diffusion process, J. Roy. Statist. Soc. Ser. B 32 (1970) 155--174.

9. J. Felsenstein, Inferring Phylogenies, Sinauer, Sunderland, MA, 2004.

10. Y.X. Fu, Statistical properties of segregating sites, Theoret. Popul. Biol. 48 (1995) 172--197.

11. H.W. Gould, Combinatorial Identities, Gould Publications, Morgantown, WV, 1972.

12. E.F. Harding, The probabilities of rooted tree-shapes generated by random bifurcation, Adv. Appl. Probab. 3 (1971) 44--77.

13. S.B. Heard, Patterns in tree balance among cladistic, phenetic, and randomly generated phylogenetic trees, Evolution 46 (1992) 1818--1826.

14. R.R. Hudson, Gene genealogies and the coalescent process, Oxf. Surv. Evol. Biol. 7 (1990) 1--44.

15. J.F.C. Kingman, On the genealogy of large populations, J. Appl. Probab. 19A (1982) 27--43.

16. W.P. Maddison and M. Slatkin, Null models for the number of evolutionary steps in a character on a phylogenetic tree, Evolution 45 (1991) 1184--1197.

17. H.M. Mahmoud, Evolution of Random Search Trees, Wiley, New York, 1992.

18. A. McKenzie and M. Steel, Distributions of cherries for two models of trees, Math. Biosci. 164 (2000) 81--92.

19. A.O. Mooers and S.B. Heard, Inferring evolutionary process from phylogenetic tree shape, Quart. Rev. Biol. 72 (1997) 31--54.

20. A.O. Mooers and S.B. Heard, Using tree shape, Syst. Biol. 51 (2002) 833--834.

21. F. Murtagh, Counting dendrograms: a survey, Discrete Appl. Math. 7 (1984) 191--199.

22. J.E. Neigel and J.C. Avise, Phylogenetic relationships of mitochondrial DNA under various demographic models of speciation, In: Evolutionary Processes and Theory, S. Karlin and E. Nevo, Eds., Academic Press, New York, (1986) pp. 515--534.

23. M. Nordborg, On the probability of Neanderthal ancestry, Amer. J. Hum. Genet. 63 (1998) 1237--1240.

24. M. Nordborg, Coalescent theory, In: Handbook of Statistical Genetics, Chapter 7, D.J. Balding, M. Bishop, and C. Cannings, Eds., Wiley, Chichester, UK, (2001) pp. 179--212.

25. R.D.M. Page, Random dendrograms and null hypotheses in cladistic biogeography, Syst. Zool. 40 (1991) 54--62.

26. P. Pamilo and M. Nei, Relationships between gene trees and species trees, Mol. Biol. Evol. 5 (1988) 568--583.

27. M. Petkovšek, H.S. Wilf, and D. Zeilberger, A = B, Peters, Wellesley, MA, 1996.

28. J.A. Rice, Mathematical Statistics and Data Analysis, 2nd edition, Duxbury Press, Belmont, CA, 1995.

29. N.A. Rosenberg, The probability of topological concordance of gene trees and species trees, Theoret. Popul. Biol. 61 (2002) 225--247.

30. N.A. Rosenberg, The shapes of neutral gene genealogies in two species: probabilities of monophyly, paraphyly, and polyphyly in a coalescent model, Evolution 57 (2003) 1465--1477.

31. N.A. Rosenberg, Gene genealogies, In: Evolutionary Genetics: Concepts and Case Studies, C.W. Fox and J.B. Wolf, Eds., Oxford University Press, Oxford, 2005.

32. C. Semple and M. Steel, Phylogenetics, Oxford University Press, Oxford, 2003.

33. J.B. Slowinski, Probabilities of n-trees under two models: a demonstration that asymmetrical interior nodes are not improbable, Syst. Zool. 39 (1990) 89--94.

34. J.B. Slowinski and C. Guyer, Testing the stochasticity of patterns of organismal diversity: an improved null model, Amer. Naturalist 134 (1989) 907--921.

35. M. Steel and A. McKenzie, Properties of phylogenetic trees generated by Yule-type speciation models, Math. Biosci. 170 (2001) 91--112.

36. J. Stone and J. Repka, Using a nonrecursive formula to determine cladogram probabilities, Syst. Biol. 47 (1998) 617--624.

37. J.R. Stone, Probabilities for completely pectinate and symmetric cladograms, Cladistics 19 (2003) 565--566.

38. F. Tajima, Evolutionary relationship of DNA sequences in finite populations, Genetics 105 (1983) 437--460.

39. G.U. Yule, A mathematical theory of evolution based on the conclusions of Dr. J.C. Willis, F.R.S., Philos. Trans. R. Soc. Lond. Ser. B 213 (1924) 21--87.