Statistics for Phylogenetic Trees in the Presence of Stickiness
arXiv (2024)
Abstract
Samples of phylogenetic trees arise in a variety of evolutionary and
biomedical applications, and the Fréchet mean in Billera-Holmes-Vogtmann tree
space is a summary tree shown to have advantages over other mean or consensus
trees. However, use of the Fréchet mean raises computational and statistical
issues which we explore in this paper. The Fréchet sample mean is known often
to contain fewer internal edges than the trees in the sample, and in this
circumstance calculating the mean by iterative schemes can be problematic due
to slow convergence. We present new methods for identifying edges which must
lie in the Fréchet sample mean and apply these to a data set of gene trees
relating organisms from the Apicomplexa, which cause a variety of parasitic
infections. When a sample of trees contains a significant level of
heterogeneity in the branching patterns, or topologies, displayed by the trees,
the Fréchet mean is often a star tree, lacking any internal edges. In this
situation, though not only in this one, the population Fréchet mean is affected
by a non-Euclidean phenomenon called stickiness, which affects asymptotics, and
we examine two data sets for which the mean tree is a star tree. The first
consists of trees representing the physical shape of artery structures in a
sample of medical images of human brains in which the branching patterns are
very diverse. The second consists of gene trees from a population of baboons in
which there is evidence of substantial hybridization. We develop hypothesis
tests which work in the presence of stickiness. The first is a test for the
presence of a given edge in the Fréchet population mean; the second is a
two-sample test for differences in two distributions which share the same
sticky population mean.