Probabilities on Phylogenetic Trees

James Francis Lyons-Weiler weiler at ERS.UNR.EDU
Tue Sep 9 19:21:00 CDT 1997


There are, as Stuart Poss intimated, many ways of thinking about the
probabilities of phylogenetic trees.  It's important to consider which
mode of thought one is in.

P(T) can refer to the probability of a given tree (terminal nodes
labelled), which is not
really a question that gets us very far, except to appreciate that in
general, any tree is a possibility as a hypothesis, and that in the model
of evolution we carry around in our heads, any tree may manifest itself
during the course of evolution of a group.  This does not necessarily mean
that all topologies (tree shapes) can be expected with an equal frequency
under these conditions; in fact, those who study tree shape have found
that a null birth-death process does not yield equal probabilities of all
topologies.
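
To make the point about unequal topology frequencies concrete, here is a
minimal sketch. It assumes the simplest null model, a pure-birth (Yule)
process in which every tip is equally likely to split next; that choice,
and the function names, are my illustration and not anything from the
tree shape literature. For four tips there are only two unlabelled
shapes, balanced and caterpillar. Under this model the balanced shape
should turn up about 1/3 of the time, whereas drawing uniformly from the
15 labelled rooted trees would give it 3/15 = 1/5.

```python
import random
from collections import Counter


def canonical(node):
    """Return an order-independent nested tuple describing a tree shape."""
    if not node:                      # a tip is stored as an empty list
        return ()
    return tuple(sorted(canonical(child) for child in node))


def yule_shape(n_tips, rng):
    """Grow one tree by repeatedly splitting a uniformly chosen tip."""
    root = []
    tips = [root]
    while len(tips) < n_tips:
        tip = tips.pop(rng.randrange(len(tips)))
        left, right = [], []
        tip.extend([left, right])     # the chosen tip becomes an internal node
        tips.extend([left, right])
    return canonical(root)


def shape_frequencies(n_tips, n_reps, seed=0):
    """Estimate the relative frequency of each unlabelled shape."""
    rng = random.Random(seed)
    counts = Counter(yule_shape(n_tips, rng) for _ in range(n_reps))
    return {shape: count / n_reps for shape, count in counts.items()}


if __name__ == "__main__":
    for shape, freq in shape_frequencies(4, 20000).items():
        print(f"{freq:.3f}  {shape}")
```

The shape (((), ()), ((), ())) is the balanced tree and
((), ((), ((), ()))) is the caterpillar.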

This is, by the way, the P(T) Stuart Poss seems to be referring to, but it
is not the P(T) that Richard Zander is referring to.

Another P(T) is perhaps better annotated as p(t = T); note that lowercase
p denotes an estimate of the probability, whereas P refers to the real
probability that in fact applies to an event (i.e., whoever uses P is
omniscient, as in simulations).  Here t refers to the tree returned under
a criterion; p(t = T) is the probability that the estimate t is the same
as T.
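
In a simulation, where T is known, p(t = T) can be estimated as the
fraction of replicates in which the returned tree matches T. The sketch
below shows only that bookkeeping. Trees are nested tuples of taxon
names and are compared by their sets of clades, so the left-right order
of branches does not matter. The function toy_inference is a
hypothetical placeholder, not a real method: it simply returns T with a
fixed probability, and one would swap in data simulation plus one's
criterion of choice.

```python
import random


def clades(tree):
    """Return the set of clades (frozensets of tip names) in a tree."""
    found = set()

    def walk(node):
        if isinstance(node, str):            # a tip
            return frozenset([node])
        tips = frozenset().union(*(walk(child) for child in node))
        found.add(tips)
        return tips

    walk(tree)
    return found


def same_topology(t, T):
    """True if the two rooted trees contain exactly the same clades."""
    return clades(t) == clades(T)


def estimate_p_match(T, infer, n_reps, rng):
    """Monte Carlo estimate of p(t = T): the share of replicates with t equal to T."""
    hits = sum(same_topology(infer(T, rng), T) for _ in range(n_reps))
    return hits / n_reps


def toy_inference(T, rng, accuracy=0.6):
    """Hypothetical stand-in for 'simulate data on T, then estimate a tree'."""
    wrong = (("A", "C"), ("B", "D"))
    return T if rng.random() < accuracy else wrong


if __name__ == "__main__":
    T = (("A", "B"), ("C", "D"))
    print(estimate_p_match(T, toy_inference, 5000, random.Random(0)))
```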

As far as I can tell, Richard is referring to p(t = T), and not P(T); his
arguments presume that the evolutionary truth is foregone; so the gist of
the argument is that p(t = T) appears too low under most criteria in
general.  Some folks don't think in terms of p(t = T); they like to think
in terms of p(t resembles T).  I prefer p(t = T) because the people
outside the field tend to think that a published tree estimate must be
true.  I've heard ecologists say that the phylogeny for this group is
known, for instance.  What a BAD thing for our science.  There is a
finite number of groups to study, and we could publish our science into
oblivion.  Importantly (I think VERY importantly), proponents of methods
of phylogenetic inference do the entire endeavor a major disservice
unless they state clearly and loudly the limitations of their favorite
methods as they currently understand them.  People may adopt a method
if its proponents list all the reasons why it's useful, but then too much
confidence is placed in trees that are, at best, gross estimates and are
not reconstructions in the true sense.  If my vase breaks, I can
accurately reconstruct it because I know what it looked like.   If
someone sends me their broken vase, and I have never seen the complete
object, I can attempt to reconstruct it, but that's all.  It's a subtle
distinction, I admit, but the use of the term reconstruction reveals an
overly positivistic approach.  The same problem plagues the definition of
synapomorphy as something we get from a tree.  For me, synapomorphies
exist whether we draw trees or not.  Archie suggests the use of
"evolutionary homoplasy" to refer to true homoplasy, "evolutionary
synapomorphy" to refer to true synapomorphy, and so on.

A final thought on p(t = T): this can in fact depend on T.  That makes
the statistical analysis of p(t = T) extremely difficult, but
certainly not impossible.  It's worth doing even if only to learn
how many chances in hades we have of achieving T.

James



