# [Fwd: Re: Probabilities on Phylogenetic Trees]

Tom DiBenedetto tdib at UMICH.EDU
Wed Sep 17 13:43:14 CDT 1997

```
On Tue, 16 Sep 1997 12:53:42 -0700, James Francis Lyons-Weiler wrote:

[in response to my treatise on evidence using the analogy of a murder
investigation]

>        How can you narrow the scope down to those that have no
>        alibi, those that were being blackmailed, etc. unless
>        you have checked all 6 billion?  You can't really.

Right James, I really do need to check out every species on earth to
see whether it too has hair,,and you know what,,,,we have done that
for all known species, and will do that eventually for the
others,,and if it turns out that we find some critter with hair but
none of the other characteristics of mammals, we might learn
something new and unexpected, and this might even lead us to accept
some different trees,,,so?

>        So
>        you really rely on the covariance of the evidence,

I rely on the congruence of evidence. If this is a foreign concept to
you, then perhaps you are approaching systematics from the wrong
conceptual framework.

>        Statistics merely formalizes the process you describe.

Actually, we have a method which formalizes the process quite well,
thank you.

>        Our guesses include that the victim was not a rampant
>        blackmailer, for example (a Bayesian prior, whether
>        you admit it or not).

We may be diving a bit too deeply into the analogy, but I don't sense
that we make, or need to make, such "guesses".

>>. The
>> probability of a result may be low in the face of no evidence, but is
>> high in the context of the evidence.
>
>        Huh?  The probability of the evidence is high in the
>        context of the evidence?  No kidding...

gee, I thought I said the probability of the result.....as in the
probability that the butler did it is high, in the context of the
evidence, but would be low sans evidence.

>        It takes a healthy dose of positivism to say that the
>        tree we get from the evidence we collect then has a
>        high probability...

I said explicitly that it has a high probability *given the
evidence*,,
In plain English that means that if "piece of evidence A" indicates
grouping 1, and "piece of evidence B" indicates grouping 2 (which is
an internested subset of grouping 1) and "piece of evidence C"
indicates grouping 3 (which is an internested subset of grouping 2)
then overall pattern T is a highly probable general pattern for this
evidence if T contains grouping 3 nested within 2 nested within 1. If
pattern X has a somewhat different set of internested groupings it is
a less probable general result for this evidence.

>In fact, for the shortest tree,
>> it is maximized, relative to the evidence.
>
>        Huh, again?  The probability of a tree is maximized
>        relative to evidence if it is the shortest tree?
>        What does that mean? What are you saying?

The probability that a given tree is a reconstruction of the
phylogenetic pattern is maximized for the tree which orders
homologies parsimoniously; i.e. with the minimal number of steps. Is
this really that complicated?

>        That
>        given the shortest tree, the shortest tree becomes
>        a highly probable event?

No James, that given the set of homologies, it is highly probable
that the shortest tree reflects the pattern of taxic divergence.

>        Others rampantly disagree.  The LESS probable a result
>        is, the more surprised we are, and the more
>        corroborated the result is.

Well, I must admit that I am quite surprised by some of the results
of statistical phylogenetics. However I tend to feel that the results
gain corroboration from the amount of empirical evidence which
supports them, rather than from some quantification of my surprise at
the findings.
But I sense that you are embarking once again on your old mission to
find some Popperian principle to turn against the cladists. But you are
mired, as ever, in a perspective which simply cannot see the
conceptual level at which systematics is practiced. Your sense of
what the "hypothesis" in systematics is, is firmly tied to the tree
itself. As I have tried to explain several times, systematics is
about the study of characters, and the formulation of hypotheses of
*homology* for those characters. The "bold hypotheses", the
"improbable results" which Popper alludes to, are found in
systematics in such statements as "all of these 4000 instances of
ectodermal outpocketings are really the same thing, and we will find
other characters which will display distribution patterns which are
congruent with this". Every homology hypothesis is also a grouping
hypothesis. This homology defines a monophyletic group. A parsimony
criterion is the implementation of a *test* of these hypotheses (ever
hear of the "test of congruence"?). In good Popperian fashion we are
subjecting our hypotheses to a severe test (and the advocates of
"total evidence" are merely asserting that no hiding is allowed). The
parsimony criterion orders these individual grouping/homology
hypotheses into a hierarchy which reveals what congruence is present.
At this step we are not trying to formulate bold, low probability
hypotheses, we are testing a set of hypotheses under a criterion which
demands of them that they be congruent.  The most parsimonious
solution is the
set of homology hypotheses which survive this test, and since the
homology hypotheses are also grouping hypotheses, the groups which
emerge are accepted as those which are most consistent with what we

>        You can't discuss the
>        objective application of probability theory if
>        one never bothers to measure the probability of an
>        event.  I roll a die.  It lands on six.  What was the
>        P(6)?  I look at the evidence.  P(6) = 1.0?  No.
>        P(Tom will get an F on a probability exam) = ????

But James, I don't give a damn what the probability of 6 WAS, I care
about what the result IS. It is either 6 or it is not 6. You are
trying to get me to pretend that it is not 6 because 6 might be
highly improbable. You are telling me that the "C" at this site in
taxon A is not homologous to the "C" at that site in taxon B because
you think you know enough about how nucleotides change that you can
calculate that it is somewhat less probable they are homologous than
not. And I say that you don't know how nucleotides change in these
taxa, and you won't begin to make coherent statements about how they
do until AFTER you have a phylogeny in hand.
The valid way to assess the homology of these "C"s is to test an
assertion of their homology against the expectations of the theory of
homology; i.e. that if these "C"s are homologous, their distribution
will be congruent with all other homologies described from all areas
of the organism. This is the test that the parsimony criterion
implements.

>        (For the uninitiated, Tom is immune to real criticism,
>        so my flames are not ever really felt by him).

luv ya too James,,,(and this is the guy who wrote yesterday "gee, why
can't we all get along?")???

>        Again, most statisticians would disagree.  We use
>        parsimony to estimate the population mean; in fact,
>        the sample mean is the value with minimum error around
>        it, and is the maximum likelihood estimate of the sample
>        mean when the proper assumptions apply. The fact that
>        we try to be precise in our measurements is a given.

Well that is nice, I am always happy to learn what statisticians do.
Now the day that you become interested in what systematists do, we
might begin to find some common ground..

>        The degree of confidence in a tree or group should be
>        a function of the low probability of that group given
>        an appropriate null.  Why?
>        Because the probability
>        that one or more shortest trees exist for any matrix
>        with variable character states is ca. 1.0,
>        and the probability
>        that those trees will denote groups can be very high, regardless
>        of the process that generated the matrix.

If you fear that your sequences have been randomized, you could test
them against a null model, or, even better, you could test them
against evidence from other character systems,,in fact you could
throw in all character systems which have ever been studied in the
group, or even go out and study new ones.

>        , cladistics in vitro was process-
>        oriented... evidence of shared genealogical descent
>        invokes a multitude of processes, among them inheritance,
>        genealogy, birth, death...

participate in various and sundry processes, does not make cladistics
process-oriented.

>       and the evidence is
>        taken directly from the pattern of character state changes
>        on a parsimony tree (or so they thought).  So what that the
>        pattern is really not an observation, but rather is an
>        inference, a guess?

inference=guess? Is it a guess that some animals have vertebrae, and
that some vertebrates have lungs, and some osteichthyans have four
limbs, and some tetrapods have hair, and some hairy things do
statistics,,,is this a guess? or an inference? or an observation?
I'll go with the statement that we infer the pattern from
observations within an assumption of hierarchical order,,,but not a
guess.

>       So what if the degree to which we
>        might expect character state distributions to be hierarchically
>        distributed appears to require knowledge we can't ever have?

gee, James, the notions of homology and lineage branching are rather
basic in evolutionary biology.

>        Your process position is a straw man.  The degree to which
>        we might expect to see something is FAR different than the
>        degree to which we do in fact see something.  The question
>        is, are we observing something (an amount of pattern, a
>        short tree) that deserves a PHYLOGENETIC explanation, or
>        is it a result that could have happened by chance alone?

Or maybe god planted it all here on Oct.12 4004 BC,,how do we test
that? Sorry, but just for the fun of it, I have decided to confine my
activities to that domain which is bounded by the notions of descent
with modification as the explanation for regularities in organismal
character distribution. Not to be overly facetious, I think that if
you demonstrate that a particular result is within the range of
expectations from a random process, you must then demonstrate to me
that this process has some biological reality,,,that there really is
some process by which characters can be distributed randomly amongst
taxa (and yes, once again I will try to drag you out of your
exclusively ACTG world). Failing that, you do not have a legitimate
explanation for the pattern. If you succeed, then I would admit
that my pattern might have a shaky foundation,,,but what would I do
then? It is the best pattern I have,,,,maybe I would have to accept
it for now and look for more evidence!
(Horrors)

>        If the result (the degree of covariation in the implied
>        hierarchy found in a matrix) can be easily dismissed as
>        a chance event, it (in total) doesn't require a genealogical
>        explanation.  Sure, genealogy and inheritance may have been
>        ongoing, but those processes and the processes of character
>        evolution may have interacted in ways that DESTROY or
>        mask the evidence of genealogy you expect to find on the
>        mpt. To say otherwise is to claim sufficient knowledge of
>        evolutionary processes.

Ah, but I don't. I just say that although that (randomness) might have
been demonstrated to be possible, it couldn't cause me to choose
another pattern, it merely might undermine some of the conviction
with which I assert my result. Given that investigations are never
complete anyway, that lack of conviction would simply be a matter of
degree,,,I will go looking for more characters in any case, and hope
to make progress.
We all know about information destruction, James. That is why I
marvel at your adherence to a class of evidence in which this problem
is endemic; where because of the limited number of possible states,
evidence is destroyed at nearly every step. If the amount of
variation is sufficiently low across taxa for a particular sequence,
then I don't suppose any method will have many problems in finding
good results, and they probably will hold up well to all manner of
statistical testing. At the other extreme, when much of the
information has been overwritten, then no method has a prayer of
uncovering useful evidence. All the disputes center on the areas in
between, when a sufficient amount of evidence has been destroyed such
that the pattern is either hard to retrieve or spurious. At this
point, it seems that the statisticians, bound as they are by the
blinders of their genophilia, are committed to devising ever more
arcane ways in which to decide which evidence has and which hasn't
been destroyed, rather than looking to other character systems,
systems in which the problem (although certainly present) is present
to a far lesser degree. Is the goal to find the phylogenetic
relationships of the group? Or is the goal to develop the models of
genetic evolution? If the former, then seek out the evidence wherever
it exists. If the latter (wonderful and legitimate research program -
but not systematics), then accept that that is your field.
I don't know how much this applies to you personally, but since you
defend the statisticians, I'll send it your way.

```
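
The post's "plain English" example (three pieces of evidence indicating groupings 3 within 2 within 1, with pattern T containing that nesting and pattern X not) can be made concrete with a small step-counting sketch. Everything specific here is an assumption for illustration: the six taxon names, the three binary characters, and the two tree shapes are invented, and Fitch's small-parsimony count is used as one standard way to tally steps, not as the method the post itself names. The sketch only counts steps; it does not compute a probability.

```python
# Hypothetical taxa; "out" and "a" lack all three derived characters.
TAXA = ["out", "a", "b", "c", "d", "e"]


def character(derived):
    """Build one binary character: 1 for taxa in `derived`, else 0."""
    return {taxon: int(taxon in derived) for taxon in TAXA}


# Three invented characters whose derived states are internested:
# grouping 1 = {b,c,d,e}, grouping 2 = {c,d,e}, grouping 3 = {d,e}.
MATRIX = [
    character({"b", "c", "d", "e"}),
    character({"c", "d", "e"}),
    character({"d", "e"}),
]

# Rooted binary trees as nested 2-tuples; leaves are taxon names.
TREE_T = ("out", ("a", ("b", ("c", ("d", "e")))))   # contains 3 in 2 in 1
TREE_X = ("out", ("a", (("b", "d"), ("c", "e"))))   # breaks groupings 2 and 3


def fitch(tree, states):
    """Return (possible state set, step count) for one character."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_steps = fitch(tree[0], states)
    right_set, right_steps = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra step needed.
        return shared, left_steps + right_steps
    # Children disagree: one change is required at this node.
    return left_set | right_set, left_steps + right_steps + 1


def tree_length(tree, matrix):
    """Total number of steps the tree requires over all characters."""
    return sum(fitch(tree, states)[1] for states in matrix)


if __name__ == "__main__":
    print("T:", tree_length(TREE_T, MATRIX))
    print("X:", tree_length(TREE_X, MATRIX))
```

On this toy matrix, tree T explains each character with a single step, while tree X has to invoke extra, independent origins for the characters marking groupings 2 and 3. That difference in length is what "orders homologies parsimoniously" refers to in the post.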