[Fwd: Re: Probabilities on Phylogenetic Trees]

Tom DiBenedetto tdib at UMICH.EDU
Thu Sep 18 20:00:46 CDT 1997

 James Francis Lyons-Weiler wrote:
>        How can you say that parsimony is not probabilistic,
>        when the entire concept is based on the assumption
>        that evolutionary transformations are rare?

First off, I said it was probabilistic sensu lato. Secondly, what is
this rare assumption? Rare enough so that enough information has not
been overwritten (including in morphology) such that an underlying
pattern can be discerned? Well sheeesh, how is this different than
the assumption that one can in fact learn something about the history
of life from the study of characters, an assumption I think we all
must share?

>       You never
>        really addressed or responded to the point that
>        the "test" itself is influenced by how well this
>        assumption is met for a given data set.

I dont think your concerns are relevant to parsimony specifically. If
the true phylogenetic signal is so weak that spurious patterns emerge
from a parsimony analysis, how do you imagine that any other method
will be able to sort through the homoplasy and lock onto the true
pattern? How precise (and accurate) would any model have to be in
order to do that? And how would you devise such a model which would
be specific to the case where you cant figure out the phylogeny?
To the extent that I confront situations where my confidence in
parsimony methods fails, my confidence in other methods would fail
even more (unless of course, you could convince me they have the TRUE


>        Why not bother to ask if there are observable consequences
>        we can detect in patterning of states that would at
>        least indicate that long branches are a real possibility
>        (next year, MPE Feb/Mar)...

fine, I am always looking for reasons to avoid using a particular
molecular dataset
(that was a joke folks......)

>> See, I am really quite an agreeable guy!
>        I must say I concur.  You're nicer than some, too.

ah, you are a sweetie too James (I think they are gonna ask us to
take this private soon....)

>        Well, Popper's life work was to show that verificationism
>        sensu the Vienna circle was bankrupt, and succeeded in
>        doing so (3 papers with proof of the ABSENCE of probabilistic
>        support through inductive reasoning).

well, look forward to some of the Michigan crowd doing some mopping
up in the near future

>        You can't at once induce and deduce.  Which is it?
>        Your sets of hypotheses include in the background information
>        that some of the hypotheses (which through consilience
>        or congruence or whatever) provide a test are likely
>        to be false.  I don't feel comfortable relying on the
>        assumption that a majority of them are not false.

Well, to address the spirit of your complaint, I agree that the test
of congruence is not the most powerful test that could be
imagined,,,it cannot falsify all of the hypotheses at once,,but so
what? As you yourself have pointed out, the congruence test is only
the last test they are subject to, it is not the only test. And it is
simply in the nature of the procedure of hierarchical ordering that
some things have to pass through the filter. But you specifically are
uneasy with an assumption that a majority of them are not false, and
I dont think that is the case. You can have homoplasy in every
character and still have a robust result. You can have a low CI and
still have a well supported pattern. I dont know if you want to start
breaking down the matrix in some sort of "units of homology
statements" and do a count on that. Nelson did something like that
with his three-item-analysis methods - I dont recall that anything
near a majority of these units needed to emerge, or was expected to
emerge, from the parsimony analysis unscathed.
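
For anyone who wants the arithmetic behind the low-CI remark, here is a minimal Python sketch. The ensemble consistency index is the minimum conceivable number of steps summed over characters, divided by the number of steps actually required on the tree. The step counts below are invented purely for illustration (they come from no real matrix), and `consistency_index` is a hypothetical helper name.

```python
def consistency_index(min_steps, observed_steps):
    """Ensemble CI: summed minimum steps divided by summed observed steps."""
    return sum(min_steps) / sum(observed_steps)


# Hypothetical data: eight binary characters, so each needs at least 1 step.
min_steps = [1] * 8
# Invented step counts on the shortest tree: every character shows homoplasy.
best_tree = [2] * 8
# Invented step counts on the next-shortest tree.
rival_tree = [3, 3, 3, 3, 3, 3, 2, 2]

ci_best = consistency_index(min_steps, best_tree)
margin = sum(rival_tree) - sum(best_tree)

print(f"CI on best tree: {ci_best:.2f}")       # CI on best tree: 0.50
print(f"extra steps on rival tree: {margin}")  # extra steps on rival tree: 6
```

In this made-up case every character is homoplastic and the CI is only 0.50, yet the nearest rival tree still costs six more steps. CI describes homoplasy on one tree; the margin over rival trees is a separate quantity.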

>        The nice thing about
>        maximum likelihood (as opposed to say, I don't
>        know, cladistics?) is that the proponents of the
>        methods (some) have (tried) to state the limitations
>        of the methods clearly, they (some) have tried to
>        make all the assumptions of the methods obvious.

yes, that is a good thing,,they should continue...

>        A rather disturbing trend has started recently,
>        where (as you point out) students learn it, and
>        think it is gospel, and then publish astounding
>        discoveries like hypothesis testing in phylogenetics
>        in major science magazines, obviously understating
>        limitations, and overstating strengths.  But why
>        should we expect less?  Cladists and max lik people
>        are human... and humans tend to go to war over ideas.

'cept you and me, James,,,,right?

>       Where is the empirical proof of
>        max pars as a critical test?  Where is a theoretical
>        proof?  I've searched the literature far and wide
>        for such papers. Consensus among people does not make
>        truth.

I'm not sure what you are looking for. If I study the hell out of a
character and am damn sure that it is homologous amongst a certain
group of taxa, and that it therefore defines a monophyletic group,
but this assertion is simply not congruent with all my other bold
assertions, and I am forced to accept the validity of a tree that
shows precious little respect for all the hard work I put into
understanding that character, then do we not have an instance of
empirical demonstration of the power of the congruence test to
falsify a hypothesis? Doesnt this happen, like,,,everyday?
What more are you looking for?

>       conceive of this..
>        imagine that a majority of your hypotheses of homology are
>        wrong.  You will find congruence, nevertheless,
>        by chance alone.  That much has been empirically
>        demonstrated by Archie and scores of others.

Oh yeah,,,I keep forgetting that the molecular types have this
problem....that it really is conceivable that most of their homology
hypotheses are wrong....(and no I dont think a simple majority is
enough to upset things). But hey,,,garbage in, garbage out,,,it is a
test of *congruence*,,not a test of *truth*.
So many of these complaints are really marginal or general to all
systematics, or to all science. No cladist has claimed that a
parsimony criterion will by itself take you from raw observation to
certain truth. That is why many of us are stubbornly drawn to
character systems which allow for some study and pre-parsimony
homology testing.

>> {my problem is} with the nonsense assertions by folks
>> like Felsenstein and Edwards, that parsimony is a statistical method
>> (albeit a crude one), is model based (although the model remains
>> forever "implicit"),
>        Why implicit?  Are not transformation series probability
>        statements incarnate?

Hey, they are the ones that say "implicit" whenever I ask them what
model it is they are talking about. And no, transformation series are
not probability statements incarnate.

>        I have NEVER heard Joe F. or Edwards, for that matter
>        state publicly that morphological data are not potentially
>        informative, or that sequences are the only way to go.

Well, Joe is a smart and careful guy. He doesnt explicitly preclude
anything, and chooses his words carefully. But he has developed tools
which are specific to sequences, and when I ask him how he intends to
incorporate morphological data into his findings (especially the data
we already have in hand), he has nothing to say (except this
open-ended muttering about devising models of morphological evolution
as well some day)

>        MY impetus for
>        creating statistical tests is biologically-oriented.
>        I came to the field to use the trees everyone was
>        excited about to help test historical hypotheses for
>        explanations of patterns of ecological diversity.. a noble
>        cause, yes?

yes indeed. not all that different from my own motivation

>        But the trees for the groups I was
>        most interested in were not in any way from my
>        perspective critically generated... the taxonomists
>        followed the methodological paradigm of the day, and
>        did the best they could... and they did a lot of
>        damn hard work to get those trees.
>        But I saw no evidence of testing, no attempts to
>        falsify... and the folks publishing the trees
>        had low confidence in all trees (even their own)...

shoulda been an ichthyologist James.....fish are cool!

>        This SLAYS me.  Are you saying that cladistics is
>        a static, stagnant field that does not adapt and
>        grow and improve itself by adopting more stringent
>        tests when they eventually (and inevitably) are
>        produced?

hmmm? What do you mean by cladistics? Synonymous with systematics? In
that case no, obviously not. Synonymous with a few simple systematic
notions (distinctions between apomorphy/plesiomorphy,
monophyly/paraphyly, notions of classification using monophyletic
groups, parsimony)? In that case, that core should be stable. Them
good concepts. Innovations would have to be pretty damn spectacular to
upset them, but certainly can add on....
I simply said that I spend a lot of time explaining these simple
principles to folks who (to my enormous amazement) seem basically
unfamiliar with them, despite their involvement with systematics. So
please,,,dont be slayed.

>> the real problem which
>> systematists address,,finding the branching pattern of phylogeny in a
>> framework which draws from all the legitimate sources of evidence

>        This, too is a statistical question.  Are these data
>        legit, or are they noise?

let me understand you better. Do you consider the test of
congruence,,,taken by itself, without all the theoretical doodads
attached, just the simple algorithm,,is this necessarily a
statistical tool? If I have a small matrix, with a small number of
possible trees, and I manually map the data onto all the trees and
choose the one which requires the least number of steps,,,am I doing
statistics? If so, then we might simply have a semantical
difference,,,although obviously there is a hell of a lot of
theoretical baggage which comes with your semantical decision. If
not, then I would say that I have a non-statistical method which
distinguishes between legit and noise under a criterion (congruence
of homologies) which is pertinent, powerful and unmatched.
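
The manual procedure described here (map every character onto every candidate tree, keep the tree that needs the fewest steps) can be sketched in a few lines of Python. This is only an illustration under assumptions: the four-taxon matrix is invented, characters are treated as unordered and equally weighted, and steps are counted with Fitch's algorithm. The names `fitch_steps` and `tree_length` are hypothetical.

```python
def fitch_steps(tree, states):
    """Return (possible ancestral states, steps) for one character.

    `tree` is a taxon name or a pair of subtrees; `states` maps taxon -> state.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_steps = fitch_steps(tree[0], states)
    right_set, right_steps = fitch_steps(tree[1], states)
    shared = left_set & right_set
    if shared:
        # The two subtrees can agree on a state: no extra step here.
        return shared, left_steps + right_steps
    # No state in common: one change is needed at this node.
    return left_set | right_set, left_steps + right_steps + 1


def tree_length(tree, matrix):
    """Total steps needed to map every character in `matrix` onto `tree`."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch_steps(tree, {taxon: row[i] for taxon, row in matrix.items()})[1]
        for i in range(n_chars)
    )


# Invented matrix: four taxa, five binary characters (one column each).
MATRIX = {"A": "11011", "B": "11000", "C": "00110", "D": "00101"}

# The three possible groupings of four taxa, written as nested pairs.
TREES = {
    "((A,B),(C,D))": (("A", "B"), ("C", "D")),
    "((A,C),(B,D))": (("A", "C"), ("B", "D")),
    "((A,D),(B,C))": (("A", "D"), ("B", "C")),
}

lengths = {name: tree_length(tree, MATRIX) for name, tree in TREES.items()}
best = min(lengths, key=lengths.get)
print(lengths)  # {'((A,B),(C,D))': 7, '((A,C),(B,D))': 9, '((A,D),(B,C))': 9}
print("shortest:", best)  # shortest: ((A,B),(C,D))
```

Nothing in the procedure draws a random sample or fits a distribution; it is counting and comparison. Whether that counts as "statistics" is the semantic question raised above.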

>> At the heart of it, I think the dispute has been centered
>> on the statisticians' refusal to acknowledge the legitimacy (and
>> sometimes even the existence) of the "traditional" approaches to
>> systematics (the approaches which still can be said to have taught us
>> just about *everything* we know about phylogeny).
>        Really.  That's going too far, Tom.

you know, you are not the first one to say that to me,,,trouble is,
no matter how far I go, I always look up and there I am,,,,its so
hard to judge these things.....
But seriously, I dont know of any max-lik, or NJ, or distance tree
which contradicts a well-supported parsimony tree and is generally
accepted as a more likely (sic) reconstruction. I know you will
object that consensus is not a valid truth criterion, but that was
what I was alluding to when I spoke of what "we know about

>        On the notion of ad-hocness, here's a quandary.  Why,
>        if the max pars tree results in the fewest ad-hoc
>        hypotheses, are the journal pages FILLED with crap
>        about why the researcher didn't get the tree they
>        expected?  IF they get the wrong tree, they almost
>        invariably make up stories to make the tree make
>        sense... and they are creative, and sometimes
>        interesting... but they are post facto, and ad-hoc.

mahn, you be reading the wrong journals,,,,never seen anything like
that in Cladistics (for instance),,,,

>        I've outlined my positions i think fairly well,
>        but I've never intimated that statistical inference
>        is unbounded.  In fact, I've tried to call for more
>        realism in the representation of statistical methods.

go James!

>> I tried your test on my morpho characters,,

>        Did you read the manual????

but James, its a mac program,,,,i looked at the figures..
(you should get some of those cute little flowers and butterflies
like MacClades got)

>        I responded earlier to this; and the point is
>        that of high logical probability.  You state that
>        you want hypotheses of high probability, and yet
>        insist that when you have a set where p is high,
>        parsimony of all things will critically test
>        these already very reasonable hypotheses.
>        I don't see how it is possible that a
>        set of hypotheses with high logical probability
>        can be afforded the dose of corroboration you
>        give them when by definition they have a
>        high logical probability.

Ok, we are tripping up on the high/low probability issues that Popper
managed to confuse a lot of folks on. Homology hypotheses are low
probability statements,,,they are bold assertions which extend a
generalization over a multitude of phenomena, linking them under the
same word. They are tested and sometimes corroborated in the course
of developing the matrix. They are not maximally bold or minimally
probable statements on the scale that Popper envisioned because they
are not universal laws,,,(pass this argument over your understanding
of the difference between classes and individuals, to give it the
proper spin). The congruence test is the last test,,once again, not
the most powerful test imaginable, but a powerful test nonetheless,
and one which succeeds in flunking as many hypotheses as it passes (on
average). So what is the result?  A set of low probability homology
hypotheses, which have been corroborated by all of the tests in our
bag of tricks. They fit together into a hierarchy. We can then say
that it is highly probable that this is the best reconstruction of
the true phylogeny that we can achieve, given our assumptions of the
retrievability of historical information from character analysis.
If this strikes you as strange, consider that e=mc2 was a very bold,
low probability hypothesis, it engendered a host of bizarre
predictions,,,and when some of them were empirically demonstrated,
the hypothesis was corroborated. A corroborated low probability
hypothesis. So what do we say about it now? That it is highly
probable that e=mc2 is our best statement regarding the true nature
of energy and matter. That is all I was trying to say with all my
butler/murder arguments. The mpr is a set of bold hypotheses which
have been endowed with an extra dose of corroboration, making the
overall pattern highly probable.

>        I see the analogy of looking among sets of hypotheses
>        of competition for competition as directly analogous
>        to making a list of hypotheses of homology, each of
>        which has a high probability (that's how they are
>        chosen), and then finding congruence.

ok, I see, you think that because the hypotheses have already
survived some testing that they are no longer very bold by the time
they get to the congruence test. But I dont get what is really
bothering you about this. Is e=mc2 less valuable as a scientific
result, does it have less claim to the title of "best statement about
energy and matter" because some of the edge has been taken off its
boldness by having successfully survived some testing? Hell, if I put
my homology hypotheses through 100 tests and they survive them all,
will you come to me and say that my results have lost all their
boldness, so I should.....what? disbelieve them? Consider the method
of testing to be invalid? I dont get your point....seems like you
are fighting a strawman notion that the congruence test is all there
is to doing systematics, so it must be maximally powerful.

>         Also, the
>        point you missed is that the ecologist has an alternative...
>        to ask how often one expects to see the level or
>        degree of competition in a null world where competition
>        does not exist, or is equally probable among all species,
>        or whatever interesting hypothesis s/he is testing.  And
>        they do it.  They don't rely on mere congruence

But those address different things. A test against a null is not
going to indicate the choice of a tree, just the "significance" of
the tree tested. What does this have to do with parsimony?
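
To make the distinction concrete, here is a rough Python sketch of one kind of null test: shuffle the states within each character across taxa, recompute the shortest tree length, and ask how often the shuffled data do as well as the real data. It is an illustration under assumptions, not anyone's published procedure: the matrix is invented, only the three four-taxon trees are searched, steps are counted with Fitch's algorithm, and the names `min_length`, `permute_characters` and `permutation_p_value` are hypothetical.

```python
import random

TREES = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "D"), ("B", "C")),
]

# Invented matrix: four taxa, five binary characters.
MATRIX = {"A": "11011", "B": "11000", "C": "00110", "D": "00101"}


def fitch_steps(tree, states):
    """Return (possible ancestral states, steps) for one character."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_steps = fitch_steps(tree[0], states)
    right_set, right_steps = fitch_steps(tree[1], states)
    shared = left_set & right_set
    if shared:
        return shared, left_steps + right_steps
    return left_set | right_set, left_steps + right_steps + 1


def min_length(matrix):
    """Length of the shortest of the three candidate trees."""
    n_chars = len(next(iter(matrix.values())))
    return min(
        sum(
            fitch_steps(tree, {t: row[i] for t, row in matrix.items()})[1]
            for i in range(n_chars)
        )
        for tree in TREES
    )


def permute_characters(matrix, rng):
    """Shuffle each character's states across taxa, independently."""
    taxa = list(matrix)
    columns = [list(col) for col in zip(*(matrix[t] for t in taxa))]
    for col in columns:
        rng.shuffle(col)
    return {t: "".join(col[i] for col in columns) for i, t in enumerate(taxa)}


def permutation_p_value(matrix, n_reps=999, seed=0):
    """Share of shuffled matrices whose shortest tree is as short as observed."""
    rng = random.Random(seed)
    observed = min_length(matrix)
    as_short = sum(
        min_length(permute_characters(matrix, rng)) <= observed
        for _ in range(n_reps)
    )
    return observed, (as_short + 1) / (n_reps + 1)


observed, p_value = permutation_p_value(MATRIX)
print("observed shortest length:", observed)
print("permutation p-value:", p_value)
```

With a matrix this small, expect a large p-value. Either way the output is a statement about how much structure the matrix has compared with shuffled data; the tree itself was already picked by `min_length`, that is, by the parsimony criterion.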

>> look at any cladogram and you will see plenty of homology
>> hypotheses brutally splattered  all over the place,,,,falsified,,,but
>> not tested, eh? I guess cladistics really is magic!
>        They are not really falsified, are they Tom.
>        They are only falsified for the time being.

well, I live in the time being...
