Weights, probability

James Francis Lyons-Weiler weiler at ERS.UNR.EDU
Tue Mar 4 18:18:44 CST 1997


Doug Yanega commented:

>
> But it sounds like you're making an extremely risky assumption there
> yourself, in claiming that YOU UNDERSTAND THE PROCESSES BEHIND PHYLOGENY
> enough to recognize them in action before you've even attempted to build a
> tree. I, for one, do not feel all that comfortable claiming I (or you)
> understand evolution so well that I can trust a program to tell me when
> evolution has thrown me a curveball. I also still think it sounds
> suspiciously like doing a parsimony analysis without generating a tree -
> which is where most people look for "footprints" in their matrices before
> they try weighting characters.
>
> >       It's perfectly analogous to testing ecological data for
> >       normality prior to the application of parametric statistics;
> >       the methods assume normality, so one tests the data
> >       FIRST.
>
> I know it sounds good to say that this is analogous, but - as Tom has been
> stating - by drawing this analogy you are again apparently treating the
> whole thing as if it's one big probabilistic function right from square one.
> The existence and properties of things like normal and Poisson distributions
> - and ways to sample them, etc. - have been demonstrated to everyone's
> satisfaction, certainly, but you can't test for normality (or compare a data
> set to any set of expectations) unless you have *defined* a normal
> distribution to begin with, as a standard of reference. I'm not sure I see
> exactly what the concept of an "expected distribution of character states"
> is being defined *as* - i.e., what standard of reference you are proposing -
> so as to allow you to make this sort of comparison (or if it has any meaning
> at all in the context of phylogenetic reconstruction).
>
> > Phylogenetic methods assume any number of things;
> >       testing the matrices for violations of these assumptions
> >       is as desirable as trying to ensure that what you think
> >       are homologies are; in fact, it's the same problem.
>
> We may actually agree here about it being the same problem - in fact, it
> looks like what you're advocating is basically just plain old character
> weighting cast in a different light, but still basically testing one's
> matrix TWICE - so not only is it the same problem, but the same approach
> people have used in the past, given a different flavor. Let me ask you
> again; if your suggested procedure is a method for "testing for assumption
> violations" and parsimony is a method of "testing for assumption
> violations", what does your procedure accomplish that is so fundamentally
> different from simply running a parsimony analysis and then reweighting
> characters based on the outcome (i.e., whether one detects homoplasy, long
> branches, or whatever) and running it again? As a possible answer to my own
> question, it simply sounds like you're advocating using an algorithm for
> character weighting based on assumptions about the sorts of patterns
> different evolutionary processes are expected to produce, and just looking
> for those patterns without using a tree-style output. In other words, the
> thing you're trying to do is build a _process model_ into phylogeny, which
> parsimony does not. If so, this is not a new idea, nor do I find that it
> inspires me to trust in those process/pattern assumptions any more than I
> did before.


It doesn't just sound good to say that the a priori test is analogous to
testing for normality; they are epistemological equivalents.  Details
later on RASA, but consider this: the test for normality is a test of a
sufficient condition for the assumptions of methods of parametric
statistical inference; the test does not include assumptions about WHY a
sample may or may not be normally distributed.  Statisticians only know
that when a sample is not normally distributed the tests can be
misleading, so any process that causes the "pattern" of non-normality
will jeopardize the test.  The a priori method I'll delve into later
uses this same principle: when the distributions of character states
have a particular pattern, the data are not suited to the requirements of
tree-building algorithms.
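To make the parallel concrete, here is a minimal, hypothetical sketch (NOT the RASA test itself, which comes later) of pre-testing an assumption before applying a method: a crude normality screen based on sample skewness and excess kurtosis, with tolerances chosen arbitrarily for illustration:

```python
# Illustrative sketch only: screen a sample for gross non-normality
# BEFORE applying a parametric method.  The thresholds are arbitrary;
# this is the logic of an a priori assumption check, not a real test
# such as Shapiro-Wilk.
import math

def looks_normal(sample, skew_tol=1.0, kurt_tol=1.0):
    """True if sample skewness and excess kurtosis fall within tolerance."""
    n = len(sample)
    m = sum(sample) / n
    devs = [x - m for x in sample]
    m2 = sum(d ** 2 for d in devs) / n      # variance (biased)
    m3 = sum(d ** 3 for d in devs) / n      # third central moment
    m4 = sum(d ** 4 for d in devs) / n      # fourth central moment
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2 - 3.0               # excess kurtosis; 0 for a normal
    return abs(skew) < skew_tol and abs(kurt) < kurt_tol

# A roughly symmetric, mound-shaped sample passes the screen; a sample
# with one extreme value fails it, so the parametric step is withheld.
ok = looks_normal([3, 4, 4, 5, 5, 5, 6, 6, 7])
bad = looks_normal([1.0, 1.1, 1.2, 1.3, 1.5, 2.0, 9.0])
```

The point is only the order of operations: the screen says nothing about WHY a sample departs from normality, just that the downstream method's sufficient condition has failed.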

On epistemological equivalents, my position that parsimony doesn't test
anything can be further illustrated by analogy.  In a simple regression,
the slope of the line of best fit is itself a parameter (beta).
Parsimony is invoked to determine that parameter (i.e., least squares).
The fact that the line is the line of best fit provides no TEST of the
line, or of the covariation of x and y.  That is, least squares is a
criterion.  The value L for a parsimony tree alone does not provide a test
of that tree, nor of the covariation of the character states.  L and beta
are epistemological equivalents.  A test of L would look something like

        (L - NullL) / Error Term,

or one can use the PTP.
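For readers unfamiliar with the PTP, the underlying logic is a permutation test: shuffle the data to destroy any real structure, recompute the statistic, and ask how often the permuted values match or beat the observed one.  A toy sketch of that logic (a simple sum-of-products statistic on made-up vectors; not Faith and Cranston's actual implementation):

```python
# Illustrative sketch of permutation-test logic of the kind behind a
# PTP-style test: build a null distribution by permutation and report
# the tail probability of the observed statistic.  Statistic and data
# are invented for illustration.
import random

def permutation_tail_probability(xs, ys, n_perm=999, seed=1):
    """Fraction of permutations whose statistic >= the observed one."""
    rng = random.Random(seed)
    stat = lambda a, b: sum(u * v for u, v in zip(a, b))
    observed = stat(xs, ys)
    hits = 0
    pool = list(ys)
    for _ in range(n_perm):
        rng.shuffle(pool)                  # destroy any real association
        if stat(xs, pool) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)       # count the observed ordering too

# Strongly associated toy vectors: few random shuffles do as well,
# so the tail probability is small.
xs = [0, 0, 1, 1, 2, 2, 3, 3]
ys = [0, 1, 1, 2, 2, 3, 3, 4]
p = permutation_tail_probability(xs, ys)
```

A small tail probability is what licenses treating the observed value as structure rather than noise; the criterion value alone never does that.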

The least squares (parsimony) criterion is used to choose a model slope
(tree); the model itself should then be critically tested against a null
slope.  What's lacking is an epistemological equivalent to the null
slope - a null tree - and a critical test of the model tree against it.


By the way, whether one explicitly weights or not after the a priori
testing is up to the investigator.  More later on RASA...




More information about the Taxacom mailing list