# Weights, probability

Doug Yanega dyanega at DENR1.IGIS.UIUC.EDU
Tue Mar 4 09:07:14 CST 1997

```
Richard Zander commented:

>The heavy discussion on weighting characters has tended finally to light
>upon phylogenetic differences in character importance. Now, I figure
>that, since we don't know the mutation rates of state changes in the
>different characters we use, we pretty much depend on Keynes' "principle
>of indifference" which says if we don't know the different importances
>of various variables, use equal importance since there is a good chance
>it will average out. This is pooh-poohed by experts in probability with
>good reason, so what have we other than a system of classification with
>good Darwinian intentions ("like produces like") and a whole lot of
>argument about the relationships of details for which we have no
>information. Method alone guarantees precision not truth.

Let's see if I can anticipate James' response...he is not talking about
state transformations, but the overall pattern of character states that have
been assigned in one's matrix, and an _a priori_ analysis (i.e. before one
applies parsimony to the matrix) that ostensibly will reveal whether there
is structure in the pattern or not, and which characters are more/less in
keeping with this overall structure. He said:

"Sometimes some of the pairwise comparisons (e.g., A in
taxon i vs. A in taxon j) represent true homologies, while
some of the pairwise comparisons (for the SAME character)
do not (e.g., A in taxon i vs. A in taxon Z), due to
homoplasy.  Not every character is equally informative
owing to this feature, and the degree to which a character
is weighted equally belies and ignores this entirely.

Because a proportion of the among-taxon comparisons (REMINDER:
I am discussing comparisons, NOT transformations, which are
inferences, not observations) are homologous, and some are not,
fairly simple math provides the actual proportion (and therefore
realized probability) of a state comparison being informative.
What we don't know by looking at the differences is WHICH are
informative, and we can't _know_ them by cladistic parsimony
alone, but I propose that we can determine at least a relative
determination of which comparisons are misleading, and accurately
identify which comparisons are not, and thereby IMPROVE upon the
performance of cladistic parsimony and other tree-selection
criteria."

What I don't quite understand is how this a priori pattern-testing is going
to be fundamentally different from a circular application of parsimony
itself; it's like building a tree, then concluding "Aha! Character 33
appears to have a homoplasious state 1 in taxa 4 and 17, so character 33 is
only 85% informative! Now let's plug that probability in with the others and
run the data again." The only difference is that there is no actual tree
constructed in that first pattern-testing step, just an algorithm (without a
graphic representation) which gives you estimated probabilities of
"informativeness" based on some overall comparison of each character to the
others in the matrix. I find the concept of making a "relative determination
of which comparisons are misleading" *before* an analysis to be hard to
swallow, and my gut feeling is that it is equivalent to parsimony analysis
in itself (i.e., looking for incongruence in a pattern).

Just trying to make some sense of this,
Sincerely,
Dr. Douglas Yanega
Depto. Biologia Geral
Univ. Federal de Minas Gerais
Caixa Postal 486
30161-970 Belo Horizonte, MG, BRAZIL
(031)-448-1223

```
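To make the objection concrete, here is a minimal sketch of what a tree-free "informativeness" score could look like. It is not James' method, which the message does not specify. It substitutes a simple pairwise compatibility count (the four-state-combination test for binary characters) as an assumed stand-in, and the names `compatible`, `compatibility_scores`, and `toy_matrix` are hypothetical, as is the data.

```python
from itertools import combinations

def compatible(a, b):
    """Two binary characters are compatible unless all four state
    combinations (0,0), (0,1), (1,0), (1,1) occur among the taxa."""
    return len(set(zip(a, b))) < 4

def compatibility_scores(matrix):
    """For each character (column), return the fraction of the other
    characters it is compatible with. No tree is built at any point."""
    n_chars = len(matrix[0])
    columns = [[row[c] for row in matrix] for c in range(n_chars)]
    agree = [0] * n_chars
    for i, j in combinations(range(n_chars), 2):
        if compatible(columns[i], columns[j]):
            agree[i] += 1
            agree[j] += 1
    return [count / (n_chars - 1) for count in agree]

# Hypothetical toy matrix: 5 taxa (rows) x 4 binary characters (columns).
toy_matrix = [
    [0, 0, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 0],
]

scores = compatibility_scores(toy_matrix)
for index, score in enumerate(scores):
    print(f"character {index}: compatible with {score:.2f} of the others")
```

In this toy matrix the last character conflicts with two of the other three and so gets the lowest score. That score comes entirely from incongruence among characters, which is arguably the same signal parsimony uses when it finds homoplasy on a tree. Feeding such scores back in as weights would therefore seem to count that signal twice, which is the circularity worried about above.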