Weights

Tom DiBenedetto tdib at UMICH.EDU
Thu Mar 6 19:43:14 CST 1997


On Wed, 5 Mar 1997 13:18:04 -0600, Stuart G. Poss wrote:

>.......I am left wondering, if a statistical paradigm for
>character evaluation is inappropriate, then under what circumstances do
>other methods fail?

What makes you think that a statistical paradigm for character
evaluation has any power to judge failure? Failure of a phylogenetic
hypothesis would equate to inaccurately reconstructing a phylogeny;
the inference would be different than the historical reality.  You
seem to equate "failure" with arriving at results that differ from
the expectations of generalizations drawn from different and
independent circumstances. Such results are not in the least bit
falsified merely because they are unexpected given previous
experience.

>Typically, in science one must be concerned with hypothesis testing
>rather than "evidence accumulation", although in practice it may, at
>times, be difficult to distinguish them.  Nonetheless, phylogenetic
>testing procedures that can not fail under any circumstances are not
>particularly useful scientifically

Huh? Sounds like all the wonderful Popperian arguments focussed on
the wrong level. It is *hypotheses* which are useless and
uninteresting if they cannot fail under any circumstances.
Personally, I would love to have a methodology which couldn't fail. A
testing procedure which couldn't fail would be great! How could you
imagine that such a thing would not be useful?

> A theory of methodology that
>predicts everything (or nothing specific) is not informative

Except that "theories of methodology" don't predict anything; it is
the hypotheses generated in those methodologies which make
predictions that may or may not prove informative.

>  Likewise, cladistic characters that are
>ALWAYS regarded as "good" are not particularly informative.  It is only
>when they conflict with other cladistic characters that they provide
>evidence of evolution/phylogeny.

What cladistic character has ever been presumed always "good"? They
are all thrown into the soup to sink or swim. And since when do we need
characters to conflict in order to garner phylogenetic evidence?
Homologies are expected by theory to be congruent.

> If some cladistic characters behave as
>if they are "better" than other cladistic characters (more robust to
>congruency or compatible when compared to others), then is it not
>reasonable to suppose that such "behaviors" may be in general different
>among characters and that these differences will generate alternate
>distributions of potential outcomes (trees)?

No, it is not reasonable at all. You are erecting classes of
characters, linking fundamentally independent events which are not
necessarily operating by any underlying common causal factor (we are
concerned with evolutionary events, not simple chemical reactions),
and ascribing to them some regularity with which you presume to
structure all future phylogenetic discovery. No thank you. What
regularities referring to congruence would you expect from such a
disparate class of phenomena, and what makes you think it would be so
regular and predictable that we should restrict our future
understanding of phylogeny to the constraints of this "knowledge"?

> If such distributions exist, then can not statistical or
>quasi-statistical methods be employed to study such distributions?  If
>so, would not such study be informative with respect to phylogeny?

Possibly, but equally likely to lead one astray.

>Some have argued that the answer should be no, because we are speaking
>of evolutionary events, which either happened or did not happen and
>that, as such, can not be modeled statistically. If I understand him
>correctly, Tom DiBenedetto makes this argument, by stating,
>
>"if we were to know the true answer, we would be able to go back and
>reweight our characters such that we could run the matrix and reproduce
>the right answer. What would those weights be? In all cases, either 0 or
>1, for if we knew what happened, we would know that a hypothesized
>transformation either happened or didn't.  The probabilities say nothing
>about the reality of the transformation; they are statements about what
>we predict was likely to happen, given reference to some knowledge we
>think may be relevant."
>
>However, even assuming a Bayesian outlook, such an argument is specious,
>since under no circumstances yet available to science would a scientist
>be in a position to directly "know the true answer" for any cladistic
>character (unless of course we are talking about the most recent of
>events).

Well obviously, my argument would be specious if it were intended to
portray a possible situation. It was presented however, in response
to an argument that equal weighting could lead to wrong answers,
hence we should introduce probability factors, as these would somehow
reliably guide us to the truth. My point was that the "true" weights
of our hypotheses are actually 1 or 0; thus a method which erects
logical tests for corroboration or refutation (choose 1 provisionally
or 0) is at least as legitimate as (more so, in my mind) a method which
quantifies some intermediate value based on independent, hence
fundamentally irrelevant, experiences and determines results by some
summation function of these probabilities.

> Our knowledge is inferential, or to put it bluntly in this
>context: there is no certainty that we have not included at least some
>potentially misleading characters into our analysis (whatever it happens
>to be).  Consequently, how we weight our characters (how independent we
>think them to be) is important to the appropriateness to the inferences
>we make.

Sorry, but I just don't see how the last sentence follows necessarily
from what went before. You seem to claim that a priori weighting will
immunize us from misleading characters, but if you know which
characters are misleading in the first place, why include them? But
the deeper question is, how do you know this? Because of trends in
what happened elsewhere? Why do you expect the trend to be
consistent?

>Indeed, the size of the pool of potentially misleading
>characters is so much larger than the set of potentially non-misleading
>(compatible) ones, that in all probability, any set of cladistic
>characters will likely contain at least a few and hence, will almost
>always likely be at least partially wrong.  This is easy to demonstrate,
>as all but the most artificial datasets have numerous incompatible
>characters not all of which can logically be true simultaneously.

Hence a congruence test.

>One could argue that failing to appreciate the importance of
>establishing relative weights that should be placed on characters, even
>if dealt with only as included or excluded, denies us the opportunity to
>test some of the central tenets of Darwinism, as well as investigate
>some of the most interesting questions in biology relating to the
>genetic independence of morphological features.

How is that? How does a priori weighting aid us in investigating
the genetic independence of morphological features?

>Studying relative weights is closely tied to the biology of studying
>characters and taxa, both of which are highly probabilistic in nature.
>Indeed Darwin's theory predicts that taxa should not have (any/too
>many?) neutral characters, that under some circumstances some will
>replace others, and that evolutionary transformations are not all
>equally likely, owing largely to probabilistic consequences of natural
>selection.  Presumably, such selection occurs even at the very moment
>evolutionary "accidents" took place or at least sometime afterward.

How do you see the "probabilistic consequences of natural selection"
translating into a reliable set of probability scores to be imposed
a priori on characters in such a way that one can increase the
accuracy of phylogeny reconstruction?

>I would find it highly ironic that methods to infer phylogeny should be
>devoid of probabilistic reasoning, when phylogeny itself is largely the
>result of natural selection, a very highly probabilistic process.

If natural selection is a highly probabilistic process, then one might use
with profit a probabilistic approach to study it. I am not studying
natural selection. I imagine that I am to some extent, studying some
of the results of n.s., but the phenomena I study are factual,
observable characteristics of organisms. I don't really care how they
got there (at least for now). I am concerned with discovering
patterns in the distribution of these characters. It is the patterns
which lead to the inference of process; it has always been that
way, including the very inference that evolution occurred. Empirical
patterns are the test of process hypotheses; to structure your
perception of pattern by the predictions of your process hypotheses
brings the whole enterprise to a grinding halt, or squirrels it away
into some imploding spiral.

>Are we to conclude that the processes primarily responsible for phylogeny
>are of no consequence to or independent of the very models that are
>meant to infer the outcome of these processes?  If so, how will we know
>when our methods fail?

Are you claiming that by including measures of probability,
themselves highly hypothetical, you are seizing on
some method which will tell you whether you are absolutely correct or
not? Do you think that by biasing your phylogenies so that they tend
toward returning findings that verify your expectations regarding
process, you have some better handle on assessing when your
method is failing?



