warren_lamboy at QMRELAY.MAIL.CORNELL.EDU
Fri Jul 28 15:27:43 CDT 1995
OFFICE MEMO Phylogeny continued Date:27/07/1995
Date: 28/07/1995 13:39
From: Steve Heydon
Steve Heydon writes:
>I also agree that there is a vast difference between 10% completely correct
>trees and 10% correct nodes of a tree. Kim et al. 1993. Evolution 47:
>471-486. have shown that in their simulations about 55% of the nodes are
>correct. As I have tried to indicate in a previous message, even if we know
>that a certain percent of the tree or nodes of the tree are, on average,
>correct, we don't know which ones they are. Such a tree is close to useless,
>in my opinion.
Perhaps this only means that we have a statistical problem of a slightly
different nature. Certainly a node defined by one apomorphic character is
less likely to be "real" than a node defined by three or more characters.
For all its problems, techniques like Bootstrapping do provide such
information. Other statistical techniques comparing phylogenies of the same
group derived with different kinds of data sets would also provide
information on the amount of support for different nodes. Perhaps what we
need is actually more phylogenies.
It should not be forgotten that sciences other than systematics tolerate at
least as much statistical uncertainty as we do. Is anything really proven
statistically in ecology or animal behavior? Even such hard sciences as
physics are built in part on the Heisenburg (sp.?) principle which states
that we can never know exactly both the position and the vectors of
movement of subatomic particles at the same time. (Or something like that.)
There is an excellent chapter on curiosity in a book by Scott Peck called
Further Along the Road Less Travelled which eloquently treats this whole subject.
Being close to the truth is not the same thing as being ignorant.
slheydon at ucdavis.edu
I do not view phylogeny reconstruction as a statistical problem at all. A
phylogeny arose in only one way and only once. It is impossible to validly do
statistical analysis of something that has occurred uniquely in time and space.
There is nothing statistical about it.
I do not agree that a node that is supported by one character is less likely to be
real than one supported by three or more characters. I would maintain that we have
no rational way to assess how likely any node is to be correct.
Bootstrapping, too, I think does not solve anything. It simply samples from
questionable data over and over again. (I do not question the honesty of the
person(s) collecting and reporting the data; I question whether the data contains
extractable information relevant to phylogeny reconstruction). If my stepson and I
attempt to measure the speed of light with two flashlights and a stopwatch, we may
increase the precision of our estimate with repeated "experiments", but our accuracy
probably won't increase--the method itself is faulty. Similarly, adding more and
different types of data to an analysis doesn't necessarily increase the chance
of obtaining the true tree, not if the data is uncorrelated to the true phylogeny, and we
can never know whether it is or not. Adding more phylogenies of questionable
accuracy does not, I think, improve the situation.
If ecology and animal behavior tolerate uncertainty, that is their business, but it is
hardly a justification for systematics doing so.
I must point out that the Heisenberg uncertainty principle is experimentally
verifiable--one can prove by experiment that there is a limit to the precision of
estimating both position and momentum simultaneously. No such verifiability is
possible in phylogeny reconstruction. We can work as hard as we can to get the best
phylogeny possible, supported by a multitude of different data sets, and yet no
experiment can ever be performed to see if the phylogeny is correct or not--we
cannot (outside of simulation studies) even determine how badly off we are. All we
can do is examine ancillary criteria internal to the analysis, such as tree length,
values of the consistency index, etc.
Many brilliant people have spent a good part of their lives developing the
methods of phylogeny reconstruction, and I am sincerely grateful to them for
providing us with such intellectually impressive, beautiful, clever,
and logically constructed methodologies. As theoretical structures, they are
fabulous and awesome. Really rather inspiring, I say.
My problem arises when they are presented as mathematical models of reality or
real historical processes. It is one thing to create a model of reality and
another to show that it is a good description of reality--the fact that I cannot
test the model to see how well it "fits" is what really makes me pause.
I agree that being close to the truth is better than being ignorant. I would rather
be ignorant, however, than believe a falsehood.