Status of plant systematics
Richard Zander
bryo at COMMTECH.NET
Mon Sep 8 08:42:54 CDT 1997
This thread was started by some who worried that classical systematics
was unraveling. Of course a house-cleaning and thorough re-examination
is always good for a field, but perhaps cladistics needs some
unraveling, too.
I'd hoped for more feedback on my concept that the posterior
probabilities of maximum likelihood hypotheses (this includes maximum
parsimony to a significant degree) are too low for the hypotheses to be
scientifically valuable. So far I've been able to refute (to my
satisfaction) the few refutations, and I've gotten some support by
private email from mavericks. Where are cladists when you *need* one?
I'll try again: Simplicity works in other sciences because improbable
"least wrong" hypotheses can be tested immediately, and systematics has
no such tests. The result of maximum parsimony and maximum likelihood
analysis does not form a "probabilistic reconstruction through a
discovery process" but instead groups taxa by presumed advanced
characters through a clustering process that works on the basis of
distance from a hypothetical shared ancestor. The latter is an advance;
the former is pseudoscience.
Maximum likelihood is easy to understand if you start with
Bayes' Theorem, which is given in any textbook on probability. Most
likelihood papers use log-likelihood equations based (so I read) on a
derivative: if the data are treated as continuous random variables, the
maximum likelihood is the point at which the line tangent to the
likelihood curve has slope zero.
Using Bayes' Theorem allows finger-and-toe maximum likelihood
computations:
You have a confederate who rolls two dice, one of which has four sides
(a tetrahedron) and the other has six (a cube). They are numbered 1 to 4
and 1 to 6, respectively, and are assumed to be not loaded and fairly
cast (these are necessary Bayesian regularity assumptions). Your
confederate throws the two dice randomly in secret and announces when
one turns up (or down in case of the tetrahedron) a "1" (you can also
use any other number shared by the two dice).
The initial probability of guessing which die was thrown is 1/2, but
with the additional information of the data (datum) set "1", the
probabilities of guessing which die produced the "1" change for you. The
hypothesis (die) of maximum likelihood is the tetrahedron, since it is
more likely to generate a "1" (1 out of 4 ways) than the cube (1 out of
6 ways), and the initial probabilities of the two dice are equal (1:1
here), so they do not alter that ranking. Using Bayes' Theorem, the
(posterior) probability that the tetrahedron is the die that generated
the "1" is 0.6. This is a better chance of being correct than random,
certainly.
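The 0.6 figure can be checked with a few lines of Python. This is only a
sketch of Bayes' Theorem applied to the two-dice example, assuming each
die is equally likely to be the one thrown; the function name `posterior`
is an illustrative choice and does not come from any likelihood package.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Return posterior probabilities via Bayes' Theorem.

    priors[i]      : P(hypothesis i) before seeing the data
    likelihoods[i] : P(data | hypothesis i)
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)  # P(data), the normalizing constant
    return [j / total for j in joint]

# Hypotheses: tetrahedron, cube. Assumed equally likely to be thrown.
priors = [Fraction(1, 2), Fraction(1, 2)]
# Chance that each die shows a "1".
likelihoods = [Fraction(1, 4), Fraction(1, 6)]

tetra, cube = posterior(priors, likelihoods)
print(tetra, cube)  # 3/5 2/5
```

Exact fractions are used so that the result, 3/5 for the tetrahedron and
2/5 for the cube, is not blurred by floating-point rounding.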
Mau et al.'s cpDNA restriction-site study of 9 species of Clarkia got
this same posterior probability (.6) for the tree of maximum likelihood:
http://www.stat.wisc.edu/~newton/papers/abstracts/tr961a.html
which is not great but there is at least more evidence for than against,
assuming the regularity assumptions are correct.
Mau et al.'s mitochondrial DNA study of cichlid fish (same paper) found
marginalized posterior probabilities of .11, .07, .06, .04, .04, and .03
for the six likeliest trees. The tree of maximum likelihood (the one
with .11) must be interpreted as that tree that would prove correct (the
true tree) in about 1 out of 10 occurrences of that exact same data set.
The true tree is almost surely (about 9 out of 10) not the tree of maximum
likelihood.
This is almost exactly the same probabilistic bet as the case where
your confederate randomly rolls one tetrahedral die mixed with 15 cubic
dice until one of them turns up a "1", and you then bet that it was the
tetrahedron that generated the "1". Although the tetrahedron has a
greater likelihood of generating a "1" than any one of the cubic dice,
it was probably one of the cubic dice that generated the "1".
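The same finger-and-toe arithmetic can be run for the sixteen-dice bet.
The sketch below assumes that each of the 16 dice is equally likely to be
the one rolled on any given throw; the variable names are illustrative
only.

```python
from fractions import Fraction

n_cubes = 15
dice_sides = [4] + [6] * n_cubes       # one tetrahedron, then the cubes

prior = Fraction(1, len(dice_sides))   # assumed: each die equally likely
# P(this die was rolled AND it showed a "1"), for every die
joint = [prior * Fraction(1, sides) for sides in dice_sides]
evidence = sum(joint)                  # P("1") over all dice

p_tetra = joint[0] / evidence          # the tetrahedron produced the "1"
p_some_cube = sum(joint[1:]) / evidence  # some cube produced the "1"
p_one_cube = joint[1] / evidence       # one particular cube produced it

print(p_tetra, p_some_cube, p_one_cube)  # 1/11 10/11 2/33
```

Under that assumption the tetrahedron is the single most probable die
(1/11 against 2/33 for any one cube), yet its posterior is only about
.09, in the neighborhood of the .11 reported for the cichlid tree, and
the odds are 10 to 1 that some cube produced the "1".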
Suggesting that the tree of .11 probability was the best hypothesis in
the cichlid fish study is reminiscent of the story of the gambler in a
casino, who, when asked how his luck was holding out, replied, "I'm
doing fine. I haven't won in two hours, but the fellow sitting next to
me hasn't won in four hours."
--
*******************************************************
Richard H. Zander, Buffalo Museum of Science
1020 Humboldt Pkwy, Buffalo, NY 14211 USA bryo at commtech.net
*******************************************************