Who is the positivist?

Richard Zander bryo at COMMTECH.NET
Wed Dec 10 17:03:17 CST 1997

Ted Schultz wrote:
> >Richard Zander responded to my recent posting with:
> >
> >Wonderful! I would like to see statistical phylogeneticists deal more
> >knowledgeably with priors in such a way that we get better results. I do
> >think, however, that they already are being lenient in assuming a
> >neutralist selection position for all genes, as well as a number of
> >other (somewhat outrageous) assumptions.
> There are a number of problematic assumptions that go into the phylogenetic
> analysis of DNA sequences but, the last time I looked, neutral selection
> was not one of them.  Perhaps you are suggesting that parallel selection on
> genes in distantly related species will cause their sequences to converge.
> If so, this could cause problems; however, I am not at all sure that
> presuming that this has NOT happened is more "outrageous" than presuming
> that it HAS happened.

Well, maybe I exaggerated, but I did get you to contribute to the
thread, for which, thanks: I learn something from almost every post. Of
course it has happened. Convergence is found (by various means) in both
molecular and morphological data. Okay, one can assume it has not
happened given no evidence that it has, but this is a big assumption,
even given CI values of ca. .85 for what are called well-supported
molecular trees.

> And it seems fairly reasonable to me that we would
> not expect the same directional selection in multiple genes, or in multiple
> genes + multiple morphological characters, so to the extent that
> phylogenies are constructed from diverse data sources we are compensating
> for this problem.

I agree fully.

> >>I said:
> >> Two other reasons that blunt Zander's critique are:
> >>
> >> 1. An entire tree is not really a single hypothesis.  Instead, it is the
> >> conjunction of multiple hypotheses of monophyletic groups.  Some of these
> >> monophyletic groups may appear in many of the suboptimal trees surrounding
> >> the optimal (most likely/most parsimonious) tree, i.e., they may be very
> >> well supported and thus highly "probable."  Judging the probability of
> >> subtrees rather than trees is certainly more fair to phylogenetic
> >> methodology.
> >Zander responded:
> >Okay. Bremer support is fine. How much Bremer support is needed to make
> >a subclade a probabilistic reconstruction of phylogeny? Now, I am not
> >talking about obvious relationships like ((man chimp) dog). That's a
> >nicely parsimonious subclade. It is the fine structure of big trees,
> >dealing with very similar taxa differing by simple characters that I
> >question.
> I am not just talking about Bremer support and, indeed, Bremer support
> cannot be applied to likelihood trees.

Not as such. One might, however, treat subclades that are identical in
all trees whose posterior probabilities add to .5 or to, say, .95 as
the equivalent (the first as more evidence for than against, the
second as maybe actually a good estimation). Perhaps this would not
work; what do you think?
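
A minimal sketch of the summing idea above, assuming each candidate
tree comes with a posterior probability and a list of the subclades it
contains. The data layout, the function name clade_support, and the
numbers are all hypothetical, chosen only to illustrate the two
thresholds (.5 and .95):

```python
from collections import defaultdict

def clade_support(trees):
    """Sum the posterior probability of every tree containing each clade.

    `trees` is a list of (posterior_probability, clades) pairs, where each
    clade is a frozenset of taxon names. This layout is an assumption made
    for illustration, not the output format of any particular program.
    """
    support = defaultdict(float)
    for prob, clades in trees:
        for clade in clades:
            support[clade] += prob
    return dict(support)

# Three hypothetical trees for taxa A, B, C, D with made-up posteriors.
trees = [
    (0.60, {frozenset("AB"), frozenset("ABC")}),
    (0.30, {frozenset("AB"), frozenset("ABD")}),
    (0.10, {frozenset("AC"), frozenset("ACD")}),
]

support = clade_support(trees)

for clade, p in sorted(support.items(), key=lambda kv: -kv[1]):
    if p >= 0.95:
        verdict = "maybe a good estimate"
    elif p > 0.5:
        verdict = "more evidence for than against"
    else:
        verdict = "not supported"
    print("".join(sorted(clade)), f"{p:.2f}", verdict)
```

In this made-up sample the subclade (A B) appears in trees whose
probabilities add to about .9, so it clears the first threshold but not
the second.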

> Any number of ways have been
> proposed for calculating "confidence limits" around subtopologies
> (e.g., the Kishino-Hasegawa parametric test, Page's median trees,
> parametric and non-parametric bootstraps).

Interesting...can you supply a short bibliography? This seems to be a
critical area we should all know more about.

> I am curious about your
> criterion that makes ((man chimp) dog) "obvious."  Whatever that criterion
> is, you should be able to apply it to those "very similar taxa" as well.
> If they don't pass the test, then more character data are needed.

The criterion is that there is a reference set of intermediate taxa
between (man chimp) and the dog. Thus it makes parsimonious sense that
macroevolution or lots of convergence were very improbable in this case,
since gradual evolution is modeled. With very similar taxa, there are no
intermediates. Thus, in the case of three similar taxa, two of which
share one (or two or three) advanced but simple traits, convergence
(perhaps involving more than one trait) necessarily must be considered
as not just possible, but somewhat probable.

> >
> >>I also said:
> >> . . . When trees or subtrees are framed as a
> >> priori null hypotheses, and when they are subsequently corroborated because
> >> they appear in the optimal tree or in the "confidence-set" of
> >> optimal+suboptimal trees in some specified confidence interval, then we
> >> have failed to reject these null groupings and, again, a probability is
> >> conferred upon them that is greater than what is implied by Zander above.
> >Zander:
> >They are not corroborated. Coincidence may be due to convergence among
> >daughter lines. There is no independent test. Failing to reject a null
> >hypothesis confers ... what? A probability higher than something else?
> >Failing to reject a null hypothesis just means it could be true.
> In modern statistics, we attempt to reject a null (with reference to one or
> more alternatives) and report the results of that attempt as a "P" value.
> If the P value is high, and if all possibilities are covered by the
> alternative hypotheses, then the data have indeed conferred a high
> probability to the null (or, conversely, a low probability to the
> alternative(s)).  In Bayesian terms, the framing of a null corresponds to
> conferring higher a priori probability to one set of (sub)trees over
> another.  If I predict, based on a morphological phylogeny, that a
> particular group of species is monophyletic and if I then find that a DNA
> sequence phylogeny corroborates that prediction, I have indeed increased my
> confidence in that null.

The frequentist statistical explanation is copasetic. So is the
Bayesian, but we must remember that we are assigning probabilities to
one event that happened in the past, one throw of the dice (worse, a
concatenated series of throws of the dice). So here we go with
statistical relevance again. An increase in probability does not
necessarily confer on a theory more evidence for than against. In
medicine, an increase in probability of disease is cause for concern but
I for one would like to see *real good* confidence in a null, not just
an increase in probability. How is your confidence measured?
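
To put hypothetical numbers on that point (the prior of 0.1 and the
likelihood ratio of 3 are assumptions for illustration only, not values
from any data set): data that triple the odds in favor of a grouping H
still leave it less probable than not.

```latex
\[
\frac{P(H \mid D)}{P(\lnot H \mid D)}
  = \frac{P(D \mid H)}{P(D \mid \lnot H)} \cdot \frac{P(H)}{P(\lnot H)}
  = 3 \cdot \frac{0.1}{0.9} = \frac{1}{3},
\qquad
P(H \mid D) = \frac{1/3}{1 + 1/3} = 0.25 .
\]
```

Here the probability rises from 0.1 to 0.25, an increase, yet it stays
below .5.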

> You seem to be saying there is an additional
> alternative hypothesis that needs to be accounted for, that of
> convergence/parallelism.  I would not disagree with that, but I would
> suggest that the obvious test is to construct phylogenies from multiple
> character systems.

By this I think you mean reconstruct them (not artificial phylogenies).
I think you can get a nice idea of a small range of possible trees for
certain data sets, but a reconstruction as such seems totally
improbable. Okay, let's settle for a good estimate, an approximation.
Why do researchers then present single trees? Why don't cladists present
just the resolution allowed by Bremer support of at least one or two
steps? Why don't statistical phylogeneticists present a consensus of all
trees adding to .95 or so?
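
As a rough sketch of what such a presentation could look like, assuming
each tree is given as a posterior probability plus the set of subclades
it contains: take the most probable trees until their probabilities add
to .95, then keep only the subclades found in every one of them. The
function names, data layout, and numbers are hypothetical.

```python
def credible_set(trees, level=0.95):
    """Return the most probable trees whose posteriors add to >= level."""
    ranked = sorted(trees, key=lambda t: t[0], reverse=True)
    chosen, total = [], 0.0
    for prob, clades in ranked:
        chosen.append((prob, clades))
        total += prob
        if total >= level:
            break
    return chosen

def strict_consensus(trees):
    """Return the clades present in every tree of the given set."""
    clade_sets = [set(clades) for _, clades in trees]
    return set.intersection(*clade_sets) if clade_sets else set()

# Three hypothetical trees for taxa A, B, C, D with made-up posteriors.
trees = [
    (0.70, {frozenset("AB"), frozenset("ABC")}),
    (0.26, {frozenset("AB"), frozenset("ABD")}),
    (0.04, {frozenset("AC"), frozenset("ACD")}),
]

chosen = credible_set(trees, level=0.95)
consensus = strict_consensus(chosen)

print(len(chosen), "trees in the set")
print(sorted("".join(sorted(c)) for c in consensus))
```

With these invented numbers the two best trees suffice, and the only
resolution they share is (A B); everything else would be shown as
unresolved.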

> If, however, you are saying that your NULL hypothesis
> is that all character states shared in common between members of a
> "pseudoclade" are due to universal, unrelenting convergence in every
> observed and unobserved character system, then I am not sure how we would
> proceed to test such a null: can you suggest a way?

No I don't say that. Not universal, but unrelenting, yes.

> >
> >>Then I said:
> >> In non-statistical terms, impugning phylogenetics because complex trees
> >> consisting of many taxa are rarely entirely "true" or "false" ignores the
> >> fact that phylogeneticists have discovered and continue to discover real,
> >> highly corroborated monophyletic groups.
> >Zander:
> >No, no. Complex trees usually include disparate taxa that you don't need
> >a computer to arrange in a reasonable tree vis-a-vis a shared ancestor
> >with some outgroup. Phylogeneticists have not "discovered" these trees.
> >They are guesses based on like produces like (mostly). It is when there
> >are lots of alternative trees (optimal+suboptimal as above) that
> >parsimony methods fail and should be impugned most vigorously.
> >
> I don't think it's fair NOT to give phylogeneticists credit for anything
> you consider obvious, but then to impugn them for the cases where the
> character data are problematic.  As a first step in extending our knowledge
> of evolutionary history, we need to tackle the difficult cases with those
> methods that seemed to work in tackling the more "obvious" ones.  If you
> are saying that phylogeneticists should not attempt to make the difficult
> cases seem more trustworthy than they really are, then you should be
> heartened by the increasing use of tests, Bremer supports, bootstraps,
> multiple character systems, etc.

I am, I am!

> The researchers I know care a great deal
> about discovering the phylogenies of the groups they work on.  They don't
> want shaky answers: they want to be sure.  Personally, I require lots of
> character support from lots of different character systems.  When
> completely unrelated systems keep telling me the same answer over and over,
> I start to think that maybe there's phylogenetic signal coming through.  It
> certainly can't be due to chance, and it strains my credulity to think that
> such congruence (nuclear genes, mitochondrial genes, adult characters,
> larval characters, etc.) could be due to convergence/parallelism as well.

I must not be reading the right journals. Would you kindly cite me one
or two papers that come up with a tree that is not obvious but which
shows how massive congruence of lots of data from different systems has
solved a problem in phylogeny? This is where we should be traveling, but
along the way I see too many claims of having got there already.

I appreciate your clarity and evident expertise, Ted.

> ___________________________________
> Ted Schultz, Research Entomologist
> Department of Entomology, MRC 165
> National Museum of Natural History
> Smithsonian Institution
> Washington, DC 20560
> U.S.A.
> schultz at onyx.si.edu
> Phone (voice and fax): 202-357-1311


Richard H. Zander, Buffalo Museum of Science
1020 Humboldt Pkwy, Buffalo, NY 14211 USA bryo at commtech.net
