warren_lamboy at QMRELAY.MAIL.CORNELL.EDU
Fri Jul 28 11:57:59 CDT 1995
Mail*Link(r) SMTP FWD>RE>FWD>True phylogenies
Margaret Thayer wrote:
Date: 28/07/1995 10:20
From: Margaret Thayer
At 07:58 AM 28-7-95 -0400, Warren Lamboy wrote:
>Yes, it is possible to know the true phylogeny, if it is produced by computer
>simulation or generated in the lab. Papers that do this include ....
Such work may provide useful insights into the conditions under which
various methods of phylogeny reconstruction work better or worse, but leaves
a layer of uncertainty at a different level. Generation of the "true"
simulated phylogenies depends on assumptions/hypotheses about how evolution
actually works, and we don't and can't *know* that any more than we can know
true phylogenies of real natural organisms.
Wolfgang Wuster made an excellent point regarding the importance of
distinguishing between 10% accuracy at finding completely correct
phylogenies and 10% accuracy at finding true nodes of phylogenies.
Margaret K. Thayer thayer at fmnh.org (use this form for best results)
Adjunct Assistant Curator
Field Museum of Natural History - Zoology, Insects
Roosevelt Road @ Lake Shore Drive
Chicago IL 60605, USA tel. 312-922-9410, ext. 838 fax 312-663-5397
And I reply:
Yes, Margaret, I agree with your first paragraph, but I would like to add that
methods of phylogeny reconstruction ALSO make assumptions/hypotheses about how
evolution/speciation has occurred. As you say, ". . . we don't and can't
know that . . ." Why, then, are the phylogeny reconstruction methods not
criticized on the same grounds?
No one seems to fret about the assumptions inherent in phylogeny
reconstruction methods. Possibly this is because it is difficult to state
exactly what they are. Inability to describe them, however, does not mean
they are not present or not operative.
The fact that we can never know the true phylogeny means that we can never
measure the goodness of phylogeny reconstruction (PR) methods using the
appropriate yardstick. It is
like devising a new method for quantification of DNA for which no DNA standard
is possible (note I say "is possible"). The quantification method may measure
something, but what is it?
Svante Wold (a chemical engineer/statistician at the University of Umea,
Umea, Sweden) said regarding his associates' willy-nilly use of data to
confirm or deny their hypotheses: "In a specific data set there is often no
information whatsoever about the given problem." That is to say, a major
assumption often neglected in approaching any scientific problem is that the
data collected contain extractable information about the question of
interest. I do not think that such
extractable information exists with respect to phylogeny reconstruction, given
that multiple genes control morphological characters, quantitative characters
must be turned into discrete characters for use in most programs, extinctions
of taxa have occurred, there have been phyletic changes, reversals, and
parallelisms, there are different rates of mutation at different loci,
undetectable hybridizations have taken place, the phenomenon of transgression
occurs [individuals in segregating populations fall beyond the parental
phenotypes; see, e.g., deVicente and Tanksley 1993, Genetics 134: 585-596],
and so on and so forth. Most of the difficulties just mentioned are on top
of the poor accuracy shown by phylogeny reconstruction methods with "clean"
data.
I also agree that there is a vast difference between 10% completely correct
trees and 10% correct nodes of a tree. Kim et al. (1993, Evolution 47:
471-486) have shown that in their simulations about 55% of the nodes are
correct. As I have tried to indicate in a previous message, even if we know
that a certain percent of the tree or nodes of the tree are, on average,
correct, we don't know which ones they are. Such a tree is close to useless,
in my opinion.
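To put rough numbers on the gap between node accuracy and whole-tree
accuracy, here is a minimal Python sketch. It assumes that every internal
node is recovered correctly with the same probability and independently of
the others. That independence is an illustrative assumption of this sketch
only, not something claimed by Kim et al. or by anyone in this thread, and
the function names are hypothetical.

```python
def prob_whole_tree_correct(node_accuracy: float, n_nodes: int) -> float:
    """Chance that ALL internal nodes are correct.

    ASSUMPTION (illustrative only): nodes are correct independently,
    each with the same probability `node_accuracy`.
    """
    if not 0.0 <= node_accuracy <= 1.0:
        raise ValueError("node_accuracy must lie in [0, 1]")
    if n_nodes < 0:
        raise ValueError("n_nodes must be non-negative")
    return node_accuracy ** n_nodes


def node_accuracy_needed(tree_accuracy: float, n_nodes: int) -> float:
    """Per-node accuracy required to reach a given whole-tree accuracy,
    under the same independence assumption (inverse of the function above)."""
    if not 0.0 <= tree_accuracy <= 1.0:
        raise ValueError("tree_accuracy must lie in [0, 1]")
    if n_nodes <= 0:
        raise ValueError("n_nodes must be positive")
    return tree_accuracy ** (1.0 / n_nodes)


if __name__ == "__main__":
    # Tree sizes chosen arbitrarily for illustration.
    for n in (5, 10, 20):
        whole = prob_whole_tree_correct(0.55, n)
        needed = node_accuracy_needed(0.10, n)
        print(f"{n:2d} nodes: P(whole tree correct | 55% nodes) = {whole:.4f}; "
              f"node accuracy needed for 10% correct trees = {needed:.3f}")
```

Under that assumption, 55% correct nodes on a tree with 10 internal nodes
corresponds to well under 1% completely correct trees, which is one way to
see how far apart the two kinds of "10%" are. The sketch says nothing about
which nodes are the correct ones.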