[Taxacom] What are accurate phylogenies?
Richard Jensen
rjensen at saintmarys.edu
Sun Oct 12 09:46:38 CDT 2008
Hi Bob,
This is a problem of interpretation and intent. I think your final
paragraph is probably the correct way to interpret what is meant by
accuracy in a phylogeny built from data on real taxa, as opposed to data from
simulated taxa, no matter what method of simulation is used. Many of us
have our favorite simulated data sets for demonstrating that a
particular method will or won't recover the "correct" phylogeny.
With perhaps few exceptions (e.g., situations in which organisms have
been manipulated artificially to yield new strains [e.g., cultivars of
crops] so we know the "phylogeny" of these strains), all reconstructed
phylogenies are estimates and hypotheses.
Cheers,
Dick J
Richard Jensen, Professor
Department of Biology
Saint Mary’s College
Notre Dame, IN 46556
Tel: 574-284-4674
Bob Mesibov wrote:
> I'm baffled by the use of the words 'estimate' and 'accuracy' by some
> phylogeneticists. Anyone else having this problem?
>
> Both words are commonly used in maths. For example, I might try to
> estimate a number. If my estimate is the same as the number, then my
> estimate is accurate.
>
> In phylogenetics, I often read that a tree built by one method or
> another is an estimate of a phylogeny. I think this means that the tree
> is a guess at both the topology and the branch lengths of the true
> phylogeny. I also read that a particular method is more likely to give
> an accurate estimate of the phylogeny than another method. This is a
> very interesting claim, because it suggests that there is some way to
> know the true phylogeny, so we can compare it to the estimate.
>
> Some authors trace their use of the word 'accuracy' to this paper:
>
> Hillis, D.M. & Bull, J.J. 1993. An empirical test of bootstrapping as a
> method for assessing confidence in phylogenetic analysis. Systematic
> Zoology 42(2): 182-192.
>
> Hillis & Bull here define 'accuracy' as 'the probability that a
> specified group is contained in the true phylogeny' (p. 183).
> 'Probability?' Well, sort of, because they're doing a bootstrap
> analysis, and bootstrapping generates estimates of likelihood.
> Nevertheless, Hillis & Bull state clearly that 'knowledge of the true
> phylogeny' is necessary to test accuracy in their particular use of the
> word (p. 184). In their tests, they used a home-made, purpose-built
> 'true' phylogeny.
>
> Something similar is done in
>
> Woolley, S.M., Posada, D. & Crandall, K.A. 2008. A comparison of
> phylogenetic network methods using computer simulation. PLoS ONE 3(4):
> 1-12; e1913.
>
> Here the authors use the word 'accuracy' 15 times in 12 pages, but they
> use simulated sequences for their analyses. However, Woolley et al.
> caution that
>
> "While simulations can provide general predictions about the behavior of
> the models studied, as well as some sense of their robustness (insofar
> as differing models are explored in the simulations), it is rarely
> possible to simulate the entire universe of relevant models and the
> models simulated may represent real data only to a given extent." (p. 2)
>
> So if you know the correct phylogeny in advance, it's possible to find
> after a comparison that one method is more likely to give the correct
> phylogeny than another. Or you might find that one method yields a tree
> which more nearly approaches the correct phylogeny than another, which
> is not the same thing, and involves comparing topologies and branch
> lengths between trees (no fun at all).
>
> But in real-world cases we don't know the phylogeny in advance. When I
> see the word 'accurate' applied to a real-world phylogeny in the
> systematics literature, should I understand that the author(s) are
> reasoning as follows?
>
> "In studies with already-known, synthetic phylogenies, the method we use
> gives an accurate tree. By extrapolation, this method applied to
> real-world data also gives an accurate estimate of the true phylogeny."
>