More on the 'cladistics' of sequences
pierre.deleporte at UNIV-RENNES1.FR
Mon Jun 7 13:58:47 CDT 2004
At 18:34 06/06/2004 -1100, John Grehan wrote:
>I did not say character states are determined in the absence of knowledge
>of phylogenetic relationships! To the contrary, I am always referring to
>evaluation of each character with respect to an outgroup before the analysis.
And you now know that this is exactly what programs do when they perform
cladistic analysis of molecular data, as for any other data. I know that
you know that. Via the outgroup criterion, each and every molecular
character state (i.e. a given base at a given site in aligned sequences)
has an a priori polarization in putative plesio-apomorphy (for each and
every character = site).
And you also know that there is nothing like "rooting after the analysis",
or "rooting during the analysis", because you know that one can perform the
analysis the following way and get exactly the same result:
- first, root optimally all possible topologies according to the data and
- second, pick out the optimal topology (maximizing homology).
You'll get exactly the same result as by applying the following procedure:
- first, pick out the optimal unrooted topology (maximizing homology)
- second, root it optimally using the outgroup criterion.
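To make the equivalence concrete, here is a minimal sketch in Python. It is not what PAUP does internally. It assumes unordered, reversible (Fitch) parsimony, under which tree length does not depend on where the root is placed. The four-taxon alignment, the taxon names O, A, B, C and the function names are all invented for illustration.

```python
# Toy alignment (made-up data): O is the outgroup, A, B, C the ingroup.
ALIGNMENT = {"O": "AACGT", "A": "AACGA", "B": "ATCTA", "C": "ATGTA"}

# Each of the three unrooted four-taxon topologies written two ways:
# rooted on an arbitrary internal branch, and rooted on the outgroup branch.
TOPOLOGIES = {
    "OA|BC": {"arbitrary": (("O", "A"), ("B", "C")),
              "outgroup": ("O", ("A", ("B", "C")))},
    "OB|AC": {"arbitrary": (("O", "B"), ("A", "C")),
              "outgroup": ("O", ("B", ("A", "C")))},
    "OC|AB": {"arbitrary": (("O", "C"), ("A", "B")),
              "outgroup": ("O", ("C", ("A", "B")))},
}


def fitch(tree, states):
    """Fitch pass for one site: return (state set at this node, changes below it)."""
    if isinstance(tree, str):  # a leaf is just a taxon name
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:  # the children agree on a state: no extra change needed
        return shared, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


def tree_length(tree, alignment):
    """Sum the Fitch changes over all sites of the alignment."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {taxon: seq[i] for taxon, seq in alignment.items()})[1]
        for i in range(n_sites)
    )


def lengths(kind):
    """Length of every topology, in its 'arbitrary' or 'outgroup' rooted form."""
    return {name: tree_length(forms[kind], ALIGNMENT)
            for name, forms in TOPOLOGIES.items()}


def best(kind):
    """Name of the shortest topology for the chosen rooting."""
    scores = lengths(kind)
    return min(scores, key=scores.get)


print(lengths("arbitrary"))
print(lengths("outgroup"))
print(best("arbitrary"), best("outgroup"))
```

Both ways of writing each topology get the same length, so picking the optimum before or after rooting selects the same tree. With irreversible or otherwise asymmetric step costs the length would depend on the root, and this sketch would not apply as written.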
And you also know that the program begins by discarding cladistically
non-informative characters. This is the first thing it does. Thus, only
cladistically putatively informative characters remain in the analysis,
i.e. characters with a putative plesiomorphic state and apomorphic ones. The
data matrix is effectively reduced to these characters, and they are not
treated phenetically (i.e. grouping taxa on the basis of overall
similarity). And only apomorphies play a role in polarizing the tree:
this is what optimal outgroup rooting does.
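As an illustration of that first filtering step, here is a minimal sketch in Python. It uses the usual equal-cost definition (a site is informative when at least two states each occur in at least two taxa) and ignores gaps and ambiguity codes. The alignment and the function name are invented for illustration and do not come from any particular program.

```python
from collections import Counter


def informative_sites(alignment):
    """Return the indices of the parsimony-informative columns.

    Assumption: a column is informative when at least two different
    states are each carried by at least two taxa (equal-cost parsimony).
    """
    sequences = list(alignment.values())
    keep = []
    for i in range(len(sequences[0])):
        counts = Counter(seq[i] for seq in sequences)
        states_seen_twice = sum(1 for n in counts.values() if n >= 2)
        if states_seen_twice >= 2:
            keep.append(i)
    return keep


# Made-up toy data: one outgroup and three ingroup taxa.
alignment = {
    "outgroup": "AACGT",
    "taxon_A":  "AACGA",
    "taxon_B":  "ATCTA",
    "taxon_C":  "ATGTA",
}

kept = informative_sites(alignment)
reduced = {taxon: "".join(seq[i] for i in kept) for taxon, seq in alignment.items()}
print(kept)
print(reduced)
```

Here the constant site, the site with a single deviating ingroup taxon and the site where only the outgroup differs are dropped; the two remaining sites are the only ones that can discriminate between topologies.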
Hence, all outgroup-polarized molecular characters are "cladistic" in your
(very peculiar) sense of the term, i.e. they are individually,
putatively polarized a priori, via the outgroup criterion, and the analysis
is performed the classic cladistic way for molecules just like for
morphology. Same criteria, same procedures.
Of course Hennig did not use a computer, and did not systematically apply
(via an algorithm) his "auxiliary principle" of preferring interpretation
in terms of homology rather than homoplasy (which is implemented through the
"congruence criterion" for choosing the optimal unrooted topology).
But this doesn't make modern cladistics "non-Hennigian" in this respect,
and molecularists "know" their ingroups and outgroups just as the
morphologists do, no more, no less. They face the same problems in this
respect (the possible problem of multiple rootings for multiple outgroups,
whose solution consists in enlarging the phylogenetic scope of the analysis
and using more data). I still cannot understand why you persist in accusing
molecular cladistic phylogeny of being non-cladistic.
>One can document each character for the outgroup and ingroup. By this
>documentation it is possible for each character to be independently
>verified or refuted by another individual
This you can do, exactly this, with molecular data as treated by modern
programs. Just try it, as I have repeatedly suggested to you. But apparently
you don't try... Why don't you try and verify for yourself that this is all
the same approach? The same logic giving the same result?
I admit that the fact that everybody tells you the same thing will not
change your mind in the slightest, for it's quite imaginable that the
whole community of specialists in morphological and molecular cladistic
analysis on earth is wrong and you are right. Science is not a democracy.
But why don't you try and verify? Because it's also imaginable that you are
wrong, and this you can check for yourself:
- take some molecular data
- root them a priori character by character via the outgroup criterion
- throw away putatively cladistically non-informative characters (obvious
autapomorphies and unchanging characters)
- find the optimal tree your own cladistic way
- try with the program PAUP using the same assumptions (costs of changes...)
- check if you get the same result.
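The "by hand" part of these steps can be sketched as follows, in Python, for a tiny case with a single outgroup. This is only an illustration under stated assumptions: the alignment, the taxon names and the function name are invented, every state absent from the outgroup is treated as putatively apomorphic, and states carried by one ingroup taxon only or by the whole ingroup are set aside as uninformative about ingroup relationships.

```python
from collections import Counter

# Made-up toy data: O is the outgroup, A, B, C the ingroup.
ALIGNMENT = {"O": "AACGT", "A": "AACGA", "B": "ATCTA", "C": "ATGTA"}


def clade_support(alignment, outgroup):
    """Count putative synapomorphies supporting each ingroup subgroup.

    Polarization follows the outgroup criterion: at each site, any state
    not found in the outgroup is treated as putatively apomorphic.
    """
    ingroup = [taxon for taxon in alignment if taxon != outgroup]
    support = Counter()
    for i, outgroup_state in enumerate(alignment[outgroup]):
        carriers = {}  # apomorphic state -> set of ingroup taxa carrying it
        for taxon in ingroup:
            state = alignment[taxon][i]
            if state != outgroup_state:
                carriers.setdefault(state, set()).add(taxon)
        for taxa in carriers.values():
            # Skip autapomorphies (one taxon) and states shared by the
            # whole ingroup: neither discriminates between ingroup trees.
            if 1 < len(taxa) < len(ingroup):
                support[frozenset(taxa)] += 1
    return support


support = clade_support(ALIGNMENT, "O")
for taxa, n in support.most_common():
    print(sorted(taxa), n)
```

With this toy data the only supported group is B + C, with two putative synapomorphies, which gives the tree (O,(A,(B,C))). The remaining step is the one in the list: run the same data through the program under the same cost assumptions and compare.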
>No, but if one cannot polarize the characters and determine which are
>potential synapomorphies before the analysis then the implication is that
>such individuals do not know their group very well.
But this is exactly what the program does, and of course the program itself
does not even know whether the data you feed it are morphological or
molecular. How could molecular cladistics be different from morphological
cladistics? Any molecular analyst can tell a priori what the
potential apomorphies are: just as for morphology, they are the character
states not present in the outgroup(s). You said "potential": this is the
right term (or "putative"), because some putative plesiomorphies in the
outgroup(s) may finally turn out to be autapomorphies of this outgroup. But
this doesn't change anything about the optimal rooting of the optimal
topology for a given data set.
If you're not convinced, just try it, once again, and check.
>I would start with some critical evaluation before the analysis to restrict
>the data set to potential synapomorphies.
All characters with more than one state have potential plesio-apomorphy
polarity. The outgroup provides the support for polarizing. And the program
makes no use of cladistically potentially non-informative characters. So,
where is the problem?
>Then the evaluation after the
>analysis can take place with respect to one's initial determination.
This is what contemporary computer-assisted cladistic analysis, followed
by secondary checking of optimal scenarios for characters, is all about.
Once more, where is there any problem with molecular data?
CNRS UMR 6552 - Station Biologique de Paimpont
F-35380 Paimpont FRANCE
Telephone: 02 99 61 81 66
Fax: 02 99 61 81 88