More on the 'cladistics' of sequences
pierre.deleporte at UNIV-RENNES1.FR
Thu Jun 10 16:10:37 CDT 2004
At 21:51 09/06/2004 -1100, John Grehan wrote:
>I am not sure about the need for several outgroups
You don't need several outgroups. It's simply better, because if the
outgroups don't all root at the same place on your ingroup, this is an
indication that you have made a mistake. Your analysis is telling you that
your putative outgroups cannot all be "out" together. See PAUP's manual, and
also Farris 1972 as hinted off list by Jan de Laet.
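To make the multiple-outgroup check concrete, here is a minimal Python
sketch, not what PAUP itself does internally. It takes an unrooted tree as
an edge list and asks whether a single branch separates the ingroup from
all the putative outgroups. The taxon names (A, B, C, X, Y), the internal
node labels (n1, n2, n3) and both function names are hypothetical, chosen
only for illustration.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Turn a list of (node, node) branches into an adjacency map."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def leaves_on_side(adj, start, blocked):
    """Leaves reachable from `start` without crossing the branch to `blocked`."""
    seen, stack, found = {start, blocked}, [start], set()
    while stack:
        node = stack.pop()
        if len(adj[node]) == 1:  # a terminal taxon
            found.add(node)
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return found


def outgroups_root_on_one_branch(edges, ingroup):
    """True if cutting one branch leaves exactly the ingroup on one side."""
    adj = build_adjacency(edges)
    ingroup = set(ingroup)
    return any(
        leaves_on_side(adj, u, v) == ingroup
        for u in adj
        for v in adj[u]
    )


# Hypothetical tree 1: outgroups X and Y both attach on the same branch.
tree_ok = [("A", "n1"), ("B", "n1"), ("n1", "n2"), ("C", "n2"),
           ("n2", "n3"), ("X", "n3"), ("Y", "n3")]

# Hypothetical tree 2: putative outgroup Y falls inside the ingroup.
tree_bad = [("A", "n1"), ("Y", "n1"), ("n1", "n2"), ("B", "n2"),
            ("n2", "n3"), ("C", "n3"), ("X", "n3")]

print(outgroups_root_on_one_branch(tree_ok, {"A", "B", "C"}))   # True
print(outgroups_root_on_one_branch(tree_bad, {"A", "B", "C"}))  # False
```

In the second tree no single branch isolates A, B and C, which is exactly
the warning that one of the putative outgroups is not "out".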
> if one chooses a sufficiently broad single outgroup.
By "single", you mean a monophyletic group, or a possibly paraphyletic
arrangement of taxa?
>Thus for the orangutan-human synapomorphies the context I am looking at
A "context", or a "single group", and what do you mean by "group"?
>is ALL other primate species collectively and that is quite a lot of species.
Indeed, but likely not a monophyletic group, rather a paraphyletic
arrangement, thus simply a series of "out" species or groups of species.
Hence you do not have a single monophyletic outgroup but a lot of primate
groups putatively outside your ingroup.
But I won't blame you for using as many taxa as possible, in and out. The
bigger the better.
Note that you can do that with molecules: always use as much relevant
evidence as available.
> Most of the characters stand up pretty well in that regard,
"Stand up"???! Trying to figure out your method from your other posts I now
presume that you mean "their state in the outgroups is uniform"?
>and even those of lesser distribution
Ha-ha! Interesting... So some of your characters do not have a uniform
character state in all putative outgroup taxa? Then your method would
finally be the standard cladistic one as implemented in current programs?
But in this case, why do you reject these programs by calling them
"phenetic"?... Very, very puzzling indeed...
> may be supportable (e.g. lack of ischial callosities which is unique to
> orangutans and humans among Old World monkeys and the apes could be
> reasonably treated as an apomorphy rather than as a plesiomorphy
> inherited all the way from the split with New World monkeys which lack
> the callosities).
You are now describing what all cladistic programs do!!! Astonishing... Do
you really know what the programs do? I must assume that you simply don't
know (or you inadvertently forgot). But you reject them?
> This is just an observation, not necessarily a criticism of using
> several outgroups.
Your "observation" consists in describing your method, and I must
acknowledge that your method is the one implemented by the programs. I can't
figure out the slightest reason why you reject these programs...
I can take it another way:
your "callosities" character is a "phenetic" character according to your
highly personal use of the term "phenetic" (see your previous posts): it
doesn't have a uniform character state in all outgroups. It's sometimes
present, sometimes absent, how can you know the plesiomorphic state for
sure? But you decide to use it anyway, and optimize the whole topology for
all characters: certainly according to information gathered from other
characters, you thus accept two changes for this particular character
instead of the minimum possible of one change, according to the slightly
less parsimonious scenario (and corresponding topologies):
absent --> present --> absent.
The method you are implementing is exactly what the programs do. They do it
for you. They root on outgroups, and prefer the optimal overall topology in
case of ambiguity.
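As an illustration of that counting, here is a small Python sketch of
Fitch parsimony for one binary character. It is a toy under stated
assumptions, not the code of any actual program. The character scoring
follows the quoted claim (callosities absent in New World monkeys,
orangutans and humans, present in the other taxa listed), and both
topologies are hypothetical examples written as nested tuples, rooted on
the New World monkeys.

```python
def fitch(tree, states):
    """Return (possible ancestral states, minimum changes) for a rooted
    binary tree given as nested 2-tuples with taxon names at the tips."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    common = left_set & right_set
    if common:
        # The two subtrees can agree on a state: no extra change here.
        return common, left_cost + right_cost
    # They cannot agree: one change is needed on a descending branch.
    return left_set | right_set, left_cost + right_cost + 1


# Ischial callosities scored as 0 = absent, 1 = present (assumed scoring).
callosities = {
    "NWM": 0, "OWM": 1, "gibbon": 1, "chimp": 1,
    "gorilla": 1, "orangutan": 0, "human": 0,
}

# Hypothetical topology with orangutan and human as sister taxa.
orang_human = ("NWM", ("OWM", ("gibbon",
               (("chimp", "gorilla"), ("orangutan", "human")))))

# Hypothetical alternative with chimp and human as sister taxa.
chimp_human = ("NWM", ("OWM", ("gibbon",
               ("orangutan", ("gorilla", ("chimp", "human"))))))

print(fitch(orang_human, callosities)[1])  # 2
print(fitch(chimp_human, callosities)[1])  # 3
```

On the first topology the character costs two changes (absent, then
present, then absent again), on the second three. A program sums such
counts over all characters and prefers the topology with the lowest total.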
The fact that you are apparently "computing" everything in your brain
instead of using these very convenient programs changes nothing about the
logic. And you persist in rejecting these programs... Fascinating...
>Similarly, in examining the phylogenetic relationships of a single genus
>of ghost moths comprising about 12 species I am using the entire family as
Hence, once again, plenty of outgroups indeed (species, monophyletic groups
of species, including their possible internal polymorphism for some
characters I presume... don't tell me you overlook this possible complexity).
>The family comprises 500 species and while I have not looked at every one
>I have at least endeavored to look at most, and eventually all, genera.
This makes a lot of outgroups. Do they all fit unambiguously outside your
ingroup, i.e. are they connected to your ingroup by a single branch? If not,
you have likely made a mistake and some outgroup species may in fact be
members of the ingroup. This is the point of the "multiple outgroup" approach.
Unless you force the analysis to provide you with only one rooting, i.e.
you yourself boil down, by hand (...by brain...), all these outgroups to a
single ideal taxon with a unique series of character states for all
characters. If yes, then you are implementing the "hypothetical ancestor"
approach. Nothing new, this is a classic, but long abandoned because of its
too heavy burden of arbitrariness (you "invent" an ideal taxon fitting your
guesses instead of simply dealing with the taxa at hand), but still possible
with the programs (just introduce this fictitious taxon as "the" outgroup).
>One thing I have noticed said about morphological synapomorphies is that
>they are either difficult to determine and/or that there is a lot of
The latter is certainly more the case of molecular sequence data (limited
range of possible states).
> I wonder whether the former is a product of degree of familiarity and/or
> the ability to generalize a structure (something that I have found can be
> a real challenge to understanding or recognizing comparability),
Homology decisions are easier for sequences when the alignment is
unambiguous (roughly: few and sparse changes in the sequences, so that
changing sites are embedded in an unambiguous context of homologous
features. This is like a change in a bone when the contiguous skeleton is
identical... the classic "connection criterion", for molecules as for
morphology... once again. Molecules have form, you know...).
>and the latter to the use of too many marginal characters (in the quest
>for large numbers the inclusion of features that might be assumed to be
>comparable rather than demonstrated).
Quest for "large numbers", or simply quest for using all relevant evidence?
Now, reliability is not "written on the data". Think twice before throwing
away "garbage". Particularly when you throw away all molecular data.
>Again these are just observations from a personal point of view and I am
>entirely open to thinking about these matters quite differently from
>people who have undoubtedly many more years of detailed experience than I.
This is great news. But not at all a question of experience in my view. I'd
say rather a question of logic, and of really going and fetching a couple
of nice elementary notions about what the programs really do, and possibly
being eager to try and refute one's personal views, rather than only eager
to protect them. Basic cladistic courses are free, and chewing on them is
CNRS UMR 6552 - Station Biologique de Paimpont
F-35380 Paimpont FRANCE
Telephone: 02 99 61 81 66
Fax: 02 99 61 81 88