Human and ape phylogeny

B.J.Tindall bti at DSMZ.DE
Sun Apr 6 12:11:21 CDT 2003

The problem is the way terms have come to be used:

Cain and Harrison used the term "phenetic" in the sense of "natural" as
discussed by Gilmour - it is a way of handling data (which may be
phenotypic or genotypic). It is difficult to see how a data set can itself
be phenetic; it can only be handled in a phenetic way (by overall
similarity). They also discussed patristic and phyletic relationships.

Harrison came to realise that people were equating phenetic with
phenotypic, by analogy with genetic = genotypic (this is not what they
meant).

"Cladistic" was coined by Huxley to contrast with "gradistic", but he
didn't really define it. The term has come to be associated with Hennigian
methodology. However, you also find the term "phylogenetic" being used.

It should be remembered that a phenetic approach may also give us the
phylogeny if parallelism or convergent evolution (for example) doesn't play
a significant role. Remember that in an earlier e-mail we were also told
that the "phylogenetic" evaluation of sequence data would be falsified if
there was gene transfer taking place. My impression is that a phenetic
approach has a built-in assumption about parsimony of evolution, which, of
course, causes problems when there is extensive homoplasy.

Some gene or protein sequence alignment programs (Clustal, for example,
uses UPGMA) make use of phenetic methods in order to create an alignment.
Remember also that the program may assume positional homology, but it is
the person who puts in the data who is telling the program that they are
comparing homologous (orthologous) sequences - put in analogous sequences
which show some degree of similarity and you still get a dendrogram when
you generate the "tree".
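
To make that last point concrete, here is a minimal sketch in Python of
UPGMA run on simple p-distances. It is not Clustal's code; the function
names and the four toy sequences are made up for illustration, and "X"
stands in for a sequence that is assumed not to be homologous to the
others. The clustering step never asks whether the input is homologous.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(seqs):
    """Cluster by overall similarity; return a nested-tuple dendrogram."""
    names = list(seqs)
    trees = {i: name for i, name in enumerate(names)}  # cluster id -> subtree
    sizes = {i: 1 for i in trees}                      # cluster id -> no. of tips
    dist = {(i, j): p_distance(seqs[names[i]], seqs[names[j]])
            for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    while len(trees) > 1:
        i, j = min(dist, key=dist.get)  # closest pair of clusters
        for k in trees:
            if k in (i, j):
                continue
            d_ik = dist.pop((min(i, k), max(i, k)))
            d_jk = dist.pop((min(j, k), max(j, k)))
            # Size-weighted mean = unweighted mean over all tip pairs.
            dist[(k, next_id)] = ((sizes[i] * d_ik + sizes[j] * d_jk)
                                  / (sizes[i] + sizes[j]))
        del dist[(i, j)]
        trees[next_id] = (trees.pop(i), trees.pop(j))
        sizes[next_id] = sizes.pop(i) + sizes.pop(j)
        next_id += 1
    return next(iter(trees.values()))


# Hypothetical toy data: A, B, C are similar; X is an unrelated sequence.
toy = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAA",
    "C": "ACGTTCGAAA",
    "X": "TTGACCATGG",
}
print(upgma(toy))
```

X is placed in the dendrogram like any other sequence, simply as the
least similar member. Whether that placement means anything depends
entirely on the homology assumption supplied by the person running it.
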

I have seen a website in which phylogeny is defined as "being based on
sequence data" and taxonomy "being based on phenotypic data". Sorry I can't
recall the address. I confess I find the way the same terms are used by
different people to be rather confusing.


At 10:07 5.4.2003 +1200, David Orlovich wrote:
>I think you're mixing two meanings for the term "phenetic".  On the one
>hand, "phenetic" could be used to mean "of the phenotype" or something
>like that - in which case measuring a phenetic character would simply
>mean measuring something of an organism's phenotype - basically any
>character you measure, for whatever purpose, is phenetic.  I think even
>DNA sequence data is phenetic in the sense that it is something
>specific to the organism that you measure.
>On the other hand, "phenetic" is used to mean certain methods of
>analysis based on overall similarity, some of which I think were
>largely discredited as a set of non-phylogenetic methods - isn't this
>partly what the "cladistics wars" were about?
>It means that using the word "phenetic" is politically incorrect and
>associated with a non-cladistic way of thinking, but in reality (well,
>my reality!) all characters are phenetic - it's the way they are
>analysed that makes the difference.
>You imply that one can do a cladistic analysis in two ways - either
>with polarised characters (i.e. only choosing synapomorphies) or with
>phenetic data (i.e. without any pre-selection for only 'good'
>characters).  I don't think many people explicitly do cladistic
>analyses the first way any more.  To me, it is circular reasoning to
>polarise characters before doing a phylogenetic analysis.  Constructing
>a most parsimonious tree, and rooting it, defines the direction of
>evolution and sorts out the synapomorphies from the plesiomorphies and
>parallelisms - it is an inherent and intrinsic part of the tree
>construction.  Thus, in doing a cladistic analysis, one is testing the
>homology of the characters that were measured and defining which ones
>are synapomorphies and which ones are homoplasy  - we make an
>hypothesis of homology by measuring a character in different taxa and
>we test this hypothesis by doing a cladistic analysis.  The more
>non-homologous characters I put into a cladistic analysis the worse the
>tree will be - reflected in a lower consistency index (i.e. more
>homoplasy (noise)) in the tree.  It could be seen as subjective to
>select characters that will not be homoplasious to get a more workable
>phylogenetic hypothesis, but some people even embrace this idea and use
>techniques like successive weighting (not without critics) to weight
>characters with high consistency indices (from prior cladistic
>analyses) in subsequent analyses - thus generating a tree that is
>better supported by the data.  Another way to get around the problem of
>only 'wanting' non-homoplasious characters is to choose a LOT of
>characters in the hope that the homoplasies will be swamped by the
>synapomorphies and result in a robust tree.  It's funny that this is
>exactly the same way to make a phenetic 'tree' (say a UPGMA dendrogram)
>approach a phylogeny - and the UPGMA dendrogram has the advantage that
>it is rooted!
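
For reference, the consistency index mentioned in the quoted paragraph
can be sketched in a few lines of Python. This is only an illustration
under stated assumptions: a rooted, fully bifurcating tree written as
nested tuples, unordered characters coded one letter per taxon, and
function names and a toy matrix that are invented here. Steps are counted
with Fitch's parsimony algorithm, and CI is taken as the minimum possible
number of steps divided by the number of steps required on the tree.

```python
def fitch_steps(tree, states):
    """Return (possible root states, steps) for one character on a tree."""
    if isinstance(tree, str):  # a tip: its observed state, no change yet
        return {states[tree]}, 0
    left, left_steps = fitch_steps(tree[0], states)
    right, right_steps = fitch_steps(tree[1], states)
    shared = left & right
    if shared:  # children agree: no extra change needed at this node
        return shared, left_steps + right_steps
    return left | right, left_steps + right_steps + 1


def consistency_index(tree, matrix):
    """matrix maps taxon -> string of states, one column per character."""
    n_chars = len(next(iter(matrix.values())))
    min_steps = 0
    observed_steps = 0
    for c in range(n_chars):
        column = {taxon: row[c] for taxon, row in matrix.items()}
        min_steps += len(set(column.values())) - 1  # k states need k-1 changes
        observed_steps += fitch_steps(tree, column)[1]
    if observed_steps == 0:
        raise ValueError("no variable characters in the matrix")
    return min_steps / observed_steps


# Hypothetical example: two characters fit the tree, the third conflicts.
toy_tree = (("A", "B"), ("C", "D"))
toy_matrix = {"A": "110", "B": "111", "C": "001", "D": "000"}
print(consistency_index(toy_tree, toy_matrix))
```

In the toy matrix the third character needs two changes on this tree
where one would suffice, so it pulls the index below 1. Adding more
characters of that kind lowers it further.
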
>The apparent circularity of the above scenario can be avoided in
>another way - by generating the phylogeny from a data set that is not
>the same as the morphological data in which you wish to test homology.
>I see this as the main scientific reason for doing phylogenetics on DNA
>sequence data.  One can generate an hypothesis of phylogeny (either by
>cladistics or other phylogenetic methods like maximum likelihood) and
>then test the homology of the morphological characters by mapping them
>on to the molecular phylogeny.  It breaks the link of the characters
>being used to generate the phylogeny being the same characters on which
>to test homology.
>DNA sequence data is in the same boat as morphological data when it
>comes to generating an hypothesis of homology to test by a phylogenetic
>analysis.  The advantage is that it can often be less subjective - the
>hypothesis of homology is generated by an alignment algorithm (I think
>usually simply a pairwise cluster analysis) where there are only 4 or 5
>possible character states - and every character is more or less the
>same.  Taking into account codon positions, secondary structure and
>transition/transversion ratio where possible makes for an even better
>hypothesis of homology.
>So ... where does this leave your problem:
>> "No a priori
>> judgements were made as to the primitive or derived condition of
>> characters". My reading of that statement is that the characters were
>> not 'cladistic' in the sense of each standing as proposed
>> synapomorphies. Instead they were phenetic.
>I think it's not an issue - in doing their phylogenetic analysis, they
>are hypothesising that the characters they measured will be
>synapomorphies for particular clades in the tree, and they are testing
>the hypothesis by doing the cladistic analysis.  Rooting the tree will
>polarise the characters and indicate which character state changes are
>synapomorphies and which are plesiomorphy or convergence.  In saying
>that the authors made "No a priori [judgments] ... as to the primitive
>or derived condition of characters" I take this to mean that they allow
>the cladistic analysis and rooting to determine the direction of
>evolution.  As far as cladistic vs phenetic is concerned, all
>characters are phenetic - it's all in the analysis.
>You also asked:
>> Am I correct to view this paper as a 'cladistic' analysis of phenetic
>> characters with an arbitrary rooting of one of the taxa being
>> analyzed[?]
>Arbitrary?  The ingroup is assumed to be monophyletic with respect to
>the outgroup.  Did the authors meet this criterion?  An issue could
>arise where the outgroup taxa/taxon are/is on a long branch (i.e. not
>much in common with the ingroup): many of the character state
>changes could then be autapomorphic for the outgroup, thus creating
>uncertainty in the position of attachment to the rest of the tree (the
>same problem applies to ingroup taxa of course).  Where there are lots
>of ingroup taxa, this might only affect the relative position of a few
>of the more 'basal' ingroup taxa, whereas the clades more distant from
>the outgroup will be relatively unaffected by the position of the
>outgroup (I think this is why midpoint rooting is considered to be a
>valid alternative - could be wrong though).
>David Orlovich.
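
On the midpoint rooting mentioned at the end of the quoted message, a
minimal Python sketch may help. It assumes an unrooted tree with branch
lengths held as a symmetric adjacency dictionary; the node names, branch
lengths and function names are invented for illustration. It finds the
longest tip-to-tip path and places the root halfway along it.

```python
def build_tree(edges):
    """Turn (node, node, length) triples into a symmetric adjacency dict."""
    adj = {}
    for u, v, length in edges:
        adj.setdefault(u, {})[v] = length
        adj.setdefault(v, {})[u] = length
    return adj


def farthest(adj, start):
    """Return (node, distance, path) for the node farthest from start."""
    best = (start, 0.0, [start])
    stack = [(start, None, 0.0, [start])]
    while stack:
        node, parent, dist, path = stack.pop()
        if dist > best[1]:
            best = (node, dist, path)
        for nxt, length in adj[node].items():
            if nxt != parent:
                stack.append((nxt, node, dist + length, path + [nxt]))
    return best


def midpoint_root(adj):
    """Return (u, v, offset): the root lies on edge u-v, offset from u."""
    tip_a, _, _ = farthest(adj, next(iter(adj)))
    _, longest, path = farthest(adj, tip_a)  # longest tip-to-tip path
    half, walked = longest / 2, 0.0
    for u, v in zip(path, path[1:]):
        length = adj[u][v]
        if walked + length >= half:
            return u, v, half - walked
        walked += length
    raise ValueError("tree has no branches")


# Hypothetical tree: a short ingroup (A, B, C) and a long outgroup branch.
toy = build_tree([
    ("A", "n1", 1.0), ("B", "n1", 2.0), ("n1", "n2", 1.0),
    ("C", "n2", 1.0), ("Out", "n2", 6.0),
])
print(midpoint_root(toy))
```

With a long outgroup branch, as in this toy tree, the midpoint falls on
that branch. The result depends only on branch lengths, not on where
the outgroup's character states happen to attach it, and it is only as
good as the assumption that rates are roughly equal across the tree.
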

* Dr.B.J.Tindall      E-MAIL bti at                           *
* DSMZ-Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH *
* Mascheroder Weg 1b, D-38124 Braunschweig, Germany                *
* Tel.: ++ 531 2616 0 (general)                                    *
* Tel.: ++ 531 2616 224 (direct)                                   *
* Fax:  ++ 531 2616 418                                            *
