[Taxacom] Why character-tracking doesn't happen?
Mario Blanco
mblanco at flmnh.ufl.edu
Sun Sep 14 20:38:04 CDT 2008
Well, using total evidence does not mean that you should simply throw
all your data into the analysis even if you don't know what is
homologous to what. If the homology of a particular character (either
morphological or molecular) is uncertain, then it is better to exclude
it from the analysis. Hopefully the rest of the characters will provide
sufficient phylogenetic signal. And you can still run other analyses
with the problematic character in alternative positions to compare
results; but if you have too many bases of ambiguous alignment, then the
number of additional analyses necessary can become too large to be
practical.
Quantitative-continuous data (e.g., many measurements) are particularly
difficult to code into discrete character states (especially in sessile
organisms that show extensive phenotypic plasticity, like plants), and
are usually left out of cladistic analyses. Different ways of dealing
with quantitative-continuous data have been proposed, but as far as I
know there is no agreed-upon best solution. This is admittedly a big
limitation of current cladistic methods.
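
To make the coding problem concrete, here is a minimal sketch of one simple approach, a bare-bones form of gap coding, in which a new state starts wherever neighbouring sorted values differ by more than a chosen threshold. The function name, the measurements and the thresholds are all made up for illustration, and this is not offered as the recommended method.

```python
def gap_code(values, min_gap):
    """Assign a discrete state to each value.

    Values are sorted, and a new state begins wherever two neighbouring
    sorted values differ by more than min_gap (an arbitrary threshold).
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    states = [0] * len(values)
    state = 0
    for prev, cur in zip(order, order[1:]):
        if values[cur] - values[prev] > min_gap:
            state += 1
        states[cur] = state
    return states


# Hypothetical mean petal lengths (mm) for five taxa; illustrative only.
petal_length_mm = [4.1, 4.3, 7.9, 8.2, 12.5]

states_narrow = gap_code(petal_length_mm, min_gap=2.0)
states_wide = gap_code(petal_length_mm, min_gap=4.0)
print(states_narrow)
print(states_wide)
```

The two thresholds give different codings for the same measurements (three states versus two), which is the kind of arbitrariness that makes these data hard to use.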
-------- Original Message --------
Subject: Re: [Taxacom] Why character-tracking doesn't happen?
Date: Mon, 15 Sep 2008 09:17:12 +1000
From: Bob Mesibov <mesibov at southcom.com.au>
To: mblanco at flmnh.ufl.edu
CC: TAXACOM <taxacom at mailman.nhm.ku.edu>
Gotta run, but:
Mario Blanco wrote:
"This is analogous to the situation in which you score morphological
data (e.g., color of petals), and then you have a species with no petals.
For this species, the character state here becomes "missing", and you
effectively create a gap in the matrix (and indels are simply gaps in
an aligned matrix)."
Sorry I wasn't clear enough. Look at the three aligned sequences below.
The characters analysed are the numbered columns, not the nucleotide
positions. The 5 character states are A,G,C,T and -. No problem.
123456789
AATGATATA
AGT--GCTA
AGTGACGTA
Now watch. Imagine sequence 3 was never found. The alignment is now
123456789
AATGATATA
AGTGCTAxx
See what happened? The 4 nucleotides at 'positions' 6-9 in the second
sequence are now at 'positions' 4-7. They've shifted from character to
character, and their positions in the native sequence are definitely not
themselves characters.
I admit this is a contrived example, but I'm pretty sure this often
happens with indels in multiple sequence alignments. Column characters
and their states change depending on which sequences are analysed.
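
The shift can be checked mechanically. The sketch below takes the second sequence as it appears in the two alignments above and lists which column each of its nucleotides occupies. The helper name is made up for illustration, and both '-' and 'x' are assumed to mean "no nucleotide here".

```python
def residue_columns(row, gap_chars="-x"):
    """Return the 1-based alignment column of each nucleotide in a row."""
    return [col for col, ch in enumerate(row, start=1) if ch not in gap_chars]


# Sequence 2 as aligned with and without sequence 3 (from the example above).
three_way = "AGT--GCTA"
two_way = "AGTGCTAxx"

before = residue_columns(three_way)
after = residue_columns(two_way)

# (nucleotide index in the native sequence, old column, new column)
shifted = [
    (i, b, a)
    for i, (b, a) in enumerate(zip(before, after), start=1)
    if b != a
]
print(shifted)
```

The first three nucleotides stay in columns 1-3, while the fourth to seventh move from columns 6-9 to columns 4-7.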
Some molecular phylogeny gurus suggest that uncertain sections of
alignments shouldn't be used in an analysis, just as some morphological
phylogeny gurus suggest that uncertain homologies should be omitted.
Don't know where that fits with the 'total evidence' philosophy...
--
Dr Robert Mesibov
Honorary Research Associate
Queen Victoria Museum and Art Gallery and
School of Zoology, University of Tasmania
Home contact: PO Box 101, Penguin, Tasmania, Australia 7316
(03) 64371195; 61 3 64371195
http://www.qvmag.tas.gov.au/mesibov.html