[Taxacom] Why character-tracking doesn't happen?

Mario Blanco mblanco at flmnh.ufl.edu
Sun Sep 14 20:38:04 CDT 2008


Well, using total evidence does not mean that you should simply throw 
all your data into the analysis even if you don't know what is 
homologous to what.  If the homology of a particular character (either 
morphological or molecular) is uncertain, then it is better to exclude 
it from the analysis. Hopefully the rest of the characters will provide 
sufficient phylogenetic signal. And you can still run other analyses 
with the problematic character in alternative positions to compare 
results; but if you have too many bases of ambiguous alignment, then the 
number of additional analyses necessary can become too large to be 
practical.

Quantitative-continuous data (e.g., many measurements) are particularly 
difficult to code into discrete character states (especially in sessile 
organisms that show extensive phenotypic plasticity, like plants), and 
are usually left out of cladistic analyses. Different ways of dealing 
with quantitative-continuous data have been proposed, but as far as I 
know there is no agreed-upon best solution. This is admittedly a big 
limitation of current cladistic methods.
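
To make the coding problem concrete, here is a minimal sketch of one of the
simpler proposals, usually called gap coding: sort the measurements and start
a new state wherever two neighbouring values differ by more than a chosen
threshold. ("Gap" here means a break between measurements, not an alignment
gap.) The function name, the petal lengths and the 2.0 mm threshold are all
made up for illustration; the result depends heavily on the threshold, which
is part of why there is no agreed-upon solution.

```python
def gap_code(values, min_gap):
    """Assign discrete states to continuous values.

    Values are sorted, and a new state begins whenever two consecutive
    sorted values differ by more than min_gap (an arbitrary threshold).
    States are returned in the original order of the input.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    states = [0] * len(values)
    state = 0
    for prev, cur in zip(order, order[1:]):
        if values[cur] - values[prev] > min_gap:
            state += 1  # a break in the distribution starts a new state
        states[cur] = state
    return states


# Hypothetical mean petal lengths (mm) for six taxa.
petal_length_mm = [4.1, 4.3, 9.8, 4.0, 10.2, 15.5]
states = gap_code(petal_length_mm, min_gap=2.0)
print(states)
```

With these invented numbers the six taxa fall into three states; a threshold
of 6.0 mm would instead lump all of them into a single state.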

-------- Original Message --------
Subject: 	Re: [Taxacom] Why character-tracking doesn't happen?
Date: 	Mon, 15 Sep 2008 09:17:12 +1000
From: 	Bob Mesibov <mesibov at southcom.com.au>
To: 	mblanco at flmnh.ufl.edu
CC: 	TAXACOM <taxacom at mailman.nhm.ku.edu>



Gotta run, but:

Mario Blanco wrote:

"This is analogous to the situation in which you score morphological data
(e.g., color of petals), and then you have a species with no petals.
For this species, the character state here becomes "missing", and you
effectively create a gap in the matrix (and indels are simply gaps in an
aligned matrix)."

Sorry I wasn't clear enough. Look at the three aligned sequences below.
The characters analysed are the numbered columns, not the nucleotide
positions. The 5 character states are A,G,C,T and -. No problem.

123456789
AATGATATA
AGT--GCTA
AGTGACGTA

Now watch. Imagine sequence 3 was never found. The alignment is now

123456789
AATGATATA
AGTGCTAxx

See what happened? The 4 nucleotides at 'positions' 6-9 in the second
sequence are now at 'positions' 4-7. They've shifted from character to
character, and their positions in the native sequence are definitely not
themselves characters.
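
The shift can be checked mechanically. The sketch below takes the two aligned
versions of sequence 2 exactly as written above and records which column
holds each residue of the ungapped sequence. It does not perform any
alignment. The function name is hypothetical, and treating 'x' as padding
equivalent to a gap is an assumption about the notation used in the example.

```python
def column_of_each_residue(aligned_row, gap_chars="-x"):
    """Return the 1-based alignment column of each residue.

    Entry i of the result is the column that holds residue i+1 of the
    ungapped (native) sequence. Gap and padding characters are skipped.
    """
    return [col for col, ch in enumerate(aligned_row, start=1)
            if ch not in gap_chars]


seq2_with_seq3 = "AGT--GCTA"     # sequence 2 in the three-sequence alignment
seq2_without_seq3 = "AGTGCTAxx"  # sequence 2 once sequence 3 is left out

before = column_of_each_residue(seq2_with_seq3)
after = column_of_each_residue(seq2_without_seq3)

# (native position, column before, column after) for residues that moved.
moved = [(i, b, a)
         for i, (b, a) in enumerate(zip(before, after), start=1)
         if b != a]
print(moved)
```

Native residues 4-7 of sequence 2 (G, C, T, A) are reported as moving from
columns 6-9 to columns 4-7, while residues 1-3 stay put.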

I admit this is a contrived example, but I'm pretty sure this often
happens with indels in multiple sequence alignments. Column characters
and their states change depending on which sequences are analysed.

Some molecular phylogeny gurus suggest that uncertain sections of
alignments shouldn't be used in an analysis, just as some morphological
phylogeny gurus suggest that uncertain homologies should be omitted.
Don't know where that fits with the 'total evidence' philosophy...
-- 
Dr Robert Mesibov
Honorary Research Associate
Queen Victoria Museum and Art Gallery and
School of Zoology, University of Tasmania
Home contact: PO Box 101, Penguin, Tasmania, Australia 7316
(03) 64371195; 61 3 64371195
http://www.qvmag.tas.gov.au/mesibov.html






