[Taxacom] FW: cladistic analysis for morphological characters -- UPGMA is not cladistics

Dan Lahr dlahr at ib.usp.br
Sat Nov 17 08:42:51 CST 2012


I agree with Dick, that is quite an unusual claim. Similarity methods are
the fastest.  If something is so large that it would actually "take too
long" to run UPGMA, then it would probably be impossible to analyze under
ML or Bayesian frameworks, and likely parsimony as well.  Just seems like
the person confused methods.  I have seen instances where people have used
NJ or UPGMA for seriously large datasets, but these are usually
bioinformatics people asking specific questions, and they know the pitfalls
(or should).  It may also be common practice among genetics labs if they are
looking at very recently diverging entities (pathogenic bacterial lineages
perhaps) where using probabilistic models would make very little difference
-- lineages have not had time to saturate different substitution sites, but
you always have the issue of convergent hits.
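
To make the point about speed and saturation concrete, here is a minimal
sketch in Python.  It is only an illustration under stated assumptions: the
function names (p_distance, jc69_distance, upgma) and the four toy sequences
are made up for this example, and Jukes-Cantor is just the simplest
substitution model one could pick.  The naive UPGMA loop below costs roughly
the cube of the number of taxa, which is still trivial next to an ML or
Bayesian tree search.

```python
import math
from itertools import combinations


def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def jc69_distance(p):
    """Jukes-Cantor corrected distance for an observed proportion p."""
    if p >= 0.75:
        raise ValueError("observed distance is saturated under this model")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def upgma(names, dist):
    """Average-linkage clustering.

    dist maps frozenset({a, b}) to a distance for every pair of names.
    Returns the tree as nested tuples of names (branch lengths omitted).
    """
    clusters = {name: 1 for name in names}  # cluster -> number of leaves
    d = dict(dist)
    while len(clusters) > 1:
        # Join the two closest clusters.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: d[frozenset(pair)])
        merged = (a, b)
        na, nb = clusters.pop(a), clusters.pop(b)
        # Distance to the new cluster is the size-weighted average.
        for c in clusters:
            d[frozenset((merged, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        clusters[merged] = na + nb
    return next(iter(clusters))


# Hypothetical toy alignment, invented for this sketch.
seqs = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",
    "C": "ACGTTCGAAT",
    "D": "TCGATCGAAG",
}
names = list(seqs)
dist = {
    frozenset((x, y)): jc69_distance(p_distance(seqs[x], seqs[y]))
    for x, y in combinations(names, 2)
}
tree = upgma(names, dist)
print(tree)
```

For two sequences differing at 1% of sites the corrected and observed
distances are nearly identical, which is the "very little difference" case
for recently diverged lineages.  At 50% observed difference the corrected
distance is much larger, because multiple hits at the same site are hidden
from a raw similarity count.  In the toy data, A and B differ at a single
site and are joined first.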

More interestingly though, you raise an issue that has always bothered me:
people using multiple reconstruction methods.  I'll start by saying I do
not have a strong stance nor the answers to this question.

In principle, it seems philosophically incoherent -- you should choose a
strategy and stick with it, since many of the different analytical methods
are logically incompatible.  It is also naive to think that in some situations
different methods would yield different results, provided you do things
correctly.  And if they end up yielding distinct results, what do you do
with them?  For instance, the biggest strength and at the same time
Achilles heel of Bayesian methods is that you can define "priors", which
are basically distributions of values that you believe, a priori, should
contain your answer.  To simplify, if you were trying to estimate which
exact spot an object would hit after you release it, your prior could say
that the general direction it should move is the floor, because we are
standing on Earth with a gravitational effect.  However, most people define
flat priors (meaning they don't believe a priori that any result is more
likely than another), which in practice makes the analyses equivalent
to ML (this is my interpretation, I may be wrong).  I have done this myself
many times (the power of editors) but I feel like we need to be more
explicit in our logical choices.
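
A small numerical sketch of the flat-prior point, again in Python and again
only under assumptions: the binomial toy problem, the counts (7 successes in
20 trials) and the Beta(2, 8)-shaped "informative" prior are all hypothetical
choices made for illustration, not anything taken from a phylogenetics
package.  What it shows is narrow: with a flat prior the posterior is
proportional to the likelihood, so the posterior mode lands on the ML
estimate.  A Bayesian analysis still reports a whole posterior distribution
rather than one point, and whether a prior counts as "flat" depends on how
the parameter is expressed.

```python
import math


def log_likelihood(theta, k, n):
    """Binomial log-likelihood of k successes in n trials (constant dropped)."""
    return k * math.log(theta) + (n - k) * math.log(1.0 - theta)


def flat_log_prior(theta):
    """Every value of theta is equally plausible a priori."""
    return 0.0


def informative_log_prior(theta):
    """Beta(2, 8)-shaped prior (hypothetical): favours small theta."""
    return 1.0 * math.log(theta) + 7.0 * math.log(1.0 - theta)


def grid_argmax(score, grid):
    """Return the grid point with the highest score."""
    return max(grid, key=score)


grid = [i / 1000 for i in range(1, 1000)]  # theta values in (0, 1)
k, n = 7, 20                               # hypothetical data

ml = grid_argmax(lambda t: log_likelihood(t, k, n), grid)
map_flat = grid_argmax(
    lambda t: log_likelihood(t, k, n) + flat_log_prior(t), grid)
map_informative = grid_argmax(
    lambda t: log_likelihood(t, k, n) + informative_log_prior(t), grid)

print(ml, map_flat, map_informative)
```

With the flat prior the grid search returns the same value as maximum
likelihood (0.35 here, i.e. 7/20), while the informative prior pulls the
posterior mode toward smaller values.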

cheers,

Dan


On Thu, Nov 15, 2012 at 1:51 PM, Ashley Nicholas <Nicholasa at ukzn.ac.za> wrote:

>  Thanks for the insight Dan,
>
>
>
> Sadly I do know of some labs that still routinely use phenetics,
> particularly Nearest Neighbor. When I queried this, I was told by a
> colleague that UPGMA, although the preferred method, took too much
> computation time!!!! Which I found disturbing given that science is all
> about minimizing and not maximizing the uncertainty of results. I must say
> though that these labs also routinely run Maximum Likelihood and Bayesian
> Analyses using the same gene sequences. I guess they do this to assess if
> there is any congruence in the results of all three analytical methods?
>
>
>
> Regards
>
> Ashley
>
> ---------------------------------------------------
> Ashley Nicholas (PhD)
> Associate Professor & Curator Ward Herbarium
> School of Life Science,  Westville Campus
>
> University of KwaZulu-Natal,
>
> Private Bag X54001,
> Durban, 4000, South Africa
> Tel.:+27-31-260 7719 Fax.: +27-31-260 2029
>
>
> http://lifesciences.ukzn.ac.za/Staff/Biodiversity/biodiv_evo_staff/Durban/nicholasa.aspx
> nicholasa at ukzn.ac.za
>
> ----------------------------------------------------
>
> Empirical scientists do not deal with the truth, we deal with hypotheses.
> At their best these hypotheses are insightful and predictive, however,
> nonetheless experience has shown that they are often only a poor
> approximation of reality and therefore the truth. – Ashley Nicholas
>
> —-------------------------------------------------------------------
>
>
>
> *From:* daniel.lahr at gmail.com [mailto:daniel.lahr at gmail.com] *On Behalf
> Of *Dan Lahr
> *Sent:* 14 November 2012 19:46
> *To:* Ashley Nicholas
> *Cc:* taxacom at mailman.nhm.ku.edu
>
> *Subject:* Re: [Taxacom] FW: cladistic analysis for morphological
> characters -- UPGMA is not cladistics
>
>
>
> Hi Ashley,
>
>
>
> "Analogous (rather than homologous) base pair sequences are probably less
> common than in morphology -- so maybe molecular systematists can get away
> with approximating it to an evolutionary tree."
>
>
>
> It is most certainly not.  There are only 4 options for a site (ATCG), 5
> if you count indels, and convergent hits are highly likely.  Molecular
> systematists or anyone that cares about molecular evolution cannot get away
> with that, and that is why probabilistic methods are used instead of
> similarity methods.  These assume a model of evolution to infer how many
> substitutions actually took place in a specific site, even though what you
> may see in modern terminals are the same base pairs for a given site.  One
> of the trickiest parts is to tease apart homology from convergence in
> molecular sequences, and a lot of money and intellect is poured into that
> very issue.
>
>
>
> I am not following this thread, so not sure if this is what you meant, in
> which case I apologize. I just wanted to clarify that molecular
> systematists do NOT rely on similarity methods of historical
> reconstruction.  If you do see such a case in a paper newer than the 1990s,
> then it is a glaring omission from the reviewers and editor.
>
>
>
> Kind regards,
>
>
>
> Dan
>
>
>
> On Tue, Nov 13, 2012 at 10:01 AM, Ashley Nicholas <Nicholasa at ukzn.ac.za>
> wrote:
>
> John you are right,
>
> UPGMA is a phenetics method and is not explicitly evolutionary. It only
> measures similarity, and similarity is not always a good indicator of
> descent from a common ancestor. This is especially true in flowering plants
> where convergent evolution/homoplasy is rife.
>
> Analogous (rather than homologous) base pair sequences are probably less
> common than in morphology -- so maybe molecular systematists can get away
> with approximating it to an evolutionary tree. However, in the end it is
> not an explicit evolutionary tree -- and this needs to be acknowledged
> rather than ignored (which is what usually happens). However, no matter
> what, the resulting phenogram is a hypothesis. This hypothesis is as valid
> as any other hypothesis (until falsified) -- and probably carries some
> interesting insights and may generate some interesting questions for
> further explorations.
>
> The textbooks say a minimum of 60 characters is needed but I would think
> the number of characters needed would depend on the size of the group being
> analysed. Some statistician has probably established this??
>
> Regards
> Ashley
>
> -----Original Message-----
> From: taxacom-bounces at mailman.nhm.ku.edu [mailto:
> taxacom-bounces at mailman.nhm.ku.edu] On Behalf Of John Grehan
> Sent: 10 November 2012 17:08
> To: Sami Rabei
> Cc: TAXACOM
> Subject: Re: [Taxacom] cladistic analysis for morphological characters
>
> In my opinion it's OK to make a cladistic analysis for any number of
> characters. It just depends where those characters are clustered within the
> group analyzed as to the result. I suspect that unless the characters are
> dispersed throughout the 44 species, there will be some clades that have
> some measure of good support, others that do not, and others whose
> relationships are unresolved.
>
> I'm a bit out of touch with all methods, but I recall UPGMA is a phenetic
> method?
>
> John Grehan
>
> On Sat, Nov 10, 2012 at 6:45 AM, Sami Rabei <samirabei at mans.edu.eg> wrote:
>
> > Dear All
> >
> > I have 81 morphological characters for 44 species. Is it right to make
> > a cladistic analysis for them? If it is OK, which program can I use?
> > On the other hand, I did UPGMA.
> >
> > Many Thanks in advance
> >
> > All the best.
> >
> > Sami Rabei
> >
> > http://mansoura.academia.edu/SamiRabei
> >
> > ----------------------------------
> > With my Best Wishes
> > Sami Hussein Rabei, Ph.D.
> > Botany Department
> > Faculty of Science,
> > Damietta University
> > New Damietta , Post Box 34517
> > Damietta
> > Egypt .
> >
> > Tel. Mobile:   002 0127 3601618
> > Tel. Work:     002 057 2403981
> > Tel. Home:    002 057 2403108
> > Fax:              002 057 2403868
> >
> > _______________________________________________
> >
> > Taxacom Mailing List
> > Taxacom at mailman.nhm.ku.edu
> > http://mailman.nhm.ku.edu/mailman/listinfo/taxacom
> >
> > The Taxacom archive going back to 1992 may be searched with either of
> > these methods:
> >
> > (1) by visiting http://taxacom.markmail.org
> >
> > (2) a Google search specified as:  site:
> > mailman.nhm.ku.edu/pipermail/taxacom  your search terms here
> >
> ======= Please find our Email Disclaimer here-->:
> http://www.ukzn.ac.za/disclaimer =======
>
>
>
>
>
>
> --
> ___________________
> Daniel J. G. Lahr, PhD
> Dept of Zoology, Univ. of Sao Paulo, Brazil
>



-- 
___________________
Daniel J. G. Lahr, PhD
Dept of Zoology, Univ. of Sao Paulo, Brazil


