[Taxacom] Markov chain Monte Carlo

Peter Hovenkamp phovenkamp at casema.nl
Thu Apr 19 15:56:51 CDT 2012


  Not correct.
If it appears in 95 % of the sample, that does not mean it appears in 
95 % of the unsampled population. That would be representative sampling, 
which could be done much more easily and quickly than MCMC sampling.

MCMC sampling is different: it samples not according to frequencies but 
according to optimality. If there were only one tree (or item, or 
whatever) with an optimality of 0.95, it would be sampled 95 times out 
of 100, no matter how many other trees (or items, or whatever) are 
present. Of course, the other trees would all have very low 
optimalities.
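
A toy sketch of the point above, assuming "optimality" means posterior 
probability (as in Carl's description quoted below) and using a plain 
Metropolis sampler with a uniform proposal. The function name, the 
weights and the step count are made up for illustration; this is not 
how any particular phylogenetics program implements it.

```python
import random


def metropolis_share_of_best(n_others, n_steps=200_000, seed=1):
    """Share of MCMC samples that land on the one high-probability item.

    Item 0 has posterior 0.95; the remaining 0.05 is split evenly over
    n_others items (hypothetical numbers chosen for this illustration).
    """
    rng = random.Random(seed)
    weights = [0.95] + [0.05 / n_others] * n_others
    n = len(weights)
    current = rng.randrange(n)
    hits_on_best = 0
    for _ in range(n_steps):
        # Propose any item uniformly at random (a symmetric proposal).
        proposal = rng.randrange(n)
        # Accept with probability min(1, ratio of the two weights).
        if rng.random() < weights[proposal] / weights[current]:
            current = proposal
        if current == 0:
            hits_on_best += 1
    return hits_on_best / n_steps


# The share stays near 0.95 however many low-probability items there are.
shares = {n: metropolis_share_of_best(n) for n in (10, 100)}
for n_others, share in shares.items():
    print(f"{n_others:4d} other items: share on best item = {share:.3f}")
```

Only the ratio of the two weights enters the acceptance step, so the 
sampler never needs to count how many items exist.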

Mindbending.

Peter Hovenkamp


On 19-4-2012 17:29, Richard Zander wrote:
> Carl:
> I much appreciate the clear short response. I was wrong to use "tree."
> And also "clade" is wrong in place of "tree" since one is not looking
> for the "best" clade but the one that appears most often.
>
> Yes, each tree must be improbable (multiply the posterior probabilities
> of each branch to get the probability of the entire tree).
>
> So when the posterior is given as 0.95, it means that branch
> configuration (clade) appears 0.95 of the time in the sample, which if
> large enough implies that it also appears 0.95 of the time in the
> unsampled population.
>
> Dang.
>
> Many thanks,
> Richard
>
> * * * * * * * * * * * *
> Richard H. Zander
> Missouri Botanical Garden, PO Box 299, St. Louis, MO 63166-0299 USA
> Web sites: http://www.mobot.org/plantscience/resbot/ and
> http://www.mobot.org/plantscience/bfna/bfnamenu.htm
> Modern Evolutionary Systematics Web site:
> http://www.mobot.org/plantscience/resbot/21EvSy.htm
> UPS and FedExpr -  MBG, 4344 Shaw Blvd, St. Louis 63110 USA
>
>
> -----Original Message-----
> From: Carl Rothfels [mailto:crothfels at yahoo.ca]
> Sent: Wednesday, April 18, 2012 5:45 PM
> To: Richard Zander
> Subject: Re: [Taxacom] Markov chain Monte Carlo
>
> Very short superficial response -- your initial description of MCMC is
> correct -- it attempts to sample from the posterior distribution in
> proportion to its probability density. But you're going astray when you
> start thinking about there being one tree that has a .95 posterior -- in
> any dataset with more than a handful of taxa any single tree will have
> an extremely low posterior (which may be what you're trying to say). In
> fact, with many taxa, the chances of sampling any tree even twice are
> low, even if you take very many samples from the posterior. So that's
> why folks concentrate on elements of the posterior, rather than on
> complete trees -- in what proportion of the sample is a given split in
> the tree supported (the posterior support for that branch)? What's the
> average length of this branch in the posterior? etc. The presence of a
> huge number of unique and extremely unlikely trees doesn't affect these
> measures (not in a way that is not already captured by the MCMC,
> provided the MCMC has sampled the posterior well, as it should if the
> researchers did a good job).
>
> carl rothfels