Robert K. Colwell
COLWELL at UCONNVM.UCONN.EDU
Thu Jun 22 10:55:10 CDT 1995
On the usefulness of ranges for metric characters (see earlier discussion
below):
Although ranges for metric characters may be preferable to averages,
they have their own perils. For any statistical distribution (even
the uniform distribution), range is inexorably correlated with sample
size, of course, so that a range with no indication of sample size is
not much more useful than an average. I wonder how many junior synonyms
have their origin in a specimen whose measurements lay outside the
range given in a published description that was based on a few
specimens...
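To make the dependence on sample size concrete, here is a small sketch
for the uniform case mentioned above. It relies on the standard result
that the expected range of n independent draws from Uniform(0, 1) is
(n - 1)/(n + 1). The function name is my own, for illustration only.

```python
def expected_uniform_range(n):
    """Expected (max - min) of n independent draws from Uniform(0, 1).

    Uses E[max] = n/(n + 1) and E[min] = 1/(n + 1).
    """
    if n < 1:
        raise ValueError("need at least one specimen")
    return (n - 1) / (n + 1)


# The expected range creeps toward the full width of the distribution
# as the sample grows, so a range means little without its sample size.
for n in (2, 5, 10, 50):
    print(n, round(expected_uniform_range(n), 3))
```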
If the range and sample size do appear in a published description or
key, it seems to me one could crudely estimate the standard deviation
using rankits (e.g. Table 27 in Rohlf and Sokal's stat tables). If
only the range appears, you get the estimate by looking up the expected
deviation, in SD units, for the most extreme values (the published range
values), given the sample size. For example, if the range was published
as "femur 1.4-1.6 mm," for a sample of 10 specimens, then each range
limit (as an expectation)
lies 1.539 SD from the mean. Estimating the mean as the average of the
range limits (1.5 mm in this example), this means that 0.1 mm equals
about 1.539 SD units, yielding a crude estimate of the sample SD of
0.1/1.539 = 0.065 mm. If the sample mean is published as well, you
can get two estimates of sample SD by using the two deviations:
abs(range limit minus sample mean), each divided by the same rankit.
(Warning! This untried method has not been approved by a card-carrying
statistician, which I am not. I suggest it for discussion.)
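For anyone without the tables at hand, the calculation above might be
sketched as follows. This is only an illustration, under the assumption
that the character is normally distributed (which is what rankits
presuppose). Instead of a table lookup, it gets the extreme rankit by
numerically integrating the density of the largest of n standard normal
values. All function names here are mine, not from any package.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, SD 1


def expected_max_rankit(n, half_width=8.0, steps=16000):
    """Expected value of the largest of n standard normal draws.

    Trapezoid-rule integral of x * n * pdf(x) * cdf(x)**(n - 1)
    over [-half_width, half_width].
    """
    if n < 2:
        raise ValueError("a range needs at least two specimens")
    h = 2 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        density = n * STD_NORMAL.pdf(x) * STD_NORMAL.cdf(x) ** (n - 1)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x * density
    return total * h


def sd_from_range(low, high, n):
    """Crude SD estimate from a published range and sample size."""
    half_range = (high - low) / 2
    return half_range / expected_max_rankit(n)


def sd_from_range_and_mean(low, high, mean, n):
    """Two crude SD estimates, one from each range limit."""
    rankit = expected_max_rankit(n)
    return (mean - low) / rankit, (high - mean) / rankit


# The femur example: range 1.4-1.6 mm, 10 specimens.
print(round(expected_max_rankit(10), 3))
print(round(sd_from_range(1.4, 1.6, 10), 3))
```

For the femur example this should agree with the hand calculation to
within rounding. If a published mean sits off-center within the range,
the two estimates from sd_from_range_and_mean will differ, which is
itself a hint of skew or of a small sample.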
With an estimate of the SD of the character in hand, and a sample of
specimens that you wish to identify, one can then see just how likely
it is that the new specimens fall within the expected distribution
for the character, or judge just how useful size contrasts are for
distinguishing two species in a key.
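One rough way to put a number on "how likely" is a two-sided normal
tail probability, sketched below. It assumes a normal distribution and
treats the crude mean (1.5 mm) and SD (0.065 mm) from the femur example
as if they were known exactly, which overstates the confidence one
should have. The new specimen measurements (1.75 mm and 1.55 mm) are
hypothetical, as is the function name.

```python
from statistics import NormalDist


def tail_probability(value, mean, sd):
    """Return (z, p): the standardized deviation of value and the
    two-sided probability of a deviation at least that large under
    a normal distribution with the given mean and SD."""
    z = (value - mean) / sd
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical new specimens compared against the femur example.
for femur_mm in (1.75, 1.55):
    z, p = tail_probability(femur_mm, mean=1.5, sd=0.065)
    print(femur_mm, round(z, 2), p)
```

A very small p suggests the specimen sits poorly within the described
distribution, though with an SD estimated this crudely it is better
read as a flag for a closer look than as a test.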
But, Gee! Wouldn't it be nicer if authors of descriptions and keys
just published the actual sample SD in the first place (along with
sample size and range)?
--Rob Colwell
