Warren Lamboy warren_lamboy at QMRELAY.MAIL.CORNELL.EDU
Mon Jun 12 11:39:42 CDT 1995

OFFICE MEMO
Subject: Confidence
Date: 11/06/1995   Time: 11:12

In my opinion, a good way to quantify one's confidence in a determination is
by comparison to known standards, that is, by statistical comparison to
specimens whose identities are known.  There are a number of ways to do this;
one with which I am familiar is disjoint principal components analysis
(Systematic Botany 1990, Vol. 15: 3-12), which was first developed by
Dr. Svante Wold of the University of Umea, Umea, Sweden.

Briefly, one creates a separate and distinct mathematical model for each of
the taxa to which an unknown specimen might possibly belong.  Then one fits
the unknown specimen to each model in turn.  In the best of circumstances
(from the point of view of making a determination), the specimen fits only one
of the models and is significantly different from all the others.  Sometimes
a specimen will fit more than one model, which most commonly suggests either
that a) the specimen is a hybrid between two or more of the taxa in the
study, or b) the models describing the known taxa do not separate the taxa
sufficiently, i.e., the models for two or more of the known taxa overlap too
much.  [The latter situation can be corrected by
collecting data on additional characters and known specimens so that models
can be constructed that separate the taxa adequately.]  Finally, the specimen
may not fit any of the models!  In that case, the specimen may again be a
hybrid, causing it to fall into a region in the principal components space
that is not part of any model, or it may be an extreme or unusual form of one
of the known taxa, or it may be something entirely new.  In any case, disjoint
principal components analysis enables the worker to assign statistical
significance levels to a determination.  It also quantifies those instances
where the data collected or observed are inadequate for accurate determination.
This is particularly important when one is asked to identify incomplete or
improperly preserved materials.  The paper cited above presents a simple
example of the use of the method.  A more extensive application of the method
may be found in a paper in Castanea 1992, Vol. 57: 52-65.
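
For readers who would like to see the "one model per taxon, fit the unknown
to each in turn" idea in concrete form, here is a minimal Python sketch.  It
is not the procedure from the papers cited above, only an illustration under
several assumptions of mine: each taxon model has a single principal
component (found by power iteration), the residual variance of the unknown is
compared with the pooled residual variance of the known specimens, and a
fixed cutoff of 4.0 stands in for a proper F critical value.  The sketch also
does not bound the scores along each component, so it cannot flag a specimen
that lies on a model's axis but far outside the range of the known specimens.
The measurements, taxon names, and function names are all hypothetical.

```python
import math

def mean_vector(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def first_component(centered, iterations=200):
    """Leading principal axis by power iteration on X^T X."""
    p = len(centered[0])
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        scores = [sum(a * b for a, b in zip(row, v)) for row in centered]
        w = [sum(s * row[j] for s, row in zip(scores, centered))
             for j in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    return v

def residual_ss(x, mean, axis):
    """Squared distance from x to the one-component model line."""
    d = [a - m for a, m in zip(x, mean)]
    t = sum(a * b for a, b in zip(d, axis))  # score along the axis
    return sum((a - t * b) ** 2 for a, b in zip(d, axis))

def fit_model(rows):
    """Fit a separate (disjoint) one-component model to one taxon."""
    mean = mean_vector(rows)
    centered = [[a - m for a, m in zip(row, mean)] for row in rows]
    axis = first_component(centered)
    n, p = len(rows), len(mean)
    total = sum(residual_ss(r, mean, axis) for r in rows)
    # Pooled residual variance, with (n - A - 1)(p - A) degrees of
    # freedom for A = 1 component (needs n >= 3, p >= 2, noisy data).
    s0_sq = total / ((n - 2) * (p - 1))
    return {"mean": mean, "axis": axis, "s0_sq": s0_sq, "p": p}

def variance_ratio(x, model):
    """F-like ratio: residual variance of x over that of the taxon."""
    s_sq = residual_ss(x, model["mean"], model["axis"]) / (model["p"] - 1)
    return s_sq / model["s0_sq"]

def classify(x, models, cutoff=4.0):
    """Names of all taxon models that x fits (assumed fixed cutoff)."""
    return sorted(name for name, m in models.items()
                  if variance_ratio(x, m) <= cutoff)

# Hypothetical measurements: three characters per known specimen.
known = {
    "taxon_A": [[1.0, 1.1, 0.0], [2.0, 1.9, 0.1], [3.0, 3.1, -0.1],
                [4.0, 3.9, 0.0], [5.0, 5.1, 0.1], [6.0, 5.9, -0.1]],
    "taxon_B": [[10.0, 1.0, 3.1], [10.1, 2.0, 3.9], [9.9, 3.0, 5.1],
                [10.0, 4.0, 5.9], [10.1, 5.0, 7.1], [9.9, 6.0, 7.9]],
}
models = {name: fit_model(rows) for name, rows in known.items()}

for unknown in ([3.5, 3.5, 0.05], [10.0, 3.5, 5.5], [50.0, -20.0, 30.0]):
    fits = classify(unknown, models)
    print(unknown, "fits:", fits if fits else "no model")
```

A result naming one taxon corresponds to the "best of circumstances" case
above, a result naming several corresponds to the hybrid or overlapping-model
case, and an empty result corresponds to a specimen that fits no model.  In
real use the cutoff would come from the F distribution at a chosen
significance level, and the number of components per model would be chosen
for each taxon rather than fixed at one.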
