Sat Jun 10 12:00:26 CDT 1995

        This is considerably off the requested topic, but is something we
        have been struggling with for some time, and hope someone has
        some ideas.  Our issue is not how to determine confidence, but
        how to maximize it.

        We are using a parataxonomist model to identify and record
        beetles from samples that will be used to quantify environmental
        impacts of various perturbations.  Since the parataxonomist is
        only dealing with morphospecies from a local fauna, this has
        proven practical and cost-effective.  Each individual must be
        assigned to a species and recorded.  In this model, a PhD
        systematist is available to the parataxonomist as a resource, and
        provides checks and direction to other resources as the need
        arises.  We are working with over 1,000 species, of which we have
        quantitative data on around 800.  Misidentifications are not
        really that much of a problem in some ways, because in our checks
        of data, they simply lower the predictive value of the species
        they are misplaced into, i.e. only correctly associated
        (determined to be the same) species show up as correlates.

        Here, however, is the problem.  When a species is first seen as
        new, the systematist is consulted.  All members of this species
        are then mounted until a predetermined number are seen, and then
        are checked by the systematist.  If they are all correct, the
        parataxonomist is assumed to know the species (if not, they are
        divided and the process repeated).  After that, there comes a
        sliding scale of confidence by the parataxonomist.  At first, she
        will check every specimen under the scope with a known.  Later,
        she will check the specimen through the glass with the knowns,
        and eventually, she decides that she knows the species on sight.
        The degree of uncertainty that leads to each level of decision is
        weighed against the cost of turning around, pulling a drawer,
        locating the unit tray, getting the specimen under the scope, and
        putting everything away.  We believe that the majority of errors
        occur at the "edge of uncertainty" in this evaluation, i.e. the
        point where she is "pretty sure" and it just doesn't seem worth
        the trouble to check.

        This type of learning and internal cost-benefit analysis is
        probably hard-wired in our brains, and in the main, a good thing.
        Without it, we would never get through the material in any
        reasonable amount of time.  Therefore, we are trying to come up
        with a way of using it while minimizing the "edge of uncertainty"
        type of error.  It is important, because as the level of error
        goes down, the number of species useful as indicators goes up
        (again, we don't find a problem of false positives).  Our
        approach is to try a computer-assisted system of data input, in
        which she must "pass by" images and text about the species in
        order to record the data, so that the "cost" of checking is
        significantly lowered.
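
        One way to picture the trade-off above is as a simple
        expected-cost rule: checking is worthwhile when the chance of
        being wrong, multiplied by the cost of an error, exceeds the cost
        of checking.  The sketch below is only an illustration under
        that assumption.  The function names and the cost figures are
        hypothetical and are not measurements from our project.

```python
def should_check(p_wrong: float, check_cost: float, error_cost: float) -> bool:
    """True when the expected cost of a misidentification exceeds the cost of checking."""
    return p_wrong * error_cost > check_cost


def check_threshold(check_cost: float, error_cost: float) -> float:
    """Level of doubt (probability of being wrong) above which checking pays off."""
    return min(1.0, check_cost / error_cost)


# Hypothetical costs in arbitrary units (assumed for illustration only):
# pulling a drawer and unit tray versus glancing at an on-screen image.
ERROR_COST = 20.0
DRAWER_COST = 5.0
SCREEN_COST = 1.0

drawer_threshold = check_threshold(DRAWER_COST, ERROR_COST)
screen_threshold = check_threshold(SCREEN_COST, ERROR_COST)

# A "pretty sure" specimen: an assumed 10% chance of being wrong.
p_wrong = 0.10
checks_with_drawer = should_check(p_wrong, DRAWER_COST, ERROR_COST)
checks_with_screen = should_check(p_wrong, SCREEN_COST, ERROR_COST)
```

        With these made-up numbers, a specimen about which she is
        "pretty sure" goes unchecked when checking means pulling a
        drawer, but is checked when the comparison is already on the
        screen.  Lowering the cost of checking moves the "edge of
        uncertainty" toward cases where she is much more certain.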

        We would be very interested in hearing if anyone else has dealt
        with this issue, and any thoughts on this approach or alternatives.

        Mike and Donna Ivie
        Department of Entomology
        Montana State University
        Bozeman, MT 59717
