confidence
MICHAEL A. IVIE
ueymi at MSU.OSCS.MONTANA.EDU
Sat Jun 10 12:00:26 CDT 1995
This is considerably off the requested topic, but is something we
have been struggling with for some time, and hope someone has
some ideas. Our issue is not how to determine confidence, but
how to maximize it.
We are using a parataxonomist model to identify and record
beetles from samples that will be used to quantify environmental
impacts of various perturbations. Since the parataxonomist is
only dealing with morphospecies from a local fauna, this has
proven practical and cost effective. Each individual must be
assigned to a species and recorded. In this model, a PhD
systematist is available to the parataxonomist as a resource, and
provides checks and direction to other resources as the need
arises. We are working with over 1,000 species, of which we have
quantitative data on around 800. Misidentifications are not
really that much of a problem in some ways, because in our checks
of data, they simply lower the predictive value of the species
they are misplaced into, i.e. only correctly associated
(determined to be the same) species show up as correlates.
Here, however, is the problem. When a species is first seen as
new, the systematist is consulted. All members of this species
are then mounted until a predetermined number are seen, and then
are checked by the systematist. If they are all correct, the
parataxonomist is assumed to know the species (if not, they are
divided and the process repeated). After that, there comes a
sliding scale of confidence by the parataxonomist. At first, she
will check every specimen under the scope against a known. Later,
she will check the specimen through the glass with the knowns,
and eventually, she decides that she knows the species on sight.
The degree of uncertainty that leads to each level of decision is
weighed against the cost of turning around, pulling a drawer,
locating the unit tray, getting the specimen under the scope, and
putting everything away. We believe that the majority of errors
occur at the "edge of uncertainty" in this evaluation, i.e. the
point where she is "pretty sure" and it just doesn't seem worth
the trouble to check.
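
As a rough illustration of the trade-off described above, the decision
to check can be modeled as an expected-cost comparison. This is only a
sketch under stated assumptions: the function names, the cost units, and
the idea that a number can be put on her uncertainty are all
hypothetical, and none of it is part of the protocol described here.

```python
# Hypothetical model: check a specimen against a known only when the
# expected cost of a misidentification exceeds the cost of checking.
# All names and numbers are illustrative assumptions, not measured values.


def check_threshold(cost_check: float, cost_error: float) -> float:
    """Probability of error above which checking is worth the trouble."""
    if cost_check < 0 or cost_error <= 0:
        raise ValueError("cost_check must be >= 0 and cost_error must be > 0")
    return cost_check / cost_error


def should_check(p_wrong: float, cost_check: float, cost_error: float) -> bool:
    """True when p_wrong * cost_error is greater than cost_check."""
    if not 0.0 <= p_wrong <= 1.0:
        raise ValueError("p_wrong must be a probability between 0 and 1")
    return p_wrong * cost_error > cost_check


if __name__ == "__main__":
    # "Pretty sure" (5% doubt) with an expensive check: not worth checking.
    print(should_check(p_wrong=0.05, cost_check=1.0, cost_error=10.0))
    # Same doubt with a much cheaper check: now worth checking.
    print(should_check(p_wrong=0.05, cost_check=0.2, cost_error=10.0))
```

Under this model, the "edge of uncertainty" is the region just below
the threshold, and lowering the cost of a check lowers the threshold in
proportion, so that more of the "pretty sure" cases get checked.
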
This type of learning and internal cost-benefit analysis is
probably hard-wired in our brains, and in the main, a good thing.
Without it, we would never get through the material in any
reasonable amount of time. Therefore, we are trying to come up
with a way of using it while minimizing the "edge of uncertainty"
type of error. It is important, because as the level of error
goes down, the number of species useful as indicators goes up
(again, we don't find a problem of false positives). Our
approach is to try a computer-assisted system of data input, in
which she must "pass by" images and text about the species in
order to record the data, so that the "cost" of checking is
significantly lowered.
We would be very interested in hearing if anyone else has dealt
with this issue, and any thoughts on this approach or alternative
views.
Mike and Donna Ivie
Department of Entomology
Montana State University
Bozeman, MT 59717