names vs. "names" (was: Names for BioDiv Informatics)

Richard Pyle deepreef at BISHOPMUSEUM.ORG
Wed Feb 9 12:35:16 CST 2005

> > The reason most people feel that the overall benefits of the internet and
> > the correct information that it gives us access to outweigh the costs of
> > the bogus information, is that most people have little difficulty
> > separating the two.
> ***
> If only that were true.

If only what were true?  That most people have little difficulty separating
good web info from bad, or that most people feel that the overall benefits
of the internet and the correct information that it gives us access to
outweigh the costs of the bogus information?

Do you feel that the benefits of the internet outweigh the costs?

> ***
> If only that were true. My experience (at least so far) is that usually
> (exceptions excepted) it is very difficult to get a database compiler to
> make any correction, or to attach information.
> * * *

It's certainly not true yet, but none of this stuff really exists yet.  What
the uBio folks (and the TCS folks and the LinneanCore folks and the GBIF
folks and the TDWG folks) are trying to do is develop the system that serves
the functions that I, and David, and Mike and Martin, and others have been
trying to articulate.  The purpose of these emails is to discuss a vision,
and discuss how to implement that vision.  If it already existed the way we
wanted it to, there would be less need to write long emails to Taxacom about
it.  But it doesn't exist yet, and the database compilers you alluded to are
among the strongest advocates of creating it so that it does exist.  It
can't suddenly exist spontaneously.  It has to be created -- painstakingly.

There are two realms:  the infrastructure realm, and the content realm.  The
infrastructure consists of the data standards and mechanisms of information
exchange, integration, and presentation, so that one need not be a computer
database whiz to be able to surgically extract the information of interest
from the content realm.  The content realm currently exists as a patchwork
of databases, each developed independently of the others, each with greater
or lesser amounts of validation, verification, scrutiny, and focused
purpose.  The infrastructure and the content need to be developed
concurrently, and optimally would be developed in specific response to user
needs.  Forums like Taxacom are ideal for discussing and debating the nature
of the taxonomic community's specific needs.

> > The goal of organism name indexing is NOT simply to accumulate the largest
> > collection of names.  The goal is to stop the perpetuation and propagation
> > of errors.  Without indexing the bogus names and identifying them as such,
> > the future world is at risk of more perpetuation and propagation of such.
> ***
> Yes, but the second step (identifying bogus names) is by far the most work
> and tends not to get done.
> * * *

Speaking as a taxonomist, I couldn't agree with you more!!!  But speaking as
a database developer, I know very well the power of electronic technology
and the internet for making much of the work of taxonomists a LOT less
time-consuming.  When I began my PhD research on the fish family
Pomacanthidae, months (years?) were spent trying to locate original
descriptions of old names, trying to figure out what each publication meant
by each use of each name, locating type specimens, etc.  Now, with the
integrated database I developed on my laptop, I can get all of this
information -- including original descriptions as PDF files, images of type
specimens, complete lists of usages of each name over time (etc., etc.) in
seconds (not months or years).  Yes, it took a lot of work to get it all in
there, but more than half of that work (a lot more than half) was on
infrastructure development.

One of the wonderful things about electronic information technology is that
once one person puts all that effort into developing the infrastructure,
it's available for everyone else to use (assuming an open-source paradigm).
Once one taxonomist identifies the original description of a taxon name, and
creates a complete bibliographic index of all subsequent uses of that name,
no one else ever has to repeat the same work.  Bill Eschmeyer devoted several
decades of his professional life to creating the electronic Catalog of
Fishes, and the ichthyological world is a MUCH, MUCH better place because of
it.

O.K., I'm preaching now; so I should shut up.

> > It is our job as taxonomists to assist the non-taxonomic world in accessing
> > information about organisms by providing them with (mostly) unique
> > identifiers (names), so that they can communicate with greater efficiency.
> > I believe that the next step in fulfilling this job is to sort out the
> > wheat from the chaff in nomenclature, and do it in a way that any given
> > correction needs to be made only ONCE, and thenceforth forever known by all
> > future users of biological names.
> ***
> Yes, quite. I would like to see that happen.
> * * *

Then, as is the case with many discussions on Taxacom, we ultimately agree
on the fundamental issues -- we're just separated by a common language.

Again, I apologise to you (and the list) for my abrasive and preachy tone
today.  I'm still trying to wipe the sleep from my eyes, and ballet practice
is only two hours away (my daughter, not me -- I am just the taxi driver),
so I should return to my work of...errr...compiling taxonomic databases...


Richard L. Pyle, PhD
Natural Sciences Database Coordinator, Bishop Museum
1525 Bernice St., Honolulu, HI 96817
Ph: (808)848-4115, Fax: (808)847-8252
email: deepreef at

