[Taxacom] BioNames and others names

Richard Pyle deepreef at bishopmuseum.org
Sun Jun 2 13:28:35 CDT 2013


> Given that what most databases (the 'aggregators') appear to be doing is
> beyond my comprehension, this is unsurprising. This is caused by the
> 'database mentality', and does not mean that there is a genuine problem
...

Part of it is "database mentality"; part of it is how the ICNafp view
differs from the ICZN view; and part of it is genuinely different data
needs.  The real problem is that the term "taxon name" is legitimately used
in different ways in different contexts.  For example, if your goal is to
discover appearances of taxonomic names within, say, large volumes of
scanned/OCR'd literature, then what you need is a comprehensive index of
every text string that has ever been purported to represent the scientific
name of an organism.  Examples are uBio and GNI -- where the count is now up
to something like 22M unique text strings.  It's very natural to think of
these different unique text strings as "taxon names".  At the other extreme
is GNUB, which tracks unique "name objects".  In that context, each unique
"name" is a protonym, and was created in a single usage instance.  Binomens
and trinomens are just different collections of these names, used to
represent a terminal epithet in a classification.
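
A minimal sketch of the "name object" end of that spectrum, in Python. This
is not GNUB's actual schema: the class names, fields, and identifiers are
hypothetical, and "Aus", "bus", and "cus" are placeholder names. Each
protonym is created once, and binomens and trinomens are just ordered
collections that point back to the same protonym objects.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Protonym:
    """One name object, established in a single usage instance (hypothetical model)."""
    id: int
    epithet: str


@dataclass(frozen=True)
class Combination:
    """A binomen or trinomen: an ordered collection of existing protonyms."""
    parts: Tuple[Protonym, ...]

    def render(self) -> str:
        # The displayed text string is derived from the protonyms, not stored as a name.
        return " ".join(p.epithet for p in self.parts)


# Placeholder names; the ids are arbitrary.
genus = Protonym(1, "Aus")
species = Protonym(2, "bus")
subspecies = Protonym(3, "cus")

binomen = Combination((genus, species))
trinomen = Combination((genus, species, subspecies))

print(binomen.render())
print(trinomen.render())
```

In this model the binomen and the trinomen add no new "names" at all; a
string index such as the one described above would instead count each
rendered string as its own entry.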

I made a very strong push a number of years ago, when LinneanCore/TCS was
being developed, to forbid the unqualified use of "name" or "taxon name", and
to always require a qualification expressing what one means when one refers
to a "name".  We even started a glossary, which eventually ended up here:
http://wiki.tdwg.org/twiki/bin/view/UBIF/LinneanCoreDefinitions 

But, alas, the confusion and ambiguity continue today.

> ***
> Well, when it comes to "scientific names of taxa", this comes to eight
> (with perhaps no more than three types being involved). If you want to
> talk about 'names' you can make it as complicated as you like.

Yes, exactly.  And because different people (both taxonomists and database
developers) have different perceptions of what a "name" is -- some guided by
the Codes, some guided by the specific needs of the informatics project,
etc. -- we end up with a plethora of units all described as "taxon name".
It's very easy to blame this on "database mentality".  However, I think it's
a bit more subtle than that.  Before computer databases, there was no need
for a precise definition of what a "taxon name" is.  All the communication
was human to human, and as in all conversations between humans, a
reasonable understanding can emerge even though the parties have slightly
different interpretations of the words.  In other words, sloppy terminology
does not impede communication very much.

However, computers and computer databases deal with much higher precision.
This can be both a strength and a weakness, but usually it's a strength.  In
any case, computers need a more precise definition of what a "name" is,
because that determines whether you create a new record (with a new
identifier) for something, vs. add a new property to an existing record.

For example, in one context, "Pseudanthias ventralis", "P. ventralis", and
"Pseudanthias ventralus" all need to be tracked separately, so all three
would get a distinct database row.  In another context, all of these
represent lexical variants of the "same" name, so there would be one record
in the database, with a structure that tracks alternate spellings of the
"same" name.  You can't blame the database for this, because there are
different use-cases with different informatics needs, and these
differences are legitimate.  A different perspective might see two names
("Pseudanthias ventralis" and "Pseudanthias ventralus"), where the
abbreviated genus does not need to be tracked separately.  Yet another
perspective might also see two names, but as protonyms ("Pseudanthias" and
"ventralis").
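
The four ways of counting described above can be sketched in Python. This is
only an illustration: the function names and the two lookup tables are
hypothetical, and a real system would need curated data to decide which
strings are abbreviations or spelling variants of which names.

```python
# The three text strings from the example above.
STRINGS = ["Pseudanthias ventralis", "P. ventralis", "Pseudanthias ventralus"]

# Hypothetical lookup tables, hard-coded for this example only.
GENUS_ABBREVIATIONS = {"P.": "Pseudanthias"}
SPELLING_VARIANTS = {"ventralus": "ventralis"}


def expand_genus(text):
    """Replace an abbreviated genus with its full spelling."""
    genus, epithet = text.split()
    return f"{GENUS_ABBREVIATIONS.get(genus, genus)} {epithet}"


def normalize_spelling(text):
    """Map a variant spelling of the epithet to the accepted spelling."""
    genus, epithet = text.split()
    return f"{genus} {SPELLING_VARIANTS.get(epithet, epithet)}"


def string_index(strings):
    """Context 1: every distinct text string gets its own record."""
    return set(strings)


def distinct_spellings(strings):
    """Context 3: abbreviations are folded in, but spellings stay separate."""
    return {expand_genus(s) for s in strings}


def lexical_groups(strings):
    """Context 2: all lexical variants collapse into one record."""
    return {normalize_spelling(expand_genus(s)) for s in strings}


def protonyms(strings):
    """Context 4: each genus-group or species-group name is one record."""
    records = set()
    for name in lexical_groups(strings):
        records.update(name.split())
    return records


for view in (string_index, distinct_spellings, lexical_groups, protonyms):
    print(view.__name__, len(view(STRINGS)), sorted(view(STRINGS)))
```

The same three strings yield 3, 2, 1, and 2 records respectively, and none of
these answers is wrong for the use-case it serves.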

> >> But under the ICNafp there is no confusion on these points, so these
> >> are strictly localized sources of confusion.
> >
> > Too bad that botanical names represent such a small fraction of the
> > taxa that are out there....
> 
> ***
> Don't blame me. Not my doing ...

I'm not blaming you for that.  But your attitude seemed to be "Well, the
ICNafp has a clear definition for what a taxon name is, so Rich is a
candidate for 'Most Gross Exaggeration of the Year'".  My counter-point to
that is that ICNafp is just one of several Codes, and the Codes in general
represent only part of the informatics needs for broader
taxonomic/biological communication.  This is why our broader community (not
just botanists who are focused on what a particular Code says) has a problem
of meaning many different things by the term "taxon name". 

Aloha,
Rich




