[Taxacom] Chameleons, GBIF, and the Red List

Stephen Thorpe stephen_thorpe at yahoo.co.nz
Fri Aug 22 18:37:41 CDT 2014

And what happens when an old name is now split into two or more taxa? For example, "cryptic species". What value/status does data have when it is associated with pre-split concepts? A name, Aus bus, could refer either to a species complex before a split, or to one and only one of the cryptic species after the split. What if some modern workers reject the split?


On Sat, 23/8/14, Richard Pyle <deepreef at bishopmuseum.org> wrote:

 Subject: Re: [Taxacom] Chameleons, GBIF, and the Red List
 To: "'TAXACOM'" <taxacom at mailman.nhm.ku.edu>
 Received: Saturday, 23 August, 2014, 5:06 AM
 I have not followed this
 thread closely, but it seems to me that the main problems
 people complain about regarding data harvested by
 aggregators like GBIF fall into two broad categories:
 1) The indicated geographic location is bad
 2) The indicated taxon is bad
 Bad geography comes in two
 basic forms:
 a) The stated geographic place
 is not correct.  This could be due to bad original data or
 bad digitization, but there is generally no way to fix this
 other than fixing it at the source.
 b) The stated geographic place is correct, but
 the associated lat/long coordinates are either missing or
 wrong.  This one could be improved through various
 georeferencing algorithms and tools and/or
 Bad taxonomy also comes in two basic forms:
 a) The organism was misidentified. Again, there is no real way
 to fix this other than to fix it at the source.  Sometimes
 a reasonable inference can be made by a good taxonomist, but
 that always comes with risks.
 b) The name used to represent the organism was
 "correct" in the context in which the organism was
 identified, but the name is not consistent with
 "modern" representations of "accepted"
 taxonomy.  There are many reasons for this, such as
 abbreviated or misspelled names, names that are objectively
 unavailable via the relevant Code (e.g., not validly
 published), names that are now widely regarded as
 heterotypic synonyms of other names, names that are
 classified in a different genus from what modern taxonomists
 follow, and text-strings that are really not representative
 of Linnean-style scientific names at all.
 Of these various categories of
 problems, I suspect it's the last one that represents
 the largest portion of the "mess".  The good news
 is that help is on the way.
 If you've got some time, and have an
 interest in this sort of thing, grab a cup of coffee and
 read on. Otherwise, hit "delete" now.
 Still here?  Cool.
 OK, so one of the prototype
 services Rob Whitton and I developed through NSF funding of
 the Global Names Architecture is a service we call
 "real-time taxonomic translation".  Basically,
 this is a service that "translates" taxon names
 into the "modern" equivalent. The best way to
 demonstrate the power of this service is through a specific
 example.
 When Rob and I are
 wearing our fish-nerd hats instead of our database-nerd
 hats, we are collaborating with colleagues at NOAA to
 develop a comprehensive checklist of the fishes of the
 Northwestern Hawaiian Islands that is
 "evidence-based" (i.e., occurrence-based with
 explicit evidence supporting each occurrence).  When this
 is published later this year (or possibly early next year),
 I think it will represent a very cool model for how all
 regional organism checklists should be done in the future. 
 But for this Taxacom post, I want to focus on just one small
 component of it:  how real-time taxonomic translation
 So, the
 "evidence" behind the occurrences we are using to
 develop this checklist comes from various sources: Museum
 specimens, recorded observations, photos and videos, and, of
 course, historical literature reports.  On the literature
 reports, so far we have captured 2,856 Occurrence records
 based on reports in 24 publications going back 114 years. 
 If we only look at the raw taxon names as they appeared in
 these 24 publications, we get a list of 675 distinct
 scientific names.  Obviously, the prevailing taxonomy has
 changed over these 114 years, so many of those names are not
 consistent with the "modern" interpretation of the
 relevant taxonomy.  It would take many hours of time from
 multiple experts to review all of those 675 names and figure
 out all the corrected spellings, etc.  However, using the
 real-time taxonomic translation service Rob and I developed,
 we can convert these 675 historical names into the 506
 "accepted" names as we would use them today.  And
 it does so in a few seconds (i.e., in "real time").
 A short explanation of how it works is as follows:
 All 2,856 literature-based
 occurrence records are tied to a "Taxon Name
 Usage" (TNU) instance (i.e., the usage of a taxon name
 within a publication). These represent how the original
 publication recorded the name.  For example, what we now
 call Acanthurus triostegus had been variously recorded in
 these literature citations by the following names:
 Acanthurus triostegus (Linnaeus, 1758)
 Hepatus triostegus (Linnaeus, 1758)
 Acanthurus triostegus sandvicensis Streets, 1877
 Hepatus sandvicensis (Streets, 1877)
 Teuthis sandvicensis (Streets, 1877)
 Similarly, what we now call
 Coris flavovittata has been recorded variously as:
 Coris flavovittata (Bennett, 1828)
 Coris lepomis Jenkins, 1901
 Julis eydouxii Valenciennes in Cuvier & Valenciennes, 1839
 Julis flavovittata Bennett, 1828
 ...and so on for all the different names.
 Every TNU is linked to what we call the
 "Protonym" of the name.  This is essentially
 equivalent to the botanical "basionym", but
 essentially represents the original description of the name.
 Taking the second example above, there are three distinct
 Protonyms represented among the four names used for Coris
 flavovittata:
 flavovittata Bennett, 1828
 eydouxii Valenciennes in Cuvier & Valenciennes, 1839
 lepomis Jenkins, 1901
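
To make the TNU-to-Protonym link concrete, here is a minimal Python sketch. It is an illustration only, not the GNUB data model: the class names `Protonym` and `TaxonNameUsage` and their fields are assumptions made for this example. The names themselves are the four Coris flavovittata usages listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protonym:
    """The original description of a name (hypothetical structure)."""
    epithet: str
    authorship: str


@dataclass(frozen=True)
class TaxonNameUsage:
    """One usage of a name within a publication (hypothetical structure)."""
    name_as_published: str
    protonym: Protonym


# The three Protonyms from the Coris flavovittata example.
flavovittata = Protonym("flavovittata", "Bennett, 1828")
eydouxii = Protonym("eydouxii", "Valenciennes in Cuvier & Valenciennes, 1839")
lepomis = Protonym("lepomis", "Jenkins, 1901")

# The four names as they appeared in the literature, each tied to its Protonym.
usages = [
    TaxonNameUsage("Coris flavovittata (Bennett, 1828)", flavovittata),
    TaxonNameUsage("Coris lepomis Jenkins, 1901", lepomis),
    TaxonNameUsage(
        "Julis eydouxii Valenciennes in Cuvier & Valenciennes, 1839", eydouxii
    ),
    TaxonNameUsage("Julis flavovittata Bennett, 1828", flavovittata),
]

# Frozen dataclasses are hashable, so a set collapses repeated Protonyms.
distinct_protonyms = {usage.protonym for usage in usages}
print(len(usages), "usages,", len(distinct_protonyms), "protonyms")
```

Two of the four usages (the Coris and Julis combinations of flavovittata) share a Protonym, which is why four names reduce to three.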
 The taxonomic translation
 service is built around the "Meta-Authority"
 (Authority of Authorities) concept.  A Meta-Authority is
 any organization or individual who wants to assert an
 "accepted" taxonomy.  For example, ITIS, CoL,
 WoRMS, etc. are all Meta-Authorities, because they assert an
 "accepted" usage for each taxon name.  For our
 checklist paper, we have established our own Meta-Authority
 (technically now recorded as the "Rob Whitton
 Meta-Authority", but functionally it is the Bishop Museum
 Meta-Authority). Each Meta-Authority has a specific scope of
 interest -- which might be very large (ITIS, CoL, WoRMS,
 etc.), or might be very small (e.g., a single family or
 geographic region).
 In any
 case, what a Meta-Authority does is, for each name within
 the scope of interest, it makes a statement along the lines
 "For Protonym A, I/We follow the Treatment of Reference
 In this case, the Rob Whitton/Bishop Museum Meta-Authority has made
 these statements:
 - For the protonym "flavovittata Bennett, 1828", we follow the
 treatment of Randall 2007 [who treats it as a valid species
 within the genus Coris].
 - For the protonym "eydouxii Valenciennes in Cuvier & Valenciennes,
 1839", we follow the treatment of Eschmeyer 2004 [who treats it
 as a junior synonym of flavovittata Bennett, 1828].
 - For the protonym "lepomis Jenkins, 1901", we follow the
 treatment of Eschmeyer 2004 [who treats it as a junior synonym
 of flavovittata Bennett, 1828].
 This is how
 we are able to collapse those messy 675 names spanning 114
 years of taxonomic history into the 506 names that we (the
 experts of the fishes of the Northwestern Hawaiian Islands)
 regard as "accepted" in a few seconds.
 If anyone wants more details
 on how it works, I'd be happy to explain further.
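
For readers who think in code, the translation step can be sketched roughly as follows. This is an assumption-laden illustration in Python, not the actual service: the dictionary layout, the `status`/`genus`/`senior` keys, and the `translate` function are all hypothetical. Only the names and the three Meta-Authority statements come from the example above.

```python
# Each historical name string (a TNU) is linked to its Protonym.
TNU_TO_PROTONYM = {
    "Coris flavovittata (Bennett, 1828)": "flavovittata Bennett, 1828",
    "Coris lepomis Jenkins, 1901": "lepomis Jenkins, 1901",
    "Julis eydouxii Valenciennes in Cuvier & Valenciennes, 1839":
        "eydouxii Valenciennes in Cuvier & Valenciennes, 1839",
    "Julis flavovittata Bennett, 1828": "flavovittata Bennett, 1828",
}

# Hypothetical encoding of the Meta-Authority statements: for each Protonym,
# the reference followed and what that reference says about the name.
META_AUTHORITY = {
    "flavovittata Bennett, 1828": {
        "reference": "Randall 2007", "status": "valid", "genus": "Coris",
    },
    "eydouxii Valenciennes in Cuvier & Valenciennes, 1839": {
        "reference": "Eschmeyer 2004", "status": "synonym",
        "senior": "flavovittata Bennett, 1828",
    },
    "lepomis Jenkins, 1901": {
        "reference": "Eschmeyer 2004", "status": "synonym",
        "senior": "flavovittata Bennett, 1828",
    },
}


def translate(name, tnu_to_protonym, meta_authority):
    """Return the accepted name for a historical name, or None if unknown."""
    protonym = tnu_to_protonym.get(name)
    if protonym is None:
        return None
    seen = set()
    # Follow synonym links until a Protonym treated as valid is reached.
    while protonym not in seen:
        seen.add(protonym)
        treatment = meta_authority.get(protonym)
        if treatment is None:
            return None
        if treatment["status"] == "valid":
            epithet = protonym.split()[0]  # first word of the key is the epithet
            return f"{treatment['genus']} {epithet}"
        protonym = treatment["senior"]
    return None  # circular synonymy: no accepted name can be derived


accepted = sorted(
    {translate(n, TNU_TO_PROTONYM, META_AUTHORITY) for n in TNU_TO_PROTONYM}
)
print(len(TNU_TO_PROTONYM), "historical names ->", accepted)
```

Choosing a different Meta-Authority would amount to passing a different statements table to `translate`, leaving the TNU-to-Protonym links untouched.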
 The main limitations of this
 service are:
 1) It's limited to the
 names within the Global Names Usage Bank (GNUB; currently
 543,989 TNUs linked to 195,369 Protonyms); and
 2) There is currently only one Meta-Authority
 We already have
 funding from NSF to address limitation #1, by developing a
 workflow to capture millions of protonyms and tens of
 millions of TNUs through integrating GNUB, GNI, BHL, and
 multiple other taxonomic data sources.  We also plan to
 expand the Meta-Authority list to include the
 "big" ones (e.g., ITIS/CoL, WoRMS, NCBI), and
 develop tools to make it easy for any individual or
 organization to create their own personal Meta-Authority. 
 And, we just submitted a proposal to NSF to (among other
 things) develop this real-time taxonomic translation service
 into a set of tools that can be very easily applied to any
 list of taxon names.
 If we
 are successful, users of GBIF data will have the option of
 selecting any Meta-Authority they want (one of the big ones,
 or their own), and then be able to translate (in real time)
 all the taxon names as they appear in the GBIF dataset into
 the "accepted" modern/clean equivalent names
 according to the selected Meta-Authority.  And the
 Meta-Authorities aren't just for species-level names --
 they also provide full "accepted" classifications
 all the way up to Kingdom.
 Obviously this won't solve all the problems
 with aggregated data, but it will help solve a lot of them.
 OK, enough for now....
 Richard L. Pyle, PhD
 Database Coordinator for Natural
 Associate Zoologist in
 Dive Safety Officer
 Department of Natural Sciences, Bishop
 1525 Bernice St., Honolulu, HI
 Ph: (808)848-4115, Fax:
 email: deepreef at bishopmuseum.org
 Note: This disclaimer formally
 apologizes for the disclaimer below, over which I have no
 Taxacom Mailing List
 Taxacom at mailman.nhm.ku.edu
 The Taxacom Archive back to 1992 may be
 searched at: http://taxacom.markmail.org
 Celebrating 27 years of
 Taxacom in 2014.
