r.page at bio.gla.ac.uk
Mon Jun 3 17:05:12 CDT 2013
On 3 Jun 2013, at 21:45, <Tony.Rees at csiro.au> wrote:
> OK... so returning to the "Cepa" thread as identified by Chris Thompson ...
> Looking at http://bionames.org/search/Cepa, the problem arises on account of Cepa being a genus-level homonym (twice in both animals and plants), which can be discovered via various sources as previously noted, however this is not reflected in the GBIF backbone taxonomy (http://ecat-dev.gbif.org/search?q=Cepa&rkey=1) to which Rod has pinned a lot of his classification information. So some questions might arise as follows:
> - why does the GBIF taxonomy mangle this group of names (what is specific to this case)
Because the input data is messy, and the aggregation algorithms are not perfect. Also some problems (like homonyms) only become apparent at a global scale.
> - is this likely to be a rare or common event (is it a more general class of error)
I notice it fairly often, although the GBIF classification improves with time. It would be useful to quantify it.
> - what could assist GBIF to (a) detect this and equivalent problems, and (b) fix them
Gee, how about a database of names linked to the primary literature, ICZN decisions, etc., so GBIF (or anyone else) could investigate further and fix the problem?
> - what could assist Rod or other external user of GBIF data to do the same (i.e. warning signs)
See above ;) Also, we could have some simple rules to flag obvious issues. For example, GBIF can have species with different generic names placed in the same genus, which is clearly inconsistent. There are lots of cases of multiple spellings of the same name (you see this in classifications where you have a run of taxa with similar names). Another warning flag is a node, such as a genus, with no descendants.
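A minimal sketch of the three warning flags described above, in Python. The record layout (dicts with "id", "parent", "rank", "name"), the function name flag_issues, and the 0.9 similarity threshold are all assumptions made for illustration; none of them reflects how GBIF actually stores or checks its backbone.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def flag_issues(records, spelling_threshold=0.9):
    """Return a list of (rule, detail) warnings for a flat classification.

    `records` is a hypothetical list of dicts with keys
    "id", "parent", "rank" and "name" (not a GBIF format).
    """
    # Index every record under its parent so we can walk genus -> children.
    children = defaultdict(list)
    for rec in records:
        if rec["parent"] is not None:
            children[rec["parent"]].append(rec)

    flags = []
    for rec in records:
        if rec["rank"] != "genus":
            continue
        kids = children[rec["id"]]

        # Rule 3: a genus node with no descendants at all.
        if not kids:
            flags.append(("empty_genus", rec["name"]))
            continue

        species = [k for k in kids if k["rank"] == "species"]

        # Rule 1: a species whose generic name differs from its parent genus.
        for sp in species:
            if sp["name"].split()[0] != rec["name"]:
                flags.append(("genus_mismatch", sp["name"]))

        # Rule 2: neighbouring names (in sorted order) that are nearly
        # identical, which may be spelling variants of the same name.
        names = sorted(sp["name"] for sp in species)
        for a, b in zip(names, names[1:]):
            if SequenceMatcher(None, a, b).ratio() >= spelling_threshold:
                flags.append(("possible_spelling_variant", f"{a} / {b}"))
    return flags


if __name__ == "__main__":
    # Invented placeholder names, not real taxa.
    sample = [
        {"id": 1, "parent": None, "rank": "genus", "name": "Aus"},
        {"id": 2, "parent": 1, "rank": "species", "name": "Aus bus"},
        {"id": 3, "parent": 1, "rank": "species", "name": "Aus buss"},
        {"id": 4, "parent": 1, "rank": "species", "name": "Xus cus"},
        {"id": 5, "parent": None, "rank": "genus", "name": "Cus"},
    ]
    for flag in flag_issues(sample):
        print(flag)
```

Comparing only neighbours in sorted order is a deliberate simplification: it catches the "run of taxa with similar names" pattern cheaply, but it would miss variants that differ in their first letters.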
The problem with fixing these errors is that GBIF assembles the classification anew from multiple "checklists" and without manual intervention, so a manual fix would be lost at the next rebuild. I've discussed this a little with Markus Döring, who suggested having a checklist that was highly weighted and included corrections that would then be applied to the new aggregation. Elsewhere I (and others) have argued that if we placed the GBIF classification under version control we could edit, merge, and fork it like a software project.
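The corrections-checklist idea can be sketched very roughly as a patch that is re-applied after every automatic rebuild. This is only an illustration under stated assumptions: the name-to-parent dict layout and the function apply_corrections are hypothetical, and the actual GBIF aggregation and weighting are not described in this thread.

```python
def apply_corrections(aggregated, corrections):
    """Re-apply curated fixes on top of a freshly aggregated classification.

    Both arguments are hypothetical dicts mapping a taxon name to its
    parent name. A correction whose parent is None means "drop this name".
    """
    patched = dict(aggregated)  # leave the automatic build untouched
    for name, parent in corrections.items():
        if parent is None:
            patched.pop(name, None)  # e.g. a misspelling to be removed
        else:
            patched[name] = parent  # curated placement wins over the build
    return patched


if __name__ == "__main__":
    # Invented placeholder names, not real taxa.
    build = {"Aus bus": "Aus", "Aus buss": "Aus", "Xus cus": "Aus"}
    fixes = {"Aus buss": None, "Xus cus": "Xus"}
    print(apply_corrections(build, fixes))
```

Because the fixes live in their own file rather than in the build output, they survive each re-aggregation, and that file is also a natural thing to keep under version control.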
> - does BioNames choose wisely in using the GBIF taxonomy, bearing in mind the latter's disclaimers i.e.:
> "Nub Disclaimer: The GBIF Backbone Taxonomy (Nub) is an automatically synthesised management classification with limited manual curating. Information presented here does not represent a consistent taxon but may conflict with other nub "usages" in many cases to a trained taxonomists eye. The information presented on this page was aggregated from the data found in the sources below."
Why GBIF? It's the single biggest classification available that I'm aware of, and it's connected to data that is central to a lot of biodiversity research (see http://www.mendeley.com/groups/1068301/gbif-public-library/ ). People use GBIF to do science, so it would be nice if its classification were clean and internally consistent. Exposing it to scrutiny and linking it to the primary literature is one way to achieve this.
> I do have some thoughts about these but no more time just now (shades of Fermat's last theorem) however if others would like to chip in, please do.
> Regards - Tony
>> -----Original Message-----
>> From: taxacom-bounces at mailman.nhm.ku.edu [mailto:taxacom-
>> bounces at mailman.nhm.ku.edu] On Behalf Of Neal Evenhuis
>> Sent: Sunday, 2 June 2013 8:34 PM
>> To: Stephen Thorpe
>> Cc: TAXACOM
>> Subject: Re: [Taxacom] BioNames
>> Stephen --
>> Cepa Humphries 1797 is a nomen nudum -- according to Sherborn -- who
>> DOES list it in his Index Animalium (damn, he was good ... ). Its being
>> an unavailable name is probably why it is not in Neave or WorMS ....
>> Taxacom Mailing List
>> Taxacom at mailman.nhm.ku.edu
>> The Taxacom Archive back to 1992 may be searched with either of these methods:
>> (1) by visiting http://taxacom.markmail.org
>> (2) a Google search specified as:
>> site:mailman.nhm.ku.edu/pipermail/taxacom your search terms here
>> Celebrating 26 years of Taxacom in 2013.
Professor of Taxonomy
Institute of Biodiversity, Animal Health and Comparative Medicine
College of Medical, Veterinary and Life Sciences
Graham Kerr Building
University of Glasgow
Glasgow G12 8QQ, UK
Email: r.page at bio.gla.ac.uk
Tel: +44 141 330 4778
Fax: +44 141 330 2792
Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
ORCID id: http://orcid.org/0000-0002-7101-9767