[Taxacom] Does the species name have to change when it moves genus?

Roderic Page r.page at bio.gla.ac.uk
Wed Jun 20 05:29:09 CDT 2012


Dear Brian,

The example is nice, and a great demonstration of what we could aspire to. 

A few minor quibbles. It's a pity the example isn't Open Access, that it can't be downloaded in XML for data mining, and that the Names4Life system is patented (http://www.google.com/patents/US7925444 ) (but then again, uBio has a patent as well http://www.google.com/patents/US7650327 ). George Garrity was at the first TDWG GUID meeting (2006?) when Names4Life was already up and running. He found the proceedings rather amusing - he'd already settled on DOIs as identifiers.

It's worth thinking about why this system is possible. The situation for bacteria is rather different to, say, animals, for several reasons.

1. There are not many named bacteria compared to other taxa (I realise that vastly more bacterial lineages exist than meet the criteria for being named). The Catalogue of Life for 2008 had about 9,500 bacterial species listed. In 2010 there were a total of 859 nomenclatural changes (including new names as well as synonyms, emendations, etc.; http://www.bacterio.cict.fr/twothousand/twothousandten.html ). That's compared to 20,068 new zoological names (never mind new combinations, etc.).

2. All bacterial names and changes are published (or announced) in one journal, and articles in that journal have DOIs. In zoology we have some attempts to capture new names (ION and ZooBank), a diverse, often obscure literature, and a major journal (Zootaxa) that has yet to adopt DOIs.

3. There is commercial value in knowing about bacterial names, e.g. http://intellogist.wordpress.com/2011/06/22/namesforlife-adds-value-to-your-searches/ .  The commercial value in zoological names is perhaps less clear, although providing services to publishers would be an obvious application.

There's no reason, in principle, why we couldn't have the same system for animal names, although the scale of the task is rather larger than for bacteria.
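
To make the idea a little more concrete, here is a minimal Python sketch of what "names in an article link back to a central record" could look like for animal names. It is not how Names4Life works (its internals aren't described in this thread); the class names, the `example:name/1` identifier, and the placeholder binomials "Aus bus" and "Cus bus" are all made up for illustration. It also ignores homonyms, where one string refers to more than one name.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class NameRecord:
    """One name 'object' that sits behind any number of text-string labels."""
    guid: str                                   # hypothetical identifier (could be a DOI or LSID)
    epithet: str
    combinations: set[str] = field(default_factory=set)


class NameRegistry:
    """Toy central database: maps name strings to a shared record via its GUID."""

    def __init__(self) -> None:
        self._by_guid: dict[str, NameRecord] = {}
        self._by_string: dict[str, str] = {}    # normalised name string -> guid

    @staticmethod
    def _normalise(name: str) -> str:
        # Collapse whitespace and ignore case so trivially different strings match.
        return " ".join(name.split()).lower()

    def register(self, guid: str, epithet: str, combination: str) -> NameRecord:
        # Reuse the existing record for this GUID, or create it on first sight.
        record = self._by_guid.setdefault(guid, NameRecord(guid, epithet))
        record.combinations.add(combination)
        self._by_string[self._normalise(combination)] = guid
        return record

    def resolve(self, name: str) -> NameRecord | None:
        # Return the record behind a name string, or None if it is unknown.
        guid = self._by_string.get(self._normalise(name))
        return self._by_guid.get(guid) if guid is not None else None


registry = NameRegistry()
registry.register("example:name/1", "bus", "Aus bus")
registry.register("example:name/1", "bus", "Cus bus")   # same epithet, moved to another genus

record = registry.resolve("Cus bus")
print(record.guid, sorted(record.combinations))
```

Both strings resolve to the same record, so a link placed behind either label in an article would lead to the same place. The hard part at zoological scale is populating the table, not querying it.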

Food for thought...

Regards

Rod


On 20 Jun 2012, at 10:25, Dr.B.J.Tindall wrote:

> Rod,
> While I appreciate that much of what has been discussed is slanted towards those organisms traditionally treated as zoological taxa and botanical taxa, the principles are the same for prokaryotes and viruses.
> 
> Linking information from taxonomic publications back to a central database (that has information on synonyms etc.) that in turn links to other databases is already implemented for the test case of one journal for prokaryote names and their associated taxonomy. Try this link and click on the taxonomic names that are highlighted in green:
> 
> http://ijs.sgmjournals.org/content/62/Pt_6/1342.abstract
> 
> The system also uses GUIDs.
> 
> The various issues of whether a genus name should change when a species is reclassified in another genus, and of the use of LSIDs/DOIs, sometimes look different once a system is operating and one starts to ask questions that the system isn't set up to answer. Only then does one get a clear idea of what questions one should have asked in the first place.
> 
> Brian
> 
> Quoting Roderic Page <r.page at bio.gla.ac.uk>:
> 
>> Dear Rich,
>> 
>> Thanks for the insightful response. I agree that we are in a transitional period (from analogue to digital), and things will be messy for a while (as we can see by looking at what's happened as the music, newspaper, and movie industries have struggled with this transition). Of course, I want things to go much faster, in part because I'm not convinced that the rest of biology will hang around patiently waiting while taxonomy makes this transition.
>> 
>> As William Gibson wrote, "the future is already here -- it's just not very evenly distributed" (http://en.wikiquote.org/wiki/William_Gibson ). ZooKeys is one vision of the future, and it's a great journal, but it accounts for just a tiny fraction of the current taxonomic output. In 2010 ION recorded around 18,000 new names, of which a little over 400 were published in ZooKeys. The widely distributed nature of taxonomic publishing means that even the dominant journal (Zootaxa) publishes only 15-20% of all new names. That means there are a lot of publishers to convince to move to new ways of publishing (e.g., XML markup). If we're publishing ~20,000 animal names a year, and the rate of new species description has been relatively constant over the last few years, we still have 50 years to go to double the number of animal names we already have.
>> 
>> You ask
>> 
>>> Why not focus on the brave new world of electronic publication, where the
>>> included information is not limited to what is seen by the human eyeballs,
>>> and where all this stuff can be harvested seamlessly via unambiguous GUIDs
>>> (rather than via crude OCR and fuzzy text matching of messy text-string
>>> names)?  To me, that's where the answer lies going forward.
>> 
>> Partly because the new stuff is relatively easy, in the sense that marked up XML is straightforward to generate and work with (where's the fun in that?), and partly because once the new stuff is marked up, we still have the question "what does it link to?" If I click on a name, I want to see everything about that name, not just the most recent literature. If an article cites a paper that is only available in a digital library like BHL, I want to have a link to that. If the user clicks on that link, I want them to be able to view the BHL content in much the same way as the more recent literature (i.e., names and literature cited also clickable). Our legacy literature will vastly outweigh the newly generated, nicely marked up literature for a while yet.
>> 
>> 
>>>> I'm not proposing a change in how we cite names,
>>> 
>>> Yes, you are.  Since Linnaeus, we have cited species names in the form of a
>>> species epithet preceded by a genus name, and the genus name is selected to
>>> reflect the classification of the epithet.  You're proposing that we strip
>>> the classification information from the binomial, and treat it as though
>>> it's a single text string that happens to include a space within it.  That's
>>> a change in how we cite names.  Maybe we're operating on different
>>> definitions of the word "cite"?
>> 
>> 
>> I suspect we are. To me, if I write "Homo sapiens" that's the name. I gather you are saying that actually the name is "sapiens", which I'm citing in the context of the genus "Homo" to indicate its classification. That may well be, but I'd argue it's not how most biologists treat names in practice: they use them as simple text strings that refer to a taxon.
>> 
>>> 
>>> So in that context
>>> (if you really believe that's where it's heading), why try to change one of
>>> the core principles of binomial nomenclature right at the twilight of its
>>> tenure in science?
>> 
>> That's a very good question.
>> 
>> Regards
>> 
>> Rod
>> 
>> On 19 Jun 2012, at 21:02, Richard Pyle wrote:
>> 
>>>> I'm not actually suggesting we change the code, merely
>>>> the convention to automatically  change the genus name
>>>> if a species moves.
>>> 
>>> I never said anything about changing the Code.  I was talking about 250
>>> years of conventional practice.
>>> 
>>>> Trivial in the sense that if we had all the information on names
>>>> (synonyms included) linked to types we could resolve synonyms
>>>> computationally, but we don't, certainly not at the scale of the
>>>> 10^5 - 10^6 names that databases such as NCBI taxonomy and
>>>> GBIF deal with. Yes, in principle, it is tractable, but why do
>>>> we contribute to creating the situation in the first place?
>>> 
>>> The "situation" is only as such because of the advent of computer systems.
>>> Prior to computers, people had no issue with it.  Now that we have
>>> technology that lets us mine vast volumes of historical literature, the
>>> situation appears problematic because we are now doing at a much larger
>>> scale something that hasn't really been done before, other than by Sherborn,
>>> Neave, Linnaeus, and a few others.
>>> 
>>> I've always maintained that this period in history -- starting in the mid-
>>> to late 1990s and ending probably around 2020-2025 -- is a somewhat awkward
>>> one because it represents the transition between a centuries-long era when
>>> paper was the primary mode of information dissemination and documentation
>>> among humans, and the era when information is mostly documented and
>>> disseminated via electronic signals.  It's awkward in general because not
>>> all systems are moving in parallel and at the same pace, so various systems
>>> get out of phase (e.g., the desire by modern taxonomists to publish new
>>> scientific names in electronic journals, vs. -- until recently -- the
>>> requirements of the various Codes).  Specific to this topic, the awkwardness
>>> relates to the fact that we have computer systems that index large numbers
>>> of taxonomic name-strings (messy things, that don't always make it clear
>>> which binomials are homotypic), but don't yet have computer systems that
>>> index large numbers of taxonomic name "objects" (where homotypic binomials
>>> are easily identified -- plus much, much more).  By about 2050 or so, this
>>> period of awkwardness will warrant only a footnote.
>>> 
>>>> Personally I suspect the taxon concept issue isn't going to be worth
>>>> the effort expended; unless tackled with some clever tools for inferring
>>>> context from citation, etc., it's simply unscalable.
>>> 
>>> I'm not so sure about that.  Consider that we're rapidly shifting from
>>> paper-based publication to electronic publication.  To make a major change
>>> in practice in how we render text strings to human eyeballs (i.e., your
>>> proposal to stop changing binomial combinations so we can maintain more
>>> stable text-string labels) seems like using a Ferrari to drive down the
>>> block to the local grocery store a little faster than the Toyota used to get
>>> us there.  I imagine a world (not hard to imagine, really -- given journals
>>> like ZooKeys) where the text string name is something just to render on the
>>> computer monitor for the benefit of human eyeballs.  Behind that text-string
>>> name would be the necessary electronic link (or links) to the taxonomic
>>> object, complete with all the metadata any taxonomist would ever want (type
>>> specimens, concept definitions, literature citations, historical trends in
>>> usages, etc., etc., etc.).
>>> 
>>> You're talking about changing how we do things going forward, right?  Why
>>> not focus on the brave new world of electronic publication, where the
>>> included information is not limited to what is seen by the human eyeballs,
>>> and where all this stuff can be harvested seamlessly via unambiguous GUIDs
>>> (rather than via crude OCR and fuzzy text matching of messy text-string
>>> names)?  To me, that's where the answer lies going forward.  Not putting a
>>> band-aid on a centuries-old practice that is still trying to use text-string
>>> names as unique identifiers.
>>> 
>>>> Why not focus on what is tractable and will add immediate value?
>>> 
>>> It would appear that you and I have different ideas about what is tractable
>>> (or maybe we define the word differently?)
>>> 
>>>> I'm not proposing a change in how we cite names,
>>> 
>>> Yes, you are.  Since Linnaeus, we have cited species names in the form of a
>>> species epithet preceded by a genus name, and the genus name is selected to
>>> reflect the classification of the epithet.  You're proposing that we strip
>>> the classification information from the binomial, and treat it as though
>>> it's a single text string that happens to include a space within it.  That's
>>> a change in how we cite names.  Maybe we're operating on different
>>> definitions of the word "cite"?
>>> 
>>>> and suggestions that embed more semantics in names (such as author, date,
>>>> first name) are just asking for trouble http://bit.ly/KQ6o46
>>> 
>>> Perhaps.  But why not instead leverage the move towards electronically
>>> published works by embedding GUIDs (which are themselves hidden from human
>>> eyeballs), rather than continue to overload the feeble text string with more
>>> information than it's capable of representing?
>>> 
>>>> Citing a reference for "what I mean by" is useful, but I'd be
>>>> happier if that was linked to actual data.
>>> 
>>> We're in agreement on this one -- the key word being "linked".
>>> 
>>>> Ironically, if you read the tea leaves the way I do, we are
>>>> moving to a biodiversity science without names, where
>>>> specimens will be the unit of choice, and taxa will be
>>>> computational inferences, not vague assertions supported
>>>> by a citation at best. But that's another story...
>>> 
>>> ...and I wouldn't necessarily disagree with that story.  So in that context
>>> (if you really believe that's where it's heading), why try to change one of
>>> the core principles of binomial nomenclature right at the twilight of its
>>> tenure in science?
>>> 
>>> Aloha,
>>> Rich
>>> 
>>> 
>>> Richard L. Pyle, PhD
>>> Database Coordinator for Natural Sciences
>>> Associate Zoologist in Ichthyology
>>> Dive Safety Officer
>>> Department of Natural Sciences, Bishop Museum
>>> 1525 Bernice St., Honolulu, HI 96817
>>> Ph: (808)848-4115, Fax: (808)847-8252
>>> email: deepreef at bishopmuseum.org
>>> http://hbs.bishopmuseum.org/staff/pylerichard.html
>>> 
>>> Note: This disclaimer formally apologizes for the disclaimer below, over
>>> which I have no control.
>>> 
>>> 
>>> 
>>> 
>>> This message is only intended for the addressee named above.  Its contents may be privileged or otherwise protected.  Any unauthorized use, disclosure or copying of this message or its contents is prohibited.  If you have received this message by mistake, please notify us immediately by reply mail or by collect telephone call.  Any personal opinions expressed in this message do not necessarily represent the views of the Bishop Museum.
>>> 
>> 
>> ---------------------------------------------------------
>> Roderic Page
>> Professor of Taxonomy
>> Institute of Biodiversity, Animal Health and Comparative Medicine
>> College of Medical, Veterinary and Life Sciences
>> Graham Kerr Building
>> University of Glasgow
>> Glasgow G12 8QQ, UK
>> 
>> Email: r.page at bio.gla.ac.uk
>> Tel: +44 141 330 4778
>> Fax: +44 141 330 2792
>> Skype: rdmpage
>> AIM: rodpage1962 at aim.com
>> Facebook: http://www.facebook.com/profile.php?id=1112517192
>> Twitter: http://twitter.com/rdmpage
>> Blog: http://iphylo.blogspot.com
>> Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
>> 
> 
> 
> 
> Dr.B.J.Tindall
> Leibniz-Institut DSMZ-Deutsche Sammlung von
> Mikroorganismen und Zellkulturen GmbH
> Inhoffenstraße 7B
> 38124 Braunschweig
> Germany
> Tel. ++49 531-2616-224
> Fax  ++49 531-2616-418
> http://www.dsmz.de
> Director: Prof. Dr. J. Overmann
> Local court: Braunschweig HRB 2570
> Chairman of the management board: MR Dr. Axel Kollatschny
> 
> DSMZ - A member of the Leibniz Association (WGL)
> 
> 




