pentcheff at gmail.com
Mon Sep 21 12:42:36 CDT 2009
[By permission, forwarding Paul Kirk's reply to the list (see below).]
No one mentioned databases and primary keys until I did -- I didn't
mean to imply that you had (or that databases and primary keys had
anything to do with your discussion until I brought it in).
I've had the experience, though, of working with a range of biologists
(confessing: myself included) who thought it would be most clever to
use already-available "unique" information about data to create the
primary key for it. After all, then the key is doubly-valuable, isn't
it? It works as a key and I can read it and get information: whee!
In just about every case, we've regretted that decision, realizing
retroactively that what we really wanted was a unique primary key. No
bonus information, please. No embedded taxon names (that change on a
re-identification). No embedded literature reprint number (that
changes when we realize we used the wrong paper). No embedded year of
collection (that changes when we realize that it was the date of
shipping, not the date of collection). Just a simple number that can
stay a simple number, thankyouverymuch.
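
A minimal sketch of the point above, using Python's built-in sqlite3 module. The table names, column names, the "Example Museum" borrower, the second species name, and the smart_key() helper are all hypothetical, made up for illustration; this is not the schema of any real collection database. The idea is only that an arbitrary integer key survives a re-identification and a corrected date, while a key built from that same information goes stale.

```python
import sqlite3


def smart_key(taxon_name, year, serial):
    """Hypothetical 'clever' key that embeds the taxon name and year."""
    genus, species = taxon_name.split()
    return f"{genus[:3].upper()}{species[:3].upper()}-{year}-{serial:04d}"


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# specimen_id is an arbitrary integer: it identifies the row and nothing else.
conn.execute(
    """CREATE TABLE specimen (
           specimen_id     INTEGER PRIMARY KEY,
           taxon_name      TEXT NOT NULL,
           collection_year INTEGER
       )"""
)
# Other tables point at the specimen only through that arbitrary key.
conn.execute(
    """CREATE TABLE loan (
           loan_id     INTEGER PRIMARY KEY,
           specimen_id INTEGER NOT NULL REFERENCES specimen(specimen_id),
           borrower    TEXT NOT NULL
       )"""
)

cur = conn.execute(
    "INSERT INTO specimen (taxon_name, collection_year) VALUES (?, ?)",
    ("Pseudoplantus fakeus", 1998),
)
specimen_id = cur.lastrowid
conn.execute(
    "INSERT INTO loan (specimen_id, borrower) VALUES (?, ?)",
    (specimen_id, "Example Museum"),
)

# What the readable key would have been at accession time.
old_key = smart_key("Pseudoplantus fakeus", 1998, specimen_id)

# Later: the specimen is re-identified, and 1998 turns out to be the
# shipping year rather than the collection year.
conn.execute(
    "UPDATE specimen SET taxon_name = ?, collection_year = ? "
    "WHERE specimen_id = ?",
    ("Pseudoplantus alterus", 1997, specimen_id),
)

# The readable key no longer matches the data it was derived from.
new_key = smart_key("Pseudoplantus alterus", 1997, specimen_id)

# The loan record still resolves, because it never depended on the name
# or the year.
row = conn.execute(
    "SELECT s.specimen_id, s.taxon_name, s.collection_year, l.borrower "
    "FROM loan AS l JOIN specimen AS s ON s.specimen_id = l.specimen_id"
).fetchone()

print(row)
print(old_key, "->", new_key)
```

With the readable key, every record that had stored the old string would need to be found and rewritten (or left quietly wrong). With the plain number, nothing else has to change.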
I also didn't mean to imply that it isn't possible to use a
human-readable key well: IPNI does indeed cross-reference and search
names well, immediately providing the correct author key for further
search. I have no problem with that at all!
But that's not a database whose underlying assumptions are being
invented today. It's built on an already-existing database-esque
convention of how authors' names are used in botany. And it does an
excellent job of doing that.
Yes, you're right, I'm a zoologist. More than that, I'm a jealous
zoologist: I wish that zoology had adopted a scheme like the botanists
did! It would facilitate a much, much smoother transition to a more
pentcheff at gmail.com
On Mon, Sep 21, 2009 at 10:28 AM, Paul Kirk <p.kirk at cabi.org> wrote:
> Sorry I cannot reply to the list from home - my employer's mail system
> automatically adds content which the taxacom prohibits and the reply is
> So, who mentioned databases and primary keys? The string of characters is
> for human consumption to disambiguate. If you want to see the primary keys
> go to http://www.ipni.org/ipni/authorsearchpage.do and search for Smith then
> select one. Your reply assumes too much - get the fact right first ... ;-)
> ... I assume you are a zoologist
> You may post this reply to the list.
> From: Dean Pentcheff [mailto:pentcheff at gmail.com]
> Sent: Mon 21/09/2009 18:18
> To: Taxacom at mailman.nhm.ku.edu
> Cc: Paul Kirk; gread at actrix.gen.nz; Richard Pyle
> Subject: Re: [Taxacom] globalnames?
> On a side note, this discussion is a brilliant example of why database
> geeks push hard to use "arbitrary" unique keys, instead of concocting
> keys that include human-useful information.
> What the botanical community has done (as I understand it) is create a
> system of unique keys for taxonomic authors. The keys take the form of
> a standardized (and unique) abbreviation for every author. So one
> particular "Smith" person is identified uniquely as "Sm.". Presumably
> another author sharing the same last name might be identified uniquely
> as "Smi.", and so forth. (I have no idea what you guys do with the
> 53rd "Smith", but fortunately that's not my problem.)
> The data-nerd's approach would have been to use something like
> "184746" for that Smith, and "736659" for the other Smith. Sigh, you
> say. Ugly. Opaque.
> The advantage to the standard-abbreviation system is that, by using a
> human-readable abbreviation as the unique key, the experienced reader
> is able to (in most cases) mentally substitute the proper name.
> The disadvantage is that it's just too close to colloquial usage to be
> safe. Geoffrey Read finds that particular abbreviation silly, so would
> probably use "Smith" instead. That might be the standardized key for a
> different Smith. If I didn't know about the standardized rules in use
> and needed to mention "Pseudoplantus fakeus
> Milne-Edan-Smytheson-James", I might try to save my typing fingers and
> simply make up the abbreviation "Milne-Ed.". But maybe that actually
> refers to Mr. Milne-Edwards.
> Whenever a key includes "useful" information in addition to its core
> purpose as a unique identifier, it becomes prone to well-intentioned
> meddling, hence much less reliable as an actual unique identifier.
> Dean Pentcheff
> pentcheff at gmail.com
> On Sun, Sep 20, 2009 at 11:39 PM, Paul Kirk <p.kirk at cabi.org> wrote:
>> You conveniently didn't answer the question - which was - would you
>> support (promote) the use of 'Smith' for all 120 Smith's rather than an
>> unambiguous abbreviation?
>> -----Original Message-----
>> From: Geoffrey Read [mailto:gread at actrix.gen.nz]
>> Sent: 19 September 2009 23:55
>> To: Paul Kirk
>> Cc: Taxacom at mailman.nhm.ku.edu
>> Subject: RE: [Taxacom] globalnames?
>> The abbreviation 'Sm.' for Smith is astoundingly silly. You don't really
>> do that do you? Oh dear.
>> On not showing authors in a species page - depends on who the end user
>> will be, but if hidden for simplicity for those you think would have
>> little interest in them they should be there in the background to be
>> conjured up instantly for the many who do require them to make sense of
>> the name.
>> On Fri, September 18, 2009 10:44 pm, Paul Kirk wrote:
>>> Author abbreviations: punctuation (and diacriticals) in author
>>> abbreviations is ignored in determining an appropriate form to avoid
>>> ambiguity thus any author named Bull would have initials added to
>>> distinguish them. However, to follow your 'logic' forward ... Bulliard
>>> is indeed unique and easily recognized even for the untutored user but
>>> am I correct in assuming you would support the use of Smith rather
>>> than the unambiguous Sm., A.H.Sm., J.Sm. etc, etc, for all 120 Smiths
>>> who are plant name authors? And Smith is not now the most common
>>> surname/family name!
>>> Incidentally, I am against including author citations anywhere except
>>> in hard core nomenclature and taxonomy - most users, untutored or
>>> otherwise, should not see them - that they continue to be used (often
>>> copied from one incorrect source to another) outside the areas
>>> mentioned is indeed a "relic of ancient times".