[Taxacom] ZooBank Progress

David Campbell pleuronaia at gmail.com
Mon Apr 29 09:49:06 CDT 2013


Although it's generally rare, there are a number of extant mollusk species
originally described from fossils, e.g. Crassinella dupliniana and Chama
congregata (if the living form really is the same species).


On Sun, Apr 28, 2013 at 3:08 PM, Richard Pyle <deepreef at bishopmuseum.org> wrote:

>
> > If it's done manually it might be worth correcting some other names that
> > were linked by ZooBank to the Linnean 1758 work.
>
> Yes, exactly!  Part of the process of cross-linking is to compare
> discrepancies.  We've found that in most datasets that we cross-link
> against, there is a relatively small fraction of discrepancies.  For
> example, out of 50,000 names, there might be only a few hundred
> discrepancies -- usually involving the date of publication, correct
> authorship, or the exact orthography of the name.  This means that it's a
> very manageable task to investigate each one of the discrepancies. Also,
> I've found that no database is perfect.  Some are better than others, to be
> sure -- but one can never assume that one database is always correct --
> which means that it's important to examine the discrepancies individually.
>  Indeed, this is one of the main reasons why we wanted to establish this
> link with BHL -- to make it easier to resolve discrepancies.
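
For anyone who wants to try this kind of comparison on their own data, here
is a minimal sketch of the discrepancy check described above. It is not
ZooBank's code. The record layout, the shared keys ("rec-1", "rec-2") and the
function name find_discrepancies are all assumptions made for illustration,
and the names and values are invented placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NameRecord:
    """One name as recorded in a single dataset (hypothetical layout)."""
    name: str        # name string as spelled in that dataset
    authorship: str
    year: int


def find_discrepancies(ours, theirs):
    """Compare records that share a key and report the fields that differ.

    `ours` and `theirs` map a shared identifier to a NameRecord.
    Returns {key: [(field, our_value, their_value), ...]} for mismatches only.
    """
    report = {}
    for key in ours.keys() & theirs.keys():
        a, b = ours[key], theirs[key]
        diffs = [
            (field, getattr(a, field), getattr(b, field))
            for field in ("name", "authorship", "year")
            if getattr(a, field) != getattr(b, field)
        ]
        if diffs:
            report[key] = diffs
    return report


# Invented placeholder data, not taken from any real database.
ours = {
    "rec-1": NameRecord("Aus bus", "Smith", 1900),
    "rec-2": NameRecord("Aus cus", "Smith", 1900),
}
theirs = {
    "rec-1": NameRecord("Aus bus", "Smith", 1900),
    "rec-2": NameRecord("Aus cus", "Smyth", 1901),
}
report = find_discrepancies(ours, theirs)
```

Only "rec-2" ends up in the report, with one entry for authorship and one for
year. Each such entry still has to be checked by hand against the original
publication, since neither dataset can be assumed to be the correct one.
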
>
> > My understanding is that ZooBank is a data resource where available names
> > are contained. Unavailable names should probably not be contained at all,
> > and if yes, they should clearly be marked as such.
> > I am not sure how names should be treated which were initially made
> > available and later suppressed.
>
> This is an issue that has been debated since ZooBank was first conceived.
>  In 2008, at a Commissioners meeting in Paris, it was determined that
> ZooBank *would* include unavailable names, and that those names would be
> clearly marked not only as unavailable, but also give the reason(s) why the
> name is unavailable.  We already have a very robust data model to deal with
> this (which I'd be happy to describe, if anyone is interested).  But as
> with most aspects of ZooBank development, the tricky part is how to
> implement it (devil is always in the details).  One of the things in the
> works is a policy on data verification in ZooBank.  Right now, the focus is
> on building the core infrastructure of ZooBank, populating it with
> retrospective content, and building tools to streamline the capture of
> prospective content.  However, what people *really* want from ZooBank is a
> definitive declaration of whether or not any particular name is available
> under the Code.  This is the entire process of content verification.  So
> far, we have focused only on registration (these are two very different things).
>
> > Example:
> >
> > http://zoobank.org/NomenclaturalActs/1E691819-76A8-492D-8AE1-DA84F9103CF8
> > Acarus telarius Linnæus, 1758 - this name should somehow be marked as
> > suppressed (ICZN Op. 968).
> > There were many other such names established in the 1758 work, which
> > were totally or partly suppressed by the Commission.
>
> Yes, indeed!  In fact, one of the projects we've been working on (with
> LARGE thanks to Charles Hussey, and also to Rod Page who defined the
> article boundaries of historical BZN volumes in BHL) is a complete database
> of Opinions.  This is effectively complete (still needs some verification,
> though), and will be one of the new features added to ZooBank this summer.
>  But again, we need to sort out exactly how this sort of thing will be
> implemented on the ZooBank website, and what the policy is for editing
> these sorts of things, etc.
>
> Many thanks for pointing out the individual issues related to Linnaeus
> names.  This sort of thing is EXTREMELY helpful!  I will definitely use
> these as test cases when we implement the next set of features involving
> ZooBank record verification/validation.  But again, it probably can't be
> implemented until later this summer (northern hemisphere summer, that is).
>
> > Maybe some other systematic things could be fixed.
> >
> > - Remove the long s throughout the original spellings, and replace it at
> > all instances by the short s.
>
> In this case, we want to maintain the precise orthography as it originally
> appeared on the printed page -- in all respects.  Basically, if a UTF-8
> character exists for a particular glyph, we want to capture it as such.
>  The main exceptions are that all-caps words are not faithfully captured as
> such, and other stylistic attributes (e.g., boldface, small-caps, when
> original names were not italicized, etc.) will not be captured.  But
> characters such as the long s and diphthong "æ" will be captured as
> originally printed on the page.
>
> The next step is to build the correct algorithm to transform these things,
> so that the Code-corrected "original spelling" can be generated
> automatically.  In most cases, this is easy to do -- but there are some
> tricky ones (e.g., see Art. 32.5.2.1. -- which would require us to know
> whether the root word is German or not; or some of Art. 32.5.2.4.).  This
> is one more example of features currently in the works, that will be
> introduced over time as they rise up the priority list, and as appropriate
> policies are drafted and ratified.
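
As a rough illustration of the easy part of that transformation, here is a
naive sketch. It is not ZooBank's algorithm. The function names simplify and
corrected_binomen are my own, and the input is assumed to be a bare two-word
"Genus epithet" string. It replaces the long s and the æ/œ ligatures, drops
diacritics, and lowercases the epithet. It deliberately does not attempt the
tricky cases mentioned above (e.g. a German umlaut under Art. 32.5.2.1 is
simply stripped here, which may not be the Code-correct result).

```python
import unicodedata

# Characters replaced before accent stripping. The long s and the two
# ligatures are handled explicitly so the result does not depend on
# Unicode decomposition tables.
REPLACEMENTS = {"ſ": "s", "æ": "ae", "Æ": "Ae", "œ": "oe", "Œ": "Oe"}


def simplify(word):
    """Replace long s and ligatures, then drop combining diacritics."""
    for old, new in REPLACEMENTS.items():
        word = word.replace(old, new)
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def corrected_binomen(verbatim):
    """Naive corrected spelling for a two-word 'Genus epithet' string."""
    genus, epithet = verbatim.split()
    genus = simplify(genus).capitalize()   # also undoes all-caps genus names
    epithet = simplify(epithet).lower()    # species names are not capitalised
    return f"{genus} {epithet}"


print(corrected_binomen("Ostrea Puſio"))
```

Keeping the verbatim string untouched and deriving the corrected form from it
means the transformation can be rerun whenever the rules are refined.
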
>
>
> > Example Musca Linnæus, 1758, this name was
> > spelled Musca with long s at some occasions and MUSCA at others, MUSCA is
> > usually converted to Musca with short s. So all specific names should
> > correctly be combined with Musca with short s.
>
> This is a slightly separate issue (multiple spellings of the same genus
> name, and how they map to the species they are combined with).  The new
> GNUB data model (not yet implemented) deals with this by capturing
> separately the verbatim name-string, and the separate name components.  At
> the moment, this sort of issue is rare enough that it has not risen up the
> priority "to-do" list.  But it's definitely on the list.
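
To make the idea of storing the verbatim string separately from the name
components concrete, here is a small sketch. It is only my guess at the
general shape and is not the GNUB data model. The class NameUsage, its field
names, and the example strings are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NameUsage:
    """A name as printed, plus the components it is parsed into."""
    verbatim: str   # exactly as printed, e.g. with long s or all-caps
    genus: str      # normalized genus component
    epithet: str    # normalized species-group component


def group_by_components(usages):
    """Group verbatim strings that resolve to the same (genus, epithet)."""
    groups = {}
    for u in usages:
        groups.setdefault((u.genus, u.epithet), []).append(u.verbatim)
    return groups


# Illustrative strings only; not transcribed from the 1758 pages.
usages = [
    NameUsage("MUSCA domeſtica", "Musca", "domestica"),
    NameUsage("Muſca domeſtica", "Musca", "domestica"),
]
groups = group_by_components(usages)
```

Both spellings land under the single key ("Musca", "domestica"), so the
printed variants are preserved while the species is combined with one genus.
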
>
> > Also, the long s is not cited consistently. Example:
> >
> > http://zoobank.org/NomenclaturalActs/D2B4DA70-35AE-4D87-9E34-E219FC8E3DA0
> > Ostrea Puſio Linnæus, 1758 - here Pusio with long s and Ostrea with
> > short s, both had the long s in the original source.
>
> This is another example of the previous.  The genus was rendered as OSTREA
> on p. 696 (http://www.biodiversitylibrary.org/pagethumb/727611), so the
> genus is captured as such in the database (minus the all-caps).  I only see
> "Oſtrea" in the page header.  Is it rendered this way somewhere else?
>
> > - Consider presenting a field "original spelling" and another field
> > "correct spelling". This would probably reduce confusion. In the correct
> > spelling field the species would not appear capitalised, and diacritics
> > would be removed.
>
> Yes!  This is already part of the plan.  It just needs to rise up the
> priority list for implementation.
>
> > - I am confused by the statement "Fossil: No" in the ZooBank data result
> > set. Is this nomenclaturally relevant?
>
> It's not a Code-relevant issue, but it is a useful piece of information
> (just like type locality, figures, and page number).
>
> > Is there an exact definition for the term "fossil"? Since when does a
> > taxon need to be extinct for obtaining the attribute "fossil"?
>
> If you read the help section for this particular field (click on the blue
> icon when registering a new name, or editing an existing name), it explains
> it thusly:
>
> "If this new name is based on fossil material, select this checkbox.
> Otherwise, leave the checkbox unselected."
>
> In other words, it only applies to species-group names, and it is a
> specific indication of the nature of the name-bearing type material.
>  Technically, if the type specimen of Latimeria chalumnae had been a
> fossil, and then it was later discovered alive, this would be "Fossil:
> Yes".  However, I am not aware of any case where a name is established
> based on a fossilized type, and then later discovered (at the species
> level) to be extant.  Generally such cases are described as separate
> species.
>
> > Can we be sure that all molluscs and brachiopods named in the early
> > Linnean works were recent?
>
> Nope.  Neither can we be sure that all the page numbers are correct, or
> all the type localities are correct -- or any number of other things.  That
> doesn't mean the data field should be eliminated.  It just means we have to
> deal with cases that prove to be inaccurate (or unknown).
>
> > Would it not be better to remove the statement, to avoid running the risk
> > of giving incorrect information?
>
> I don't think so, but I'd be interested in hearing opinions from others on
> this.  As I already said, there is no such thing as a perfect database.
>  One of the things Rob Whitton constantly reminds me of is not to let the
> "perfect" be the enemy of the "good".  I tend to be a perfectionist on
> these sorts of things (as many database managers are).  But sometimes it's
> better to just get what you have out there, and then provide a
> crowd-sourcing mechanism to get it corrected.
>
> > Example:
> >
> > http://zoobank.org/NomenclaturalActs/04B5D5F4-648A-489F-ADE9-2C13971F8A69
> > Anomia Gryphus Linnæus, 1758. Here a fossil species was described, and in
> > Zoobank it was marked as "Fossil: No".
>
> Many thanks for the correction!  I have already implemented it on ZooBank
> (it took me 7 seconds to correct this -- but you did the hard part of
> finding the error, and made it extremely easy for me by providing the link).
>
> I want to thank you again for providing all of these VERY VALUABLE
> corrections to names in ZooBank.  I will study them in more detail (along
> with your other recent messages), and will likely come back to you with
> follow-up questions.
>
> Aloha,
> Rich
>



-- 
Dr. David Campbell
Assistant Professor, Geology
Department of Natural Sciences
Gardner-Webb University
Boiling Springs NC 28017


