[Taxacom] ZooBank Progress
pleuronaia at gmail.com
Mon Apr 29 09:49:06 CDT 2013
Although it's generally rare, there are a number of extant mollusk species
originally described from fossils, e.g. Crassinella dupliniana, Chama
congregata (if the living form really is the same).
On Sun, Apr 28, 2013 at 3:08 PM, Richard Pyle <deepreef at bishopmuseum.org> wrote:
> > If it's done manually it might be worth to correct some other names that
> > were linked by ZooBank to the Linnean 1758 work.
> Yes, exactly! Part of the process of cross-linking is to compare
> discrepancies. We've found that in most datasets that we cross-link
> against, there is a relatively small fraction of discrepancies. For
> example, out of 50,000 names, there might be only a few hundred
> discrepancies -- usually involving the date of publication, correct
> authorship, or the exact orthography of the name. This means that it's a
> very manageable task to investigate each one of the discrepancies. Also,
> I've found that no database is perfect. Some are better than others, to be
> sure -- but one can never assume that one database is always correct --
> which means that it's important to examine the discrepancies individually.
> Indeed, this is one of the main reasons why we wanted to establish this
> link with BHL -- to make it easier to resolve discrepancies.
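
A minimal sketch of the discrepancy check described above, assuming two datasets already keyed by a shared identifier. The record layout, the `find_discrepancies` helper, and the sample rows are all hypothetical and not taken from ZooBank or any other database; the placeholder names ("Aus bus" and so on) are not real taxa.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class NameRecord:
    name: str    # spelling of the name as held in the dataset
    author: str  # authorship as held in the dataset
    year: int    # year of publication as held in the dataset


def find_discrepancies(ours, theirs):
    """Return {shared key: [names of fields that differ]} for two datasets."""
    report = {}
    for key in sorted(ours.keys() & theirs.keys()):
        differing = [
            f.name for f in fields(NameRecord)
            if getattr(ours[key], f.name) != getattr(theirs[key], f.name)
        ]
        if differing:
            report[key] = differing
    return report


# Invented sample rows; "Aus bus" style names are placeholders, not real taxa.
ours = {
    "id-1": NameRecord("Aus bus", "Smith", 1900),
    "id-2": NameRecord("Aus cus", "Smith", 1901),
    "id-3": NameRecord("Aus dus", "Jones", 1902),
}
theirs = {
    "id-1": NameRecord("Aus bus", "Smith", 1900),
    "id-2": NameRecord("Aus cus", "Smith", 1910),   # date differs
    "id-3": NameRecord("Aus duss", "Jonas", 1902),  # spelling and author differ
}

for key, differing in find_discrepancies(ours, theirs).items():
    print(key, "->", ", ".join(differing))
```

The report only flags which fields disagree. Deciding which dataset is right still means looking at each case individually, as the paragraph above says.
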
> > My understanding is that ZooBank is a data resource where available names
> > are contained. Unavailable names should probably not be contained at all,
> > and if yes, they should clearly be marked as such.
> > I am not sure how names should be treated which were initially made
> > available and later suppressed.
> This is an issue that has been debated since ZooBank was first conceived.
> In 2008, at a Commissioners meeting in Paris, it was determined that
> ZooBank *would* include unavailable names, and that those names would be
> clearly marked not only as unavailable, but also give the reason(s) why the
> name is unavailable. We already have a very robust data model to deal with
> this (which I'd be happy to describe, if anyone is interested). But as
> with most aspects of ZooBank development, the tricky part is how to
> implement it (devil is always in the details). One of the things in the
> works is a policy on data verification in ZooBank. Right now, the focus is
> on building the core infrastructure of ZooBank, populating it with
> retrospective content, and building tools to streamline the capture of
> prospective content. However, what people *really* want from ZooBank is a
> definitive declaration of whether or not any particular name is available
> under the Code. This is the entire process of content verification. So
> far, we have focused only on registration (these are two very different things).
> > Example:
> > Acarus telarius Linnæus, 1758 - this name should somehow be marked as
> > suppressed (ICZN Op. 968).
> > There were many other such names established in the 1758 work, which
> > were totally or partly suppressed by the Commission.
> Yes, indeed! In fact, one of the projects we've been working on (with
> LARGE thanks to Charles Hussey, and also to Rod Page who defined the
> article boundaries of historical BZN volumes in BHL) is a complete database
> of Opinions. This is effectively complete (still needs some verification,
> though), and will be one of the new features added to ZooBank this summer.
> But again, we need to sort out exactly how this sort of thing will be
> implemented on the ZooBank website, and what the policy is for editing
> these sorts of things, etc.
> Many thanks for pointing out the individual issues related to Linnaeus
> names. This sort of thing is EXTREMELY helpful! I will definitely use
> these as test cases when we implement the next set of features involving
> ZooBank record verification/validation. But again, it probably can't be
> implemented until later this summer (northern hemisphere summer, that is).
> > Maybe some other systematic things could be fixed.
> > - Remove the long s throughout the original spellings, and replace it at
> > instances by the short s.
> In this case, we want to maintain the precise orthography as it originally
> appeared on the printed page -- in all respects. Basically, if a UTF-8
> character exists for a particular glyph, we want to capture it as such.
> The main exceptions are that all-caps words are not faithfully captured as
> such, and other stylistic attributes (e.g., boldface, small-caps, when
> original names were not italicized, etc.) will not be captured. But
> characters such as the long s and diphthong "æ" will be captured as
> originally printed on the page.
> The next step is to build the correct algorithm to transform these things,
> so that the Code-corrected "original spelling" can be generated
> automatically. In most cases, this is easy to do -- but there are some
> tricky ones (e.g., see Art. 126.96.36.199. -- which would require us to know
> whether the root word is German or not; or some of Art. 188.8.131.52.). This
> is one more example of features currently in the works, that will be
> introduced over time as they rise up the priority list, and as appropriate
> policies are drafted and ratified.
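
A rough sketch of the easy part of such a transformation, assuming only these simplifications: long s becomes short s, the ligatures æ and œ are expanded, remaining diacritics are dropped, the genus gets an initial capital, and the species epithet is lowercased. The function names are hypothetical. The sketch does not handle the German-root case mentioned above (it strips an umlaut instead of inserting an "e"), so it should not be read as a full implementation of the Code's rules.

```python
import unicodedata

# Glyphs that Unicode decomposition will not split on its own.
GLYPH_MAP = {"ſ": "s", "æ": "ae", "Æ": "Ae", "œ": "oe", "Œ": "Oe"}


def simplify_word(word):
    """Expand long s and ligatures, then strip combining diacritics."""
    for glyph, replacement in GLYPH_MAP.items():
        word = word.replace(glyph, replacement)
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def corrected_binomen(verbatim):
    """Turn a verbatim 'Genus epithet' string into a simplified spelling."""
    genus, epithet = verbatim.split()
    genus = simplify_word(genus)
    epithet = simplify_word(epithet)
    # Genus: initial capital only (also undoes all-caps); epithet: lowercase.
    return f"{genus[0].upper()}{genus[1:].lower()} {epithet.lower()}"


print(corrected_binomen("Ostrea Puſio"))
print(simplify_word("Linnæus"))
```

Keeping the verbatim string untouched and deriving the simplified form from it means the rules can be refined later without losing what was on the printed page.
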
> > Example Musca Linnæus, 1758, this name was
> > spelled Musca with long s at some occasions and MUSCA at others, MUSCA is
> > usually converted to Musca with short s. So all specific names should
> > correctly be combined with Musca with short s.
> This is a slightly separate issue (multiple spellings of the same genus
> name, and how they map to the species they are combined with). The new
> GNUB data model (not yet implemented) deals with this by capturing
> separately the verbatim name-string, and the separate name components. At
> the moment, this sort of issue is rare enough that it has not risen up the
> priority "to-do" list. But it's definitely on the list.
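
One way to picture the separation described above, as a sketch only: each usage keeps the verbatim string and, separately, the parsed components. The `NameUsage` class and `group_by_genus` helper are hypothetical and are not the GNUB schema, and the verbatim strings are illustrative renderings rather than transcriptions checked against the 1758 pages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NameUsage:
    verbatim: str  # name-string exactly as captured from the page
    genus: str     # parsed genus component, in a single agreed spelling
    epithet: str   # parsed species epithet component


def group_by_genus(usages):
    """Map each parsed genus to the verbatim strings that were combined with it."""
    groups = {}
    for usage in usages:
        groups.setdefault(usage.genus, []).append(usage.verbatim)
    return groups


# Illustrative renderings only: two spellings of the genus on the page,
# one parsed genus component in the data.
usages = [
    NameUsage("MUSCA domeſtica", "Musca", "domestica"),
    NameUsage("Muſca carnaria", "Musca", "carnaria"),
]

print(group_by_genus(usages))
```

With the components stored separately, both species link to the same genus however the genus was printed alongside them.
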
> > Also, the long s is not cited consistently. Example:
> > Ostrea Puſio Linnæus, 1758 - here Pusio with long s and Ostrea with
> > short s, both had the long s in the original source.
> This is another example of the previous. The genus was rendered as OSTREA
> on p. 696 (http://www.biodiversitylibrary.org/pagethumb/727611), so the
> genus is captured as such in the database (minus the all-caps). I only see
> "Oſtrea" in the page header. Is it rendered this way somewhere else?
> > - Consider presenting a field "original spelling" and another field
> > "correct spelling". This would probably reduce confusion. In the correct spelling
> > the species would not appear capitalised, and diacritics would be
> Yes! This is already part of the plan. It just needs to rise up the
> priority list for implementation.
> > - I am confused by the statement "Fossil: No" in the ZooBank data result
> > Is this nomenclaturally relevant?
> It's not a Code-relevant issue, but it is a useful piece of information
> (just like type locality, figures, and page number).
> > Is there an exact definition for the term "fossil"? Since when does a
> > need to be extinct for obtaining the attribute "fossil"?
> If you read the help section for this particular field (click on the blue
> icon when registering a new name, or editing an existing name), it explains
> it thusly:
> "If this new name is based on fossil material, select this checkbox.
> Otherwise, leave the checkbox unselected."
> In other words, it only applies to species-group names, and it is a
> specific indication of the nature of the name-bearing type material.
> Technically, if the type specimen of Latimeria chalumnae had been a
> fossil, and then it was later discovered alive, this would be "Fossil:
> Yes". However, I am not aware of any case where a name is established
> based on a fossilized type, and then later discovered (at the species
> level) to be extant. Generally such cases are described as separate
> > Can we be sure that all molluscs and brachiopods named in the early
> > works were recent?
> Nope. Neither can we be sure that all the page numbers are correct, or
> all the type localities are correct -- or any number of other things. That
> doesn't mean the data field should be eliminated. It just means we have to
> deal with cases that prove to be inaccurate (or unknown).
> > Would it not be better to remove the statement, to avoid running the
> > risk to give an incorrect information?
> I don't think so, but I'd be interested in hearing opinions from others on
> this. As I already said, there is no such thing as a perfect database.
> One of the things Rob Whitton constantly reminds me of is not to let the
> "perfect" be the enemy of the "good". I tend to be a perfectionist on
> these sorts of things (as many database managers are). But sometimes it's
> better to just get what you have out there, and then provide a
> crowd-sourcing mechanism to get it corrected.
> > Example:
> > Anomia Gryphus Linnæus, 1758. Here a fossil species was described, and in
> > Zoobank it was marked as "Fossil: No".
> Many thanks for the correction! I have already implemented it on ZooBank
> (it took me 7 seconds to correct this -- but you did the hard part of
> finding the error, and made it extremely easy for me by providing the link).
> I want to thank you again for providing all of these VERY VALUABLE
> corrections to names in ZooBank. I will study them in more detail (along
> with your other recent messages), and will likely come back to you with
> follow-up questions.
> Taxacom Mailing List
> Taxacom at mailman.nhm.ku.edu
> The Taxacom Archive back to 1992 may be searched with either of these
> (1) by visiting http://taxacom.markmail.org
> (2) a Google search specified as: site:mailman.nhm.ku.edu/pipermail/taxacom your search terms here
> Celebrating 26 years of Taxacom in 2013.
Dr. David Campbell
Assistant Professor, Geology
Department of Natural Sciences
Boiling Springs NC 28017