[Taxacom] ZooBank Progress
deepreef at bishopmuseum.org
Sat Apr 27 04:45:49 CDT 2013
Well, for one thing, we would have created the link to the book back in late
2007, as we were preparing to launch ZooBank (the original ZooBank included
these links, and also included all names within Sys. Nat 10th). I notice
that the record we link to (TitleID=542) was created in BHL on May 4, 2006
(and the pages scanned on September 11, 2007); whereas the one you're
linking to (TitleID=35518) was created on November 15, 2009. Hence, that
one would not have been available at the time we first established the link.
Also, I notice the one you link to appears to be a reprint, published in 1894.
I would rather link to the original, and then bring these sorts of
page-offset issues to BHL's attention and have them fix the page indexing (I
just now reported it to them via their feedback mechanism). The good news is
that, except for two duplicated pages (576 and 577), all the PageIDs should
be persistent, so by linking to the correct PageID, the right page number
metadata will be applied eventually, without me needing to change the link
in ZooBank. In other words, by linking to
http://biodiversitylibrary.org/page/727530, I'm sure to get the right page
(p. 615) in perpetuity (both now, when it's indexed as p. 617, and later,
when it's corrected to p. 615).
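To make the PageID point concrete, here is a minimal Python sketch. It
assumes the link is built from the PageID alone, using the URL pattern shown
above. The PageRecord class, its field names, and page_url are hypothetical
names used only for illustration; they are not part of BHL or ZooBank.

```python
from dataclasses import dataclass

# URL pattern taken from the link quoted above.
BHL_PAGE_BASE = "http://biodiversitylibrary.org/page/"


@dataclass
class PageRecord:
    """Hypothetical stand-in for one scanned page image."""
    page_id: int        # persistent identifier of the page image
    indexed_label: str  # page-number metadata, which may be corrected later


def page_url(record: PageRecord) -> str:
    # Only the PageID goes into the link; the page-number label is ignored.
    return f"{BHL_PAGE_BASE}{record.page_id}"


page_615 = PageRecord(page_id=727530, indexed_label="617")
link_before = page_url(page_615)

# Simulate the page indexing being corrected; the link stays the same.
page_615.indexed_label = "615"
link_after = page_url(page_615)
```

Because the label never enters the URL, a later correction of the page
indexing needs no change on the linking side.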
I agree with you that the services BHL provides are awesome, and that these
sorts of issues are downright trivial compared to the amazing (and amazingly
accurate) resource they provide. One of the things we discussed in Hawaii
last month was the distinction between "physical" items and "conceptual"
things in BHL. In some cases, if a book is scanned by two different
libraries in the BHL network, there will be two separate TitleIDs assigned
to the same conceptual object. Obviously, it's important to keep each
scanned copy (especially if there are marginalia or other distinguishing
aspects of a particular scanned copy); but it's also useful, I think, to
collapse these as separate "Items" within the same Title.
But this is a relatively rare problem (i.e., an "edge case"). In the vast
majority of cases, the records are correct.
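As a rough illustration of the "physical" vs. "conceptual" distinction above,
the Python sketch below groups scanned copies under the conceptual work they
represent, while keeping every copy. All names and identifiers here (Scan,
collapse_by_work, the ID numbers, the work keys, the library names) are
invented for the example and do not reflect BHL's actual data model.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Scan:
    """Hypothetical record for one physical scanned copy."""
    scan_id: int   # made-up identifier for this scanned copy
    work_key: str  # made-up key for the conceptual work
    library: str   # contributing library (made up)


def collapse_by_work(scans: List[Scan]) -> Dict[str, List[Scan]]:
    # Keep every scanned copy, but list it as an item under its conceptual work.
    grouped: Dict[str, List[Scan]] = defaultdict(list)
    for scan in scans:
        grouped[scan.work_key].append(scan)
    return dict(grouped)


scans = [
    Scan(scan_id=1001, work_key="work-A", library="Library X"),
    Scan(scan_id=1002, work_key="work-A", library="Library Y"),
    Scan(scan_id=1003, work_key="work-B", library="Library X"),
]
items_by_work = collapse_by_work(scans)
```

Here the two copies of "work-A" remain distinct items, so copy-specific
features such as marginalia are not lost by the grouping.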
From: Roderic Page [mailto:r.page at bio.gla.ac.uk]
Sent: Friday, April 26, 2013 10:42 PM
Cc: Richard Pyle
Subject: Re: [Taxacom] ZooBank Progress
Any reason ZooBank chose that particular copy of "Systema Naturae" to link
to?
One of the joys/pitfalls of BHL (which is an awesome resource), is the
presence of multiple copies of the same thing, coupled with occasional
problems such as page numbers being out of sync with images, or (my personal
favourite) the page scans and OCR text being out of sync with each other
(i.e., the page image and OCR text are for different pages).
There are other copies of Systema Naturae that don't seem to have this
problem, e.g. http://biodiversitylibrary.org/page/25034426 (I haven't
checked all the pages in this scan).
On 27 Apr 2013, at 05:58, Richard Pyle wrote:
Between actually working on developing ZooBank, and writing EPICALLY long
emails to Taxacom, I've neglected to keep this community abreast of progress
that's been happening with ZooBank.
To see what's been going on with ZooBank since the launch of the "new &
improved" version last September, you can take a look at three progress
reports that are available online:
First Month after launch:
Fourth Quarter, 2012:
First Quarter, 2013:
(All of these reports, and all future reports, are linked from the bottom of
the "About" page in ZooBank: http://zoobank.org/about -- and will also be
added to the ICZN site.)
As I hope is clear from these reports, the pace of ZooBank improvements is
accelerating (mostly due to Rob Whitton), and I hope (and believe) the
second-quarter report for this year will be even more full of new features
than the first-quarter report is.
Along those lines, I wanted to report on two recent (i.e., since the last
quarterly report) developments in ZooBank that may be of particular interest
to Taxacom folks (both of which have some relevance to recent threads on
data quality assurance and "crowd-sourcing" of content).
First, earlier today we launched a new feature that links ZooBank with the
Biodiversity Heritage Library. This is the first of several key features
that emerged from a workshop between representatives of GNA/BHL/IPNI/IF in
Hawaii last month.
Whenever possible, on a ZooBank page for a new species name, we now include
a link to the BHL page. Currently, we have this feature working reasonably
well for about 38,000 name records in ZooBank, and we hope to triple that
within the next month or so.
To see an example, find a name in ZooBank that was published in an older
journal likely to be in BHL (we've started linking Journal articles -- but
only a few books so far). Linnaeus, 1758 (Systema Naturae) is one of the
books we have linked, so search for your favorite name in Linnaeus to see
the new feature. Here is an example:
But here's the thing: We link most of these pages using the BHL OpenURL
service -- which is very good, but not always perfect. Thus, we've added a
new feature which we plan to expand on ZooBank over time. The feature is
signaled by a new icon with "Contribute" added to the ZooBank logo (you'll
see it above the BHL page image in the link above). This icon means "you
can help us by clicking here". If you point your mouse at the icon, you'll
see a specific message for how you can help (note: you must be logged into
your ZooBank account to provide this sort of help). In this particular case,
you can help by confirming if the OpenURL has found the correct page. Click
on the "Contribute" icon or the page image thumbnail to see a form that lets
you jump around to different pages. Once you can confirm that you're looking
at the correct page, there's a button you can click to create a direct link
to that page, after which we no longer rely on the OpenURL service, but
instead link directly to that page. If you want to go to the BHL site for
the page, click on the BHL logo below the page image thumbnail.
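The confirm-then-link-directly behaviour described above can be sketched in
Python as follows. This is only an illustration of the logic, not ZooBank's
implementation: NameRecord, bhl_link, and confirm_page are hypothetical
names, the OpenURL link is a placeholder string rather than a real query,
and the direct URL pattern and PageID are the ones quoted earlier in this
thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NameRecord:
    """Hypothetical stand-in for a name record with BHL link data."""
    name: str
    openurl_link: str                        # best-guess link from an OpenURL lookup
    confirmed_page_id: Optional[int] = None  # set once a person confirms the page


def bhl_link(record: NameRecord) -> str:
    # Prefer the human-confirmed direct page link; otherwise fall back to OpenURL.
    if record.confirmed_page_id is not None:
        return f"http://biodiversitylibrary.org/page/{record.confirmed_page_id}"
    return record.openurl_link


def confirm_page(record: NameRecord, page_id: int) -> None:
    # Called when a logged-in contributor confirms which page is correct.
    record.confirmed_page_id = page_id


record = NameRecord(name="Acarus", openurl_link="<OpenURL query placeholder>")
unconfirmed = bhl_link(record)   # still relies on the OpenURL guess
confirm_page(record, 727530)
confirmed = bhl_link(record)     # now a direct link to the page identifier
```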
One of the reasons we need to crowd-source this is that there are
imperfections in the way that BHL assigns page numbers to things. In the
example of Linnaeus 1758, if you view pages prior to page 578, you'll get
the correct page image. But starting at page 578, you'll see there is a
2-page offset (pages 576 and 577 are repeated as pages 578 and 579). The
example in the link above (genus Acarus) is on pg. 615, so you actually get
to the image of page 613 using the OpenURL service. What we want to do is
get a human to confirm the link as the real page 615 (which is indexed in
BHL as 617), and that the name in question really does appear on that page.
This process allows us to bypass the "page" as a number, and establish a
direct link to the BHL page identifier (which is persistent for the page
image, not for the page number). Note: this is NOT a "dis" against BHL.
It's actually AMAZING how accurate their OpenURL links are in the vast
majority of cases. However, we're striving for perfection on the
cross-linking, which is why we use the OpenURL service to get you to what is
likely the right page, but then give you, the user, the opportunity to
"crowd-source" confirm or correct the specific page link.
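The page arithmetic in the Linnaeus 1758 example can be written out as a
small Python sketch. It assumes the only indexing problem in this scan is the
one described above (pages 576 and 577 each appearing twice); the function
and constant names are made up for the illustration.

```python
FIRST_SHIFTED = 578  # first page number affected, per the description above
OFFSET = 2           # two duplicated page images (576 and 577)


def printed_page_shown(bhl_label: int) -> int:
    """Printed page actually visible on the image BHL indexes as bhl_label."""
    if bhl_label < FIRST_SHIFTED:
        return bhl_label
    return bhl_label - OFFSET


def bhl_label_for(printed_page: int) -> int:
    """BHL index number carrying the given printed page.

    For the duplicated pages 576 and 577 this returns the first copy.
    """
    if printed_page < FIRST_SHIFTED:
        return printed_page
    return printed_page + OFFSET


shown_for_615 = printed_page_shown(615)  # what a lookup for "page 615" displays
label_for_615 = bhl_label_for(615)       # where the real page 615 is indexed
```

With these assumptions, a lookup for page 615 lands on printed page 613, and
the real page 615 sits under index number 617, matching the example above.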
This is all in keeping with a basic underlying theme of the Global Names
Architecture to build cross-links between different databases, and to
harvest the passion of the broader community to crowd-source this sort of
thing. In this case, we're building a cross-link between a ZooBank record
and a BHL page. The idea is to build many, many, many more of these kinds
of links using GNA services, which we hope will be used by many, many
different websites and web-based biodiversity databases.
The other new feature in ZooBank I wanted to mention to this group relates
to Journal editors and publishers. About a week ago I sent an email to a
list of persons associated with journal publishers who have expressed
interest in access to better features on ZooBank to allow publishers to
create and maintain ZooBank registrations associated with the journals that
they edit/publish. There are two different sets of features we're working
on: one available through the existing ZooBank website, and another
involving automatic submission of content to ZooBank via XML documents
(rather than manually through the web site). The latter is still in
development, but the former is now available.
I wanted to pass this on to Taxacom so that anyone associated with a journal
publisher (as a publication staff member, or as an Editor) who did not
receive my email a week ago, and who is interested in these new features for
publishers/editors of journals, knows about the new services and can contact
me offline for more details. Some of these new
features apply to all ZooBank Contributors; to see them, simply login to
ZooBank and click on your name at the top of the ZooBank web page banner.
I hope to be able to provide similar reports on new features in ZooBank over
the coming months.
Richard L. Pyle, PhD
Database Coordinator for Natural Sciences
Associate Zoologist in Ichthyology
Dive Safety Officer
Department of Natural Sciences, Bishop Museum
1525 Bernice St., Honolulu, HI 96817
Ph: (808)848-4115, Fax: (808)847-8252
email: deepreef at bishopmuseum.org
Taxacom Mailing List
Taxacom at mailman.nhm.ku.edu
The Taxacom Archive back to 1992 may be searched with either of these
methods:
(1) by visiting http://taxacom.markmail.org
(2) a Google search specified as: site:mailman.nhm.ku.edu/pipermail/taxacom
your search terms here
Celebrating 26 years of Taxacom in 2013.
Professor of Taxonomy
Institute of Biodiversity, Animal Health and Comparative Medicine
College of Medical, Veterinary and Life Sciences
Graham Kerr Building
University of Glasgow
Glasgow G12 8QQ, UK
Email: r.page at bio.gla.ac.uk
Tel: +44 141 330 4778
Fax: +44 141 330 2792
Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
ORCID id: http://orcid.org/0000-0002-7101-9767