[Taxacom] Occurrence data...
stephen_thorpe at yahoo.co.nz
Fri Feb 18 17:00:16 CST 2011
and for a *really radical* suggestion from me: why not let taxonomists work out
the occurrence data and publish it with the relevant taxonomic revision? That
way, we get data attached to robustly revised taxa, and the taxonomist can
discover any likely mislabellings, etc., by the way that they stand out as
outliers relative to the aforementioned robust taxonomic revision.
perhaps, though, what we want is all the occurrence data for all taxa (revised
and unrevised, mislabelled or not) in one giant "Christmas present" ... yeah,
From: Bob Mesibov <mesibov at southcom.com.au>
To: L Penev <lyubo.penev at gmail.com>
Cc: TAXACOM <taxacom at mailman.nhm.ku.edu>
Sent: Sat, 19 February, 2011 11:42:51 AM
Subject: Re: [Taxacom] Occurrence data...
It sounds like your response to my comment
"A barrier to be overcome if DCAs are to appear more often in publications is
that most data creators are either unfamiliar with the TDWG scheme for
classifying and formatting data items, or are unwilling to spend time working
out how their own preferred data fields relate to that scheme."
is:
"Naturally, we are aware that at the present stage DwC-A would in many cases
need some support from experienced data managers to be properly implemented. It
will take some time. On the other side, the future comes often faster than
anyone would expect. Data managers become quickly wanted job positions even in
not that large taxonomic institutions. Individual taxonomist will be facilitated
by tools to export their datasets in DwC-A or in another interoperable formats."
But this avoids the questions: is it necessary? is it even desirable? ZooKeys
already semantically marks up the text and assigns the all-important LSIDs. You
are now encouraging authors to go to the next stage, and structure their raw
occurrence and nomenclatural data. How long will it be before you ask authors to
digitally map their images, so that some aggregator ('Encyclopedia of
Morphology' project) can pull up all the hind-leg tarsus image-elements in the
digitised insect literature?
I am concerned that what is happening is flawed at two levels. First and
foremost, there is a legacy feeling from the days of libraries, when you could
create a single authoritative index and it would sit on a shelf in the Reference
section, and it was the first place you went as an introduction to a topic. You
can still find such things on the Web: lists of links, generally way out of
date. There is far too much information on the Web to make this viable: there
are too many data quality issues, and updating is haphazard. The alternative is
to let software find things for you - the Rod Page approach - so that there are
as many indexes and compendia as there are occasions on which someone goes
data-hunting. And to link (or allow software to link for you) and link again,
until you have a densely interconnected network of data sources to facilitate
The second level is that even today, 20 years into the new age, promoters of
Gigantic All-Encompassing Biodiversity Databases (and indexes, Rich) still have
no clear idea who wants the information and for what purposes. If I ask that
question I sometimes get the sincere but vacuous answer that we don't know and
it isn't important; the important thing is to have the data ready when someone,
somewhere, wants it for some purpose. I can't think of any other major human
enterprise that tolerates such vagueness in its aims.
The many bottom-up biodiversity databases on the Web typically have an audience
in mind, namely the specialists who contribute to their creation, and who are
the primary users of the data. They've been structured for those users, built
with careful attention to detail, and can be 'handed down' from volunteer
specialist to volunteer specialist, with some confidence that the same general
aims and devotion will also be handed down. I don't think you could say that for
any of the aggregation projects.
I see these bottom-up resources as high-use nodes in the future networks of
linked biodiversity data. Their contents don't need to be aggregated, indexed,
repackaged or otherwise fooled with. They can be accessed directly in an
anarchic, unstructured Web. Like Pete DeVries, I don't see any good reason why
the same can't be true for raw data. If raw data is made available this way, as
in ZooKeys supplements, I'd prefer it *wasn't* marked up, so that I - as *user*,
not aggregator - can pass an eye over it a la Chris Thompson.
Rich Pyle wrote (as I was writing the above):
"Criticize aggregators all you want, but one thing that they certainly *can*
help with is in eliminating a lot of redundant effort."
Effort by whom? For what purpose? Do you really expect or want to have the
background on every RCL Perkins collection in Hawaii and every other collector
in every other place on Earth in another gigantic index-on-the-shelf? With no
errors? How about just putting on the Web the individual results of careful
scholarship and allowing *users* to find them through linking? Isn't the aim to
connect user with datum, not to keep programmers and data managers employed?
Dr Robert Mesibov
Honorary Research Associate
Queen Victoria Museum and Art Gallery, and
School of Zoology, University of Tasmania
Home contact: PO Box 101, Penguin, Tasmania, Australia 7316
Ph: (03) 64371195; 61 3 64371195