[Taxacom] These discussions about GBIF
stephen_thorpe at yahoo.co.nz
Fri Aug 22 19:37:25 CDT 2014
I still maintain that GBIF was purpose built for a particular group of alien life forms, called 'crats, who simply want something official looking to cite as part of their jobs in local and national government ...
On Sat, 23/8/14, Bob Mesibov <mesibov at southcom.com.au> wrote:
Subject: [Taxacom] These discussions about GBIF
To: "Rod Page" <r.page at bio.gla.ac.uk>
Cc: "TAXACOM" <taxacom at mailman.nhm.ku.edu>
Received: Saturday, 23 August, 2014, 12:28 PM
...here and on Rod Page's iPhylo
blog, aren't getting very far, as usual. In fact, the best
summary of what I see as the key issues comes from Rod Page
in his recent iPhylo comments, to wit:
"Part of the problem, I suspect, is that while aggregators
may feel that a global database is by definition a good
thing, it's not at all obvious to everyone else."
"We don't make it easy to get data directly into GBIF, which
has a cumbersome, hierarchical data submission process, and
no mechanism for data citation. We should be thinking about
ways to make submission of expert-curated data a no-brainer
so that articles that use GBIF data do not end up becoming
articles about the quality of that data."
"GBIF doesn't have an easy mechanism for people to directly
contribute expert-curated data, nor does it provide a
mechanism for making such data citable (and hence give
contributors metrics on how the data they've contributed is
being used). I think part of the problem with the "moral"
argument for data sharing is that it also happens to benefit
the aggregator that says "it's your duty to share". The
benefits for those doing the sharing are less obvious, so
it's in the interests of the aggregators to ensure there are
real, tangible benefits to sharing."
So while on the one hand you have people like Stephen Thorpe
and the chameleon folk saying GBIF is pretty useless (but
see my Taxacom posts about Casual vs Skeptical Users), you
have others in the biodiversity informatics community
spruiking some new shiny API under development (see iPhylo)
that'll cure the malaise and make everyone happy to fix
everyone else's data. From the middle ground (e.g. GBIF
director Donald Hobern) we get platitudes.
The core failure of the aggregators since the start of
aggregation in the 1990s has been an unwillingness to
understand why and how people look for biodiversity data.
Digital compilations began much earlier as expert-driven
projects for expert uses. The ease of using, checking and
updating such compilations made them clearly superior to
anything on paper. Compilations like these have since gone
online as what I've been calling 'bottom-up' resources, and
what Hobern calls 'expert-managed silos'. ('Silos' because
users can't directly contribute.) Their numbers continue to
increase. They're used by the same customers who looked for
authoritative paper sources in the 1980s.
The aggregators' mistake was to try to scale this up to
include all biodiversity and to offer single-portal Web
interfaces for searching, querying and analysing all
biodiversity data. Yes, it *could* be done, but for whom and
for what purposes? Has any aggregator ever done any
marketing research? It's a new product, its development
costs millions and you just throw it into the marketplace
and hope someone buys it, because *you* think it's a good
idea? And you're disappointed that everyone isn't dropping
whatever else they're doing to make it bigger and shinier?
The customers still want information about specifics:
particular taxa and particular places. They want some
assurance that the information is correct and up to date.
Their best search strategy is to look on the Web to see
what's available. If the choice is between an expert-driven
project for expert uses whose builders can be directly
contacted, and an aggregator with less data, lower data
quality and (after what? 10 years?) no effective feedback to
compilers — which is the better choice?
GBIF is just another silo. It's bigger than any other but
its size has been achieved at the cost of data quality, and
it's still a long, long way from complete and up to date.
Tinkering around the edges with new APIs doesn't fix the
core problem, any more than spending big bucks on
advertising will sell a dud product. Expert-driven projects
for expert uses from contactable experts will be around for
as long as the Web makes their distribution cheap and easy.
To work effectively (no platitudes, please) towards its
daydream, GBIF needs to spend many more millions, and will
still wind up with mostly unvetted data, just as EoL will
wind up with mostly empty pages.
If half the money that GBIF has gobbled up had gone
to data providers to employ data curators or to develop data
curation programs, GBIF would be a useful source for
information not available from expert-managed silos. It
didn't, and it won't. Why am I not surprised?
Dr Robert Mesibov
Honorary Research Associate
Queen Victoria Museum and Art Gallery, and
School of Land and Food, University of Tasmania
PO Box 101, Penguin, Tasmania, Australia 7316
(03) 64371195; 61 3 64371195
Taxacom Mailing List
Taxacom at mailman.nhm.ku.edu
The Taxacom Archive back to 1992 may be searched at: http://taxacom.markmail.org
Celebrating 27 years of Taxacom in 2014.