[Taxacom] Chameleons, GBIF, and the Red List
franciscojcabezas at gmail.com
Tue Aug 26 04:01:45 CDT 2014
On Monday, 25 August 2014, Richard Pyle <deepreef at bishopmuseum.org> wrote:
> What you describe is EXACTLY what GBIF and others in the biodiversity
> informatics world are hoping to achieve.
> My previous post (reply to Stephen) covered these parts of the process:
> 1) Data exist in thousands of databases around the world
> 2) Aggregators like GBIF make our lives MUCH easier in helping us to
> discover those data
> 3) We, the experts of the world, spend hours "cleaning" data after GBIF
> has so helpfully allowed us to locate it.
> What you're talking about is the next step:
> 4) After we, the experts of the world, have spent hours "cleaning" the
> data, how do we allow those efforts to propagate back to the sources, so
> that the NEXT person who encounters those records through GBIF can benefit
> from the toils of us experts?
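The four steps above amount to a round trip for an expert's correction. A minimal sketch of what such a correction might carry so it can travel back to a source database; all class and field names here are illustrative, not any actual GBIF or FilteredPush schema:

```python
# Sketch of step 4: packaging an expert's correction so it can travel
# back to the source record. Field names are illustrative only.
from dataclasses import dataclass

@dataclass
class Correction:
    record_id: str      # identifier of the occurrence record being corrected
    field: str          # which field the expert is correcting
    old_value: str      # value as served by the aggregator
    new_value: str      # expert's proposed replacement
    submitted_by: str   # who made the correction (for later review)

def apply_correction(record: dict, c: Correction) -> dict:
    """Apply a correction only if the record still holds the old value,
    so a stale correction never clobbers a newer source edit."""
    if record.get(c.field) == c.old_value:
        record = {**record, c.field: c.new_value}
    return record

record = {"id": "occ-1", "scientificName": "Chamaeleo sp."}
fix = Correction("occ-1", "scientificName", "Chamaeleo sp.",
                 "Chamaeleo calyptratus", "expert at example.org")
cleaned = apply_correction(record, fix)
```

The old-value check is the important design choice: it makes the correction safe to replay, which matters once corrections queue up at a source database for months.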
> There are two basic roadblocks to achieving this final step.
> First, as has been made ABUNDANTLY clear in this thread, the data do NOT
> belong to GBIF. They belong to the hundreds (thousands?) of institutions
> around the world that manage those thousands of databases. Ultimately,
> those corrections have to find their way back to the source databases, so
> that GBIF can re-index them with the corrections included. And believe me,
> GBIF and others have tried to do this EXTENSIVELY -- for many years. A lot
> of the mechanisms are being developed (e.g., FilteredPush), but so far
> there has been slow adoption of those mechanisms by the thousands of source
> databases. There are many reasons for this, but I suspect the main reason
> is that institutions are barely keeping up day-to-day activities with
> ever-shrinking budgets, and simply do not have time or IT expertise to
> implement the corrections to the datasets that they manage. Thus, because
> the source data remain "unclean", the aggregated data in GBIF remain
> "unclean" as well.
> The second major roadblock is the lack of "proper" identifiers (globally
> unique, persistent, actionable) for these occurrence records. The only way
> that corrections you make in your downloaded copy of GBIF data can be of
> use to anyone else is if you can report back on exactly which records need
> cleaning (along with the
> corrected information). GBIF does assign its own locally unique identifier
> (integer), which could be used for this purpose -- but only for piping the
> data back to GBIF. GBIF can relay the corrections back to the source
> databases, but that will only be helpful to the rest of us if the source
> incorporates the fixes.
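As a hedged illustration of why that identifier matters: GBIF occurrence downloads include a `gbifID` column, and a corrections report keyed on it might look like the following (the invalid country code and the proposed fix are invented for the example, not real data):

```python
# Sketch: keying corrections to GBIF's locally unique integer identifier.
# Assumes a tab-delimited occurrence download with a 'gbifID' column;
# the bad value "ZZ" and the proposed fix are illustrative.
import csv
import io

download = io.StringIO(
    "gbifID\tscientificName\tcountryCode\n"
    "101\tChamaeleo dilepis\tZZ\n"
    "102\tChamaeleo dilepis\tTZ\n"
)

corrections = []
for row in csv.DictReader(download, delimiter="\t"):
    if row["countryCode"] == "ZZ":          # placeholder/invalid code
        corrections.append({"gbifID": row["gbifID"],
                            "field": "countryCode",
                            "proposed": "TZ"})
```

Because `gbifID` is only meaningful to GBIF, a report like this can be piped back to GBIF, but GBIF must then map it onto whatever identifiers the source database uses, which is exactly the relay problem described above.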
> There is actually a third roadblock, which has the potential to become a
> major roadblock, but we haven't bumped into it yet so much because we still
> can't get past the first two roadblocks. And that is, institutions will not
> automatically assume that every "correction" that is sent to them is
> actually "correct". Managers of those data will in almost all cases want
> to review the changes to ensure that they are appropriate for updating in
> the source database. And this process, of course, requires time and
> resources that most institutions simply do not have.
> There may be another solution, however, which is for GBIF to cache
> corrections submitted by people like you and other experts, such that these
> annotations/corrections can be made visible to all users of GBIF data; not
> just the source datasets. Perhaps this feature already exists. Perhaps
> the politics of implementing such a feature are too daunting to overcome.
> But the bottom line is that we really do need to address this fourth step,
> so that we can more effectively benefit from the work of others, and
> (conversely), so that our own efforts will benefit more than just ourselves.
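One way to picture that caching idea: the aggregator leaves the source record untouched and overlays expert annotations at display time. A minimal sketch, assuming an in-memory store; this is not an existing GBIF feature:

```python
# Sketch of the proposed annotation cache: the aggregator keeps the
# source record untouched but overlays expert annotations for all users.
annotations = {}   # record_id -> list of {field, value, by}

def annotate(record_id, field, value, by):
    """Cache an expert's correction without touching the source record."""
    annotations.setdefault(record_id, []).append(
        {"field": field, "value": value, "by": by})

def view(record_id, source_record):
    """What a user would see: source data plus cached corrections,
    with the latest annotation per field winning."""
    merged = dict(source_record)
    for a in annotations.get(record_id, []):
        merged[a["field"]] = a["value"]
    return merged

source = {"locality": "Penguin, Tas.", "elevation": "-999"}
annotate("occ-7", "elevation", "120", "expert at example.org")
```

The point of the overlay is that the source record stays pristine, so the institution's review step is decoupled from the visibility of the correction.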
> > -----Original Message-----
> > On Behalf Of Bob Mesibov
> > Sent: Sunday, August 24, 2014 11:28 AM
> > To: Donat Agosti
> > Cc: TAXACOM; quentin groom
> > Subject: Re: [Taxacom] Chameleons, GBIF, and the Red List
> > Donat Agosti wrote:
> > "I feel, the discussion is too much centered on data that has not the
> > information content needed, like studying a Landsat image at 30 meter
> > resolution and discussing what tree species is shown"
> > Excellent metaphor! For most scientific uses, you need much more data
> > than is provided by any available database. Can you get everything you
> > need online? No. Do existing aggregators like GBIF offer a helpful
> > starting point? For some people and some uses, yes.
> > But now the important question: when you have all the information you
> > need, and have cleaned and enriched it, do you publish it online in a
> > usable form? I don't know what Quentin Groom's project was about, nor
> > do I know if he published his final data.
> > In my own case, every one of my 12,123 locality records for Australian
> > millipedes is freely available in CSV format (and in abbreviated form)
> > from the 'Millipedes of Australia' website. This store is larger, more
> > up to date, and contains fewer errors than any aggregator store, or
> > even the combined data providers' stores (because certain providers
> > have been slow to add my edits to particular records, or to upload
> > them to their own or aggregator stores).
> > But if people like me and Quentin publish data freely to the Web and
> > aggregators don't use this improved/extended data, aggregation looks less
> > and less useful.
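The gap Bob describes, aggregator copies lagging behind a curator's own published CSV, is at least mechanically detectable once both sides are keyed by a shared record identifier. A toy sketch with invented records:

```python
# Sketch: finding aggregator records that lag behind a curator's own
# published dataset. Record ids and fields are invented for illustration.
local = {        # the curator's authoritative, cleaned copy
    "rec-1": {"locality": "Penguin"},
    "rec-2": {"locality": "Ulverstone"},
}
aggregator = {   # the copy served by an aggregator
    "rec-1": {"locality": "Penguin"},
    "rec-2": {"locality": "Ulverston"},   # stale spelling
}

# Records where the aggregator's copy differs from (or lacks) the
# curator's version and therefore needs a refresh.
stale = [rid for rid, rec in local.items()
         if aggregator.get(rid) != rec]
```

Of course, the comparison is the easy part; the hard part, as the whole thread shows, is getting anyone to act on the resulting list.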
> > --
> > Dr Robert Mesibov
> > Honorary Research Associate
> > Queen Victoria Museum and Art Gallery, and School of Land and Food,
> > University of Tasmania Home contact:
> > PO Box 101, Penguin, Tasmania, Australia 7316
> > (03) 64371195; 61 3 64371195
> > _______________________________________________
> > Taxacom Mailing List
> > http://mailman.nhm.ku.edu/cgi-bin/mailman/listinfo/taxacom
> > The Taxacom Archive back to 1992 may be searched at:
> > http://taxacom.markmail.org
> > Celebrating 27 years of Taxacom in 2014.