[Taxacom] GBIF progress

Robert Guralnick Robert.Guralnick at colorado.edu
Tue Jan 6 13:46:25 CST 2009


Dear Dora, Taxacomers ---

Dora makes some great points here.  Regarding BioGeomancer, I think it
is a huge step forward (although I am probably somewhat biased, having
been involved, along with many others, in developing the
application).  However, it still puts a lot of burden on individual
providers.  It seems unreasonable to expect data providers to deal
with data quality issues in a timely manner completely on their own.
Manually cleaning data and improving data quality can be a slow
process, especially when each provider must learn the current
standards and best practices employed in the field.  In addition, many
of the data vetting operations are repetitive, so it is inefficient
for each data provider to perform independently the tasks necessary to
improve data quality.

A partial solution to this challenge of large-scale georeferencing,
and potentially to other data quality challenges, is not only to
automate one step of the process (e.g. the conversion of textual or
other locality formats to geographic coordinates) but also to create
pipelines in which those tools operate continuously on the growing set
of biodiversity occurrence records as they become available.  Such a
pipeline can automatically feed digital occurrence data from providers
to the BioGeomancer service for georeferencing.  The results are then
stored for providers to review and fold back into their original
records.  In coordination with GBIF and BioGeomancer, this pipeline is
already being developed (http://biodiversity.colorado.edu/bgb/).
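
To make the idea concrete, here is a minimal sketch of the flow
described above: pick up records that lack coordinates, hand the
locality text to a georeferencing service, and store the result as a
suggestion for the provider to review.  This is an illustration only,
not the actual pipeline code.  All names (OccurrenceRecord, Suggestion,
run_pipeline, pending_for) are hypothetical, and stub_georeferencer
stands in for a call to the BioGeomancer service, whose real interface
is not shown here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Coordinates = Tuple[float, float]  # (latitude, longitude), decimal degrees
Georeferencer = Callable[[str], Optional[Coordinates]]


@dataclass
class OccurrenceRecord:
    """Hypothetical minimal occurrence record from a data provider."""
    provider: str
    record_id: str
    locality: str
    coordinates: Optional[Coordinates] = None


@dataclass
class Suggestion:
    """A stored georeference awaiting the provider's review."""
    provider: str
    record_id: str
    coordinates: Coordinates
    status: str = "pending_review"


Store = Dict[Tuple[str, str], Suggestion]


def run_pipeline(records: Iterable[OccurrenceRecord],
                 georeference: Georeferencer,
                 store: Store) -> int:
    """Georeference unprocessed records; return how many suggestions were added."""
    added = 0
    for rec in records:
        key = (rec.provider, rec.record_id)
        if rec.coordinates is not None or key in store:
            continue  # already has coordinates, or was handled in an earlier run
        result = georeference(rec.locality)
        if result is None:
            continue  # could not be resolved automatically
        store[key] = Suggestion(rec.provider, rec.record_id, result)
        added += 1
    return added


def pending_for(store: Store, provider: str) -> List[Suggestion]:
    """Suggestions a given provider still has to review."""
    return [s for s in store.values()
            if s.provider == provider and s.status == "pending_review"]


def stub_georeferencer(locality: str) -> Optional[Coordinates]:
    """Stand-in for the real service: a lookup table with one made-up entry."""
    known = {"Boulder, Colorado": (40.015, -105.2705)}
    return known.get(locality)
```

Because the store is keyed on provider and record id, the same function
can be run repeatedly as new records arrive without redoing earlier
work, which is the point of keeping the tools "constantly operating".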

In its finished state, the pipeline will provide automatic
notification of progress through tools such as Really Simple
Syndication (RSS) feeds for human users and web services (e.g. REST or
SOAP) for programmatic access.  With such a system, both the
georeferencing rate and the number of adequately processed records of
the world's biodiversity data should increase dramatically, allowing
much faster use of more data in biodiversity research.
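
As a rough illustration of the notification side, the sketch below
builds a small RSS 2.0 document announcing how many records are ready
for each provider to review.  It uses only Python's standard library.
The function name build_progress_feed, the feed title and description,
and the wording of the items are my own assumptions for the example,
not a description of the finished system.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple


def build_progress_feed(progress: Iterable[Tuple[str, int]], link: str) -> str:
    """Return an RSS 2.0 document with one item per (provider, record count)."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    # title, link and description are the required channel elements in RSS 2.0
    ET.SubElement(channel, "title").text = "Georeferencing progress"
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = (
        "Records georeferenced and awaiting provider review"
    )
    for provider, count in progress:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = (
            f"{provider}: {count} records ready for review"
        )
        ET.SubElement(item, "description").text = (
            f"{count} georeferenced records from {provider} are awaiting review."
        )
    return ET.tostring(rss, encoding="unicode")
```

A provider could subscribe to such a feed in any RSS reader, while the
same counts could be exposed to programs through a web service.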

Thoughts?

Best regards, Rob Guralnick
Dr. Rob Guralnick
Associate Professor and Curator
Dept. of Ecol. and Evol. Biol.
CU Museum of Nat. Hist.
University of Colorado Boulder
Boulder CO 80309-0265
http://robgur.googlepages.com



