[Taxacom] Data quality in aggregated datasets

Dean Pentcheff pentcheff at gmail.com
Sun Apr 21 17:24:38 CDT 2013


There is much more than silence, but making a working system takes both an
initial effort and changes in the way provider systems work. It will take
time to take effect.

Dean Pentcheff
pentcheff at gmail.com
dpentche at nhm.org

On Fri, Apr 19, 2013 at 3:04 PM, Robert Mesibov <mesibov at southcom.com.au> wrote:

> There have been occasional grumblings here on Taxacom about data quality
> in the aggregator world, e.g. in GBIF, but what would happen if you
> methodically audited a sample of aggregated species occurrence records?
> What sorts of errors would you find? Would they be rare? Frequent?
> I've done an audit of this kind for Australian millipede records in GBIF
> and the Atlas of Living Australia (ALA) and published the results in
> ZooKeys: http://www.pensoft.net/journals/zookeys/article/5111/a-specialist
> The audit results can't be generalised to all taxa and all parts of the
> world, but they're pretty disappointing. GBIF and ALA, however, disclaim
> all responsibility for data problems. If there's an error, it's the fault
> of the data provider. So how do errors in online databases get discovered
> and fixed?
> In this particular case, an interested third party (me) finds problems and
> alerts the data provider directly. The data provider fixes the errors and
> in the fullness of time sends corrected records to the aggregator.
> (Although I found evidence that erroneous records can persist through an
> update.)
> What about aggregated datasets in general? What mechanisms are there for
> detecting and fixing errors besides (interested third party) > (data
> provider) > aggregator?
> [Long silence.]
> --
> Dr Robert Mesibov
> Honorary Research Associate
> Queen Victoria Museum and Art Gallery, and
> School of Agricultural Science, University of Tasmania
> Home contact: PO Box 101, Penguin, Tasmania, Australia 7316
> Ph: (03) 64371195; 61 3 64371195
> _______________________________________________
> Taxacom Mailing List
> Taxacom at mailman.nhm.ku.edu
> http://mailman.nhm.ku.edu/mailman/listinfo/taxacom
> The Taxacom Archive back to 1992 may be searched with either of these
> methods:
> (1) by visiting http://taxacom.markmail.org
> (2) a Google search specified as: site:mailman.nhm.ku.edu/pipermail/taxacom your search terms here
> Celebrating 26 years of Taxacom in 2013.
