[Taxacom] Data quality in aggregated datasets

Dean Pentcheff pentcheff at gmail.com
Sun Apr 21 17:24:38 CDT 2013


http://wiki.filteredpush.org

There is much more than silence, but making a working system takes both an
initial effort and changes in the way provider systems work. It will take
time to take effect.

-Dean
-- 
Dean Pentcheff
pentcheff at gmail.com
dpentche at nhm.org


On Fri, Apr 19, 2013 at 3:04 PM, Robert Mesibov <mesibov at southcom.com.au> wrote:

> There have been occasional grumblings here on Taxacom about data quality
> in the aggregator world, e.g. in GBIF, but what would happen if you
> methodically audited a sample of aggregated species occurrence records?
> What sorts of errors would you find? Would they be rare? Frequent?
>
> I've done an audit of this kind for Australian millipede records in GBIF
> and the Atlas of Living Australia (ALA) and published the results in
> ZooKeys: http://www.pensoft.net/journals/zookeys/article/5111/a-specialist
>
> The audit results can't be generalised to all taxa and all parts of the
> world, but they're pretty disappointing. GBIF and ALA, however, disclaim
> all responsibility for data problems. If there's an error, it's the fault
> of the data provider. So how do errors in online databases get discovered
> and fixed?
>
> In this particular case, an interested third party (me) finds problems and
> alerts the data provider directly. The data provider fixes the errors and
> in the fullness of time sends corrected records to the aggregator.
> (Although I found evidence that erroneous records can persist through an
> update.)
>
> What about aggregated datasets in general? What mechanisms are there for
> detecting and fixing errors besides (interested third party) > (data
> provider) > aggregator?
>
> [Long silence.]
> --
> Dr Robert Mesibov
> Honorary Research Associate
> Queen Victoria Museum and Art Gallery, and
> School of Agricultural Science, University of Tasmania
> Home contact: PO Box 101, Penguin, Tasmania, Australia 7316
> Ph: (03) 64371195; 61 3 64371195
>
> _______________________________________________
> Taxacom Mailing List
> Taxacom at mailman.nhm.ku.edu
> http://mailman.nhm.ku.edu/mailman/listinfo/taxacom
>
> The Taxacom Archive back to 1992 may be searched with either of these
> methods:
>
> (1) by visiting http://taxacom.markmail.org
>
> (2) a Google search specified as:
> site:mailman.nhm.ku.edu/pipermail/taxacom  your search terms here
>
> Celebrating 26 years of Taxacom in 2013.
>


