[Taxacom] Data quality of aggregated datasets

Quentin Groom quentin.groom at br.fgov.be
Mon May 6 02:10:14 CDT 2013


Unfortunately, a common problem is that taxonomists fail to record the
geodetic system or datum they are using, which leaves the aggregators to
guess, sometimes wrongly.
Worse still, there have been many occasions where GPS units were not set up
correctly and gave the wrong output.
Before the recent advent of GPS I would not guarantee the
accuracy of any grid reference made in open country.
This also underlines the importance of collectors/recorders creating an 
accurate site name with which the grid reference can be validated.
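
As a rough sketch of that kind of validation, assuming a small gazetteer
that maps site names to reference coordinates: the record fields, the
gazetteer entry "Site A" and its coordinates are all hypothetical, and the
5 km tolerance is simply borrowed from the figure discussed further down
the thread. The check flags a missing datum separately, because a distance
test against a site name will not necessarily catch a datum mix-up.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; adequate for a coarse check


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def check_record(record, gazetteer, tolerance_km=5.0):
    """Return a list of problems found in one occurrence record.

    `record` is a dict with hypothetical keys: site_name, lat, lon, datum.
    `gazetteer` maps a site name to a (lat, lon) reference point.
    """
    problems = []
    # A missing datum leaves anyone reusing the record to guess.
    if not record.get("datum"):
        problems.append("no datum recorded")
    site = gazetteer.get(record.get("site_name"))
    if site is None:
        problems.append("site name not in gazetteer")
    else:
        # Compare the stated coordinates with the named site's reference point.
        d = haversine_km(record["lat"], record["lon"], site[0], site[1])
        if d > tolerance_km:
            problems.append(f"{d:.1f} km from named site")
    return problems


# Hypothetical gazetteer and records, for illustration only.
gazetteer = {"Site A": (50.0, 4.0)}
good = {"site_name": "Site A", "lat": 50.0, "lon": 4.0, "datum": "WGS84"}
bad = {"site_name": "Site A", "lat": 50.1, "lon": 4.0, "datum": None}

print(check_record(good, gazetteer))
print(check_record(bad, gazetteer))
```

A record that fails either test would be a candidate for a query back to
the collector or data provider, not for silent correction.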
Quentin


David Campbell wrote:
> If several kilometers error in a range doesn't seem like much (and such
> distances are quite significant for snails as well as for millipedes), how
> about the several hundred million year errors I noticed in a published
> paper that took the "what's the oldest date in the Paleobiology Database
> listed for this higher taxon" approach to calibrating their molecular
> clock?  These problems resulted from at least four causes:
> (1) A homonym
> (2) Data about 90 years old, with an unduly broad interpretation of an
> extant higher taxon
> (3) Not knowing that early fossils related to one of the extant higher taxa
> of interest are assigned to a different, paraphyletic higher taxon
> (4) Generally poor data for the class in question in that database; no one
> has taken on that group.
>
> Without evidence as to the quality of the data, there's no reason to trust
> the biodiversity databases nor results based on them.  Ironically, by
> failing to support the work of expert taxonomists to check the data, the
> system has produced a situation where only expert taxonomists can make much
> use of the databases, because they have the knowledge to judge what's
> reasonable and what's not.
>
>
>
> On Fri, May 3, 2013 at 8:28 AM, Poly, William <WPoly at calacademy.org> wrote:
>
>   
>> And as error-ridden data are disseminated, used in analyses, and
>> published, it will be more difficult to
>> correct and purge the errors and conclusions based on them.
>>
>>
>>
>> ________________________________________
>> From: taxacom-bounces at mailman.nhm.ku.edu [
>> taxacom-bounces at mailman.nhm.ku.edu] On Behalf Of mesibov at southcom.com.au [
>> mesibov at southcom.com.au]
>> Sent: Friday, May 03, 2013 12:10 AM
>> To: TAXACOM
>> Subject: Re: [Taxacom] Data quality of aggregated datasets
>>
>> It's a nice visualisation, but if it leads people to think 'Aw, heck, most
>> of the records are more or less in the right place, what's the diff?',
>> then they've missed the point (and would be surprised at the sharpness of
>> millipede range boundaries). But I don't think 15%-off-by-at-least-5-km is
>> good enough.
>>
>> The point of my paper, which Rod has noted on his iPhylo blog, is that
>> aggregator-published errors need fixing, and there isn't a working
>> mechanism to do that. All the displacements except those from provider G
>> (see my paper) are fixed now at source (data provider) because I contacted
>> the sources with corrections and queries, and the sources collaborated
>> with me. That's a fix from an interested outsider, and neither GBIF nor
>> ALA were involved. The correct data are in the sources' databases and on
>> my Millipedes of Australia website. When they get to GBIF and ALA is
>> anybody's guess.
>>
>> On Taxacom, at least, GBIF and ALA are holding firm to the views that (a)
>> they aren't responsible for errors they perpetuate by publishing on the
>> Web, and (b) error detection and fixing needs to be done by Somebody Else
>> for their benefit as data publishers.
>>
>> The idea that an aggregator can check and upgrade/correct the data it
>> publishes by collaborating directly with the source is evidently
>> incomprehensible to aggregator management. No queries go back to the
>> sources, no data-cleaning protocols are insisted upon by aggregators
>> before the sources upload data, and the only error-detection and -fixing
>> mechanism in sight is vague hand-waving by aggregators about the whole
>> biodiversity community sharing the responsibility for getting things
>> right. Also not good enough.
>>
>> (Posted from the New South Wales bush, where I've been collecting
>> millipedes, of course.)
>>
>>
>> _______________________________________________
>> Taxacom Mailing List
>> Taxacom at mailman.nhm.ku.edu
>> http://mailman.nhm.ku.edu/mailman/listinfo/taxacom
>>
>> The Taxacom Archive back to 1992 may be searched with either of these
>> methods:
>>
>> (1) by visiting http://taxacom.markmail.org
>>
>> (2) a Google search specified as:  site:
>> mailman.nhm.ku.edu/pipermail/taxacom  your search terms here
>>
>> Celebrating 26 years of Taxacom in 2013.
>>
>>     
>
>
>
>   

-- 
Dr. Quentin Groom
(Botany and Information Technology)

National Botanic Garden of Belgium
Domein van Bouchout
B-1860 Meise
Belgium

ORCID: 0000-0002-0596-5376

Landline: +32 (0) 226 009 20 ext. 364
FAX:      +32 (0) 226 009 45

E-mail:     quentin.groom at br.fgov.be
Skype name: qgroom
Website:    www.botanicgarden.be



