[Taxacom] Data quality of aggregated datasets

Mary Barkworth Mary.Barkworth at usu.edu
Tue May 7 07:15:26 CDT 2013

From a teaching point of view, it is much easier to explain adding a radius of uncertainty as the area within which you collected than discussing significant figures, particularly significant figures of arcane units like minutes and seconds. Even I have come to appreciate that decimal degrees are easy to think about and I have found they make a lot more sense to most students. It would also be easy to explain an uncertainty polygon but our herbarium database does not yet support recording that information.
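For readers who want to see the arithmetic, here is a minimal Python sketch of the two ideas in the paragraph above: converting degrees/minutes/seconds to decimal degrees, and treating a locality as a point plus a radius of uncertainty. The function names, the record layout and the example coordinates are assumptions made up for illustration; they do not describe any herbarium database. Distances use the haversine formula on a spherical Earth, which is an approximation.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in metres (spherical approximation)


def dms_to_decimal(degrees, minutes=0.0, seconds=0.0, hemisphere="N"):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    value = abs(degrees) + minutes / 60.0 + seconds / 3600.0
    # South and west are negative by convention.
    return -value if hemisphere in ("S", "W") else value


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in metres between two points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # min() guards against tiny floating-point overshoot above 1.0.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def within_uncertainty(record, lat, lon):
    """True if (lat, lon) lies inside the record's uncertainty circle."""
    d = distance_m(record["lat"], record["lon"], lat, lon)
    return d <= record["uncertainty_m"]


# Hypothetical record: illustrative coordinates and a 5 km radius.
record = {
    "lat": dms_to_decimal(41, 44, 7, "N"),
    "lon": dms_to_decimal(111, 50, 0, "W"),
    "uncertainty_m": 5000.0,
}
```

A data user could call `within_uncertainty(record, lat, lon)` to ask whether a site of interest is compatible with the stated locality, which is a different question from asking how many decimal places the coordinates carry.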

The basic reason that the data will always be "raw" is that we have no reliable means of communicating with the dead. When a label says Logan, Utah, I am told to use the city's current boundaries. Technically, I could look up its boundaries at the time the specimen was collected, but perhaps all the collector was doing was naming the nearest settlement that he or she knew of, or the postal district, or the home base for that day or week. Moreover, no one is willing to pay the herbarium for the additional work required to check into alternative estimates. Actually, they are not willing to pay for anything; we (like all collections) provide the data for free. When it comes to collection data, we can adhere to standard protocols, but all that provides is estimates calculated in a standard way. Whether that is good enough depends on the question being addressed and the organism(s) involved. Data users should always evaluate the data they wish to use - and be grateful for the quantity being made available (gratitude can be expressed by informing the head of the collection of any errors that need fixing, a mention in the acknowledgements, and an email to the head of the collection, who may not otherwise know that its records have been used).

Grids would work IF there were a standard grid that covered the whole world; lat/long is the closest thing we have to such a system. UTM, I was told when I looked into the matter, is not good for areas near the poles.
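To make the polar limitation concrete: the UTM system is defined only between 80 degrees S and 84 degrees N, and the polar regions are handled by a separate system (Universal Polar Stereographic). The Python sketch below computes the standard 6-degree zone number and returns None outside that latitude band. The function name is made up for illustration, and the special zone boundaries around Norway and Svalbard are deliberately ignored.

```python
def utm_zone(lat, lon):
    """Return the standard UTM zone number (1-60), or None outside UTM coverage.

    Simplification: ignores the irregular zones near Norway and Svalbard.
    """
    if not -180.0 <= lon <= 180.0:
        raise ValueError("longitude must be between -180 and 180")
    if lat < -80.0 or lat > 84.0:
        return None  # polar regions are outside the UTM latitude band
    zone = int((lon + 180.0) // 6) + 1
    return min(zone, 60)  # lon == 180 would otherwise give 61
```

Latitude and longitude have no such gap, which is one reason they serve as the common reporting format.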



-----Original Message-----
From: taxacom-bounces at mailman.nhm.ku.edu [mailto:taxacom-bounces at mailman.nhm.ku.edu] On Behalf Of Robert Mesibov
Sent: Tuesday, May 7, 2013 5:43 AM
To: Quentin Groom
Subject: Re: [Taxacom] Data quality of aggregated datasets

Quentin Groom wrote:

"Why shouldn't taxonomists collect gridded data in the first place, just as ecologist have been for years?"

Many taxonomists used to do so (including me) and probably many still do, if they record locations as UTM grid squares from maps. You read or estimate from the map the easting to the left and the northing to the bottom of the site location. The grid reference thus created is a square within which the site was located. It's then a computational piece of cake to aggregate sets of these squares into larger and larger units, like 1 km squares, 2 km squares, etc., for grid-based analysis. But when I bought a GPS, I no longer needed a map, so I gave up UTM entirely. Lat/lon data are the universal currency for reporting spatial data, and I could get those directly without a UTM-to-geographic conversion.
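The aggregation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: all records are taken to lie in the same UTM zone, coordinates are full eastings and northings in metres, and the function names and sample values are invented for the example.

```python
import math
from collections import Counter


def grid_square(easting, northing, cell_size_m):
    """Return the south-west corner of the grid square containing a point."""
    e = math.floor(easting / cell_size_m) * cell_size_m
    n = math.floor(northing / cell_size_m) * cell_size_m
    return (e, n)


def aggregate(records, cell_size_m):
    """Count records per grid square of the given size (same UTM zone assumed)."""
    return Counter(grid_square(e, n, cell_size_m) for e, n in records)


# Hypothetical eastings/northings in metres, all in one UTM zone.
records = [(512340, 5432180), (512900, 5432990), (513100, 5432100)]

one_km = aggregate(records, 1000)   # counts per 1 km square
two_km = aggregate(records, 2000)   # counts per 2 km square
```

Because each square is identified by its lower-left corner, moving to a coarser grid is just a matter of rounding the same numbers down to a larger cell size.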

I'm afraid I don't see your point about gridded data. Point data are fine for recording and mapping localities. They're also OK for biogeographic analysis; see my 2011 parapatry paper in ZooKeys: http://www.pensoft.net/journals/zookeys/article/1893/a-remarkable-case-of-mosaic-parapatry-in-millipedes  In non-taxonomic GIS work in the past I've had no trouble stacking point and grid data and doing analyses based on the two sorts of data. There may be ecological analyses that require gridded data only, but why do you think specimen localities should first be in that form? Aren't ecological analyses generally done at a grid scale much coarser than would be used for recording localities+'errors' as grid squares?
Dr Robert Mesibov
Honorary Research Associate
Queen Victoria Museum and Art Gallery, and School of Agricultural Science, University of Tasmania
Home contact: PO Box 101, Penguin, Tasmania, Australia 7316
Ph: (03) 64371195; 61 3 64371195

Taxacom Mailing List
Taxacom at mailman.nhm.ku.edu

The Taxacom Archive back to 1992 may be searched with either of these methods:

(1) by visiting http://taxacom.markmail.org

(2) a Google search specified as:  site:mailman.nhm.ku.edu/pipermail/taxacom  your search terms here

Celebrating 26 years of Taxacom in 2013.
