Georeferencing of collecting localities

Doug Yanega dyanega at POP.UCR.EDU
Thu Apr 10 10:59:09 CDT 2003

Brad Hubley wrote:

>One of the areas that requires attention is the georeferencing of
>collecting localities.  For some disciplines, georeferencing is simply
>done with the aid of topographical maps, others are able to use Natural
>Resources Canada's Canadian Geographical Names website.  Can anyone
>recommend some other more efficient methods to assist us in our
>endeavours?

It is unfortunately true that georeferencing of legacy data is an
unavoidably labor-intensive process. Realistically, even if one can
automate the process of looking up a place name, a human being
absolutely MUST go through and check the output, if only to designate
an appropriate error value for each set of coordinates. For example,
if I have a specimen from Anza, California, this is a tiny town that
occupies - being generous - only one minute of latitude and
longitude; the error associated with the look-up is therefore
extremely small, and based largely on the assumption that the
specimen could be from any site within 3 km of the town itself
(anything much farther afield, and the collector would probably have
made note of it). If I have another labeled as being from Los
Angeles, then the possible range of values is enormous, and I'd be
lucky if any value generated by an automated look-up is within a full
*degree* of latitude or longitude of where the specimen actually came
from.
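
To put a number on "one minute of latitude and longitude", here is a
minimal Python sketch. It assumes a spherical Earth with a mean radius
of 6371 km and an approximate latitude of 33.55 N for Anza; the
function name arc_minute_km is made up for this illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; spherical approximation


def arc_minute_km(lat_deg):
    """Return (north-south km, east-west km) covered by one minute of arc
    at the given latitude."""
    km_per_degree = math.pi * EARTH_RADIUS_KM / 180.0
    north_south = km_per_degree / 60.0
    # Meridians converge toward the poles, so a minute of longitude
    # shrinks by the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(lat_deg))
    return north_south, east_west


# 33.55 N is an approximate, assumed latitude for Anza, California.
ns_km, ew_km = arc_minute_km(33.55)
print(f"one minute at 33.55 N: {ns_km:.2f} km N-S, {ew_km:.2f} km E-W")
```

Under these assumptions a minute comes to a little under 2 km
north-south and somewhat less east-west, which is the same order of
magnitude as the 3 km allowance mentioned above.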

I routinely georeference specimens, and have over 1400 localities
entered in our specimen database; I make every effort to use a map
AND get latitude and longitude to the nearest minute, but there are
many records which simply defy accurate georeferencing, if only
because they refer to a large and/or irregular geographical feature
(e.g., a lake, a canyon, a river). For the most part, then, I believe
we simply have to live with very large error values, and make sure
that our databases include these, to warn users of the data of the
inherent limitations. In an ideal system, each record would link to a
graphic with a map showing a blob of the appropriate dimensions -
trying to reduce every locality record to a single point is a
hopeless enterprise.
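
One way to keep the error value attached to the coordinates is to
store both in the same record. The sketch below is only an
illustration, not a description of any existing database: the
Locality class, its field names, and the coordinates used for Anza
are all assumptions, and the "blob" is simplified to a circle (a
river or canyon would need a polygon instead).

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0  # mean radius; spherical approximation


@dataclass
class Locality:
    """A collecting locality stored as a point plus an error radius."""
    name: str
    lat: float             # decimal degrees, north positive
    lon: float             # decimal degrees, east positive
    uncertainty_km: float  # radius of the circle around the point


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def could_include(loc, lat, lon):
    """True if (lat, lon) lies inside the locality's error circle."""
    return distance_km(loc.lat, loc.lon, lat, lon) <= loc.uncertainty_km


# Illustrative values only: approximate coordinates, 3 km radius.
anza = Locality("Anza, California", 33.555, -116.674, 3.0)
print(could_include(anza, 33.57, -116.674))  # point about 1.7 km north
print(could_include(anza, 33.60, -116.674))  # point about 5 km north
```

A user querying such records can then ask "could this specimen have
come from here?" rather than treating the stored point as exact.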

Lest people think that modern technology has this problem licked,
even the use of a GPS unit can be extremely misleading, but in the
opposite manner; I've been on numerous collecting expeditions where
the vehicle will pull off to the side of the road, one person gets
out and gets a GPS reading accurate to the nearest meter, and people
then scatter in all directions and collect at various points anywhere
up to a km (or sometimes more) away. Using that meter-precision GPS
reading for the locality data is essentially inappropriate, unless
the database also records an error value of, say, +/- 2 km. This is
why I don't like decimal readouts of GPS coordinates; if someone
writes down "33.7235616 degrees north, 117.9237152 degrees west", are
we really supposed to believe that the specimen actually was
collected in that given square centimeter? That's an absurd degree of
precision, if you think about it - but the absurdity is only obvious
if you use degrees, minutes, and seconds. The problem is that decimal
coordinates are not *intuitive*, in the sense that the meaning (in
distance) of each successive decimal place is not evident to the
average person, as in the preceding example. Conversely, someone who
arbitrarily rounds the decimal values to, say, 33.72 N and 117.92 W
might be inadvertently creating too *large* an error (0.01 degrees is
roughly 1 km) simply because they didn't sit down and calculate the
conversion for their rounding error. On top of that,
people are prone to making bonehead errors when using GPS units set
to decimal output; I've seen many labels with nonsense like "34
degrees, 85 minutes N" (when the field notes read "34.85 N").
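
Both points can be checked with a few lines of Python. This is a
rough sketch assuming a spherical Earth (6371 km mean radius) and
distances measured north-south; east-west distances shrink further
with the cosine of the latitude. The function names decimal_place_km
and to_dms are invented for this example.

```python
import math

KM_PER_DEGREE = math.pi * 6371.0 / 180.0  # about 111.2 km, spherical Earth


def decimal_place_km(places):
    """Ground distance (km, along a meridian) represented by one unit
    in the given decimal place of a coordinate in degrees."""
    return KM_PER_DEGREE * 10.0 ** -places


def to_dms(decimal_degrees):
    """Convert decimal degrees to (degrees, minutes, seconds), rounded
    to the nearest second. The hemisphere is assumed to be recorded
    separately, so the absolute value is used here."""
    total_seconds = round(abs(decimal_degrees) * 3600)
    degrees, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return degrees, minutes, seconds


# What each decimal place is worth on the ground, in meters.
for places in (2, 4, 7):
    print(places, "decimal places:", decimal_place_km(places) * 1000, "m")

print(to_dms(34.85))  # (34, 51, 0): 51 minutes, not 85
```

Two decimal places work out to about 1.1 km and seven to about a
centimeter, and 34.85 N is 34 degrees 51 minutes, not "34 degrees, 85
minutes".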

The one modern tool I'm still looking for is an online resource where
you can pull up detailed maps of anywhere in the world, move the
mouse around on the map, and get the exact lat/long values AND
elevation of the point where the mouse is. The closest we have
presently are a few sites where you can get lat/long coordinates
alone, with a significant error (you can move the mouse a fair
distance before the readings change), and only for a limited
geographic area. If we can ever get a truly topographic, global map
resource, that WILL greatly speed up the process of specimen
georeferencing. Even if so, ultimately, georeferencing is NEVER going
to be an *easy* task - unless we simply don't care about accuracy.

Doug Yanega        Dept. of Entomology         Entomology Research Museum
Univ. of California - Riverside, Riverside, CA 92521
phone: (909) 787-4315 (standard disclaimer: opinions are mine, not UCR's)
   "There are some enterprises in which a careful disorderliness
         is the true method" - Herman Melville, Moby Dick, Chap. 82

