Geocoding Localities

Jorge Soberon Mainero jsoberon at MIRANDA.ECOLOGIA.UNAM.MX
Mon May 20 16:23:44 CDT 1996

The subject of geocoding label information is really huge. At CONABIO we
have been helping Mexican taxonomists geocode tens of thousands of
specimens, and there are all kinds of unexpected problems.
There are problems in deciding whether locality names are really
different (Chauntempan, Chiauntempan, Santa Ana Chauntempan; Tuxpan Ver.,
Tuxpan Mich., Tuxpan Pue.); problems with information such as "5 km
northeast of": does it mean in a straight line, along a road, or
along a path? Problems
with inconsistent within-record information, like the label text placing
the record within a certain state while the geocoded point turns out
to be outside it, or on the border. And what is the border? Remember
this is a fractal concept: the precise location of the border between
two states depends on the scale of the maps you use. Problems of
inconsistent among-field information, for example, the altitude
reported on the label being inconsistent with the corresponding
altitude from the digital elevation model, or with the ecoregion your
GIS believes the locality belongs to, etc.
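
To make the "5 km northeast of" ambiguity concrete, here is a minimal
Python sketch of the straight-line reading only. It assumes a spherical
Earth and a compass bearing of 45 degrees for "northeast". The function
names and the starting coordinate are hypothetical, chosen for
illustration, and this is not the procedure used at CONABIO.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation (assumption)


def offset_point(lat_deg, lon_deg, distance_km, bearing_deg):
    """Point reached by travelling distance_km along a great circle
    from (lat_deg, lon_deg) at the given initial compass bearing."""
    lat1 = math.radians(lat_deg)
    lon1 = math.radians(lon_deg)
    bearing = math.radians(bearing_deg)
    d = distance_km / EARTH_RADIUS_KM  # angular distance in radians

    lat2 = math.asin(
        math.sin(lat1) * math.cos(d)
        + math.cos(lat1) * math.sin(d) * math.cos(bearing)
    )
    lon2 = lon1 + math.atan2(
        math.sin(bearing) * math.sin(d) * math.cos(lat1),
        math.cos(d) - math.sin(lat1) * math.sin(lat2),
    )
    # Normalise longitude to the range [-180, 180) degrees.
    lon2 = (lon2 + 3 * math.pi) % (2 * math.pi) - math.pi
    return math.degrees(lat2), math.degrees(lon2)


def haversine_km(lat_a, lon_a, lat_b, lon_b):
    """Great-circle distance in km between two points (haversine formula)."""
    phi_a, phi_b = math.radians(lat_a), math.radians(lat_b)
    dphi = phi_b - phi_a
    dlmb = math.radians(lon_b - lon_a)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi_a) * math.cos(phi_b) * math.sin(dlmb / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


if __name__ == "__main__":
    # Arbitrary illustrative starting coordinate, not a real gazetteer entry.
    start = (19.3, -98.2)
    lat, lon = offset_point(start[0], start[1], 5.0, 45.0)
    print(lat, lon, haversine_km(start[0], start[1], lat, lon))
```

If the collector meant 5 km along a road or path, the true locality can
be no farther than 5 km from the named place in a straight line, and
usually closer and in a somewhat different direction. So the point
computed this way is better stored with an uncertainty radius than
treated as exact.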

In CONABIO we have a team working on the above problems, trying to
conceptualize them as well as to provide algorithms and practical
solutions. I believe this is a very interesting and underdeveloped subject.
Lots of people are now georeferencing label information, which in our
experience is worth all the effort, because the interface with the GIS
allows modelling, extrapolation and all sorts of interesting analyses.
However, the exercise is fraught with methodological (and some
conceptual) problems and I would love to hear about some of the
experiences of you people out there.

