deepreef at BISHOPMUSEUM.ORG
Mon Jan 3 20:00:34 CST 2005
> We are playing with image manipulation
> functions in php to dynamically/interactively crop and navigate the
> page image so that the current record is at the top of the image just
> under the converted data record.
Yeah, that's exactly what I figured the tricky bit would be. I gather that
there's no easy way to deduce the physical Y-mapping of each name on each
page based on the name-sequence alone, because not all names occupy the same
number of lines.
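
To put the Y-mapping problem in concrete terms, here is a minimal Python sketch (not the PHP code under discussion). It assumes that a line count per record is somehow available, along with a fixed line height and top margin in pixels; the function name, the parameters, and the numbers are all hypothetical. The point is only that the offset of a record depends on the lines of every record above it, not on its position in the sequence.

```python
from itertools import accumulate


def record_y_offsets(line_counts, line_height_px, top_margin_px=0):
    """Return the Y pixel offset of the top of each record on one page.

    line_counts[i] is the number of printed lines record i occupies
    (assumed known; the name sequence alone does not provide it).
    """
    # Number of lines printed above each record: 0, n0, n0+n1, ...
    lines_before = list(accumulate([0] + list(line_counts)))[:len(line_counts)]
    return [top_margin_px + n * line_height_px for n in lines_before]


# Made-up page: four records occupying 1, 3, 2 and 1 lines.
offsets = record_y_offsets([1, 3, 2, 1], line_height_px=20, top_margin_px=100)
print(offsets)  # [100, 120, 180, 220]
```

If every record occupied one line, the offset would simply be the record index times the line height; the variable line counts are what break that shortcut.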
> It's interesting to note that the conversion service we chose that
> purported to do double-keying actually initially used OCR tools as
Yeah, that was pretty evident after a quick glance at Page 1. I'm surprised
they even pretended they didn't use it. It's pretty damn hard to
"accidentally" enter a Copyright symbol when one intends to type an "e"
character. Other characters like "A", "?", "$" aren't exactly common
mis-typed characters either...
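
A rough way to screen converted text for this kind of tell-tale character is sketched below in Python. The set of "expected" characters is purely an assumption about what a name record might contain, and the sample string is invented for illustration; neither comes from the actual data.

```python
import string
from collections import Counter

# Assumed set of characters a keyed name record would normally contain.
EXPECTED = set(string.ascii_letters + string.digits + " .,;:()[]&'-")


def unexpected_chars(text):
    """Count characters that fall outside the expected set."""
    return Counter(
        ch for ch in text if ch not in EXPECTED and ch not in "\n\t"
    )


# Invented sample: a copyright sign where an "e" belongs, and a stray "$".
sample = "Abud\u00a9fduf $axatilis"
print(unexpected_chars(sample))
```

A high count of such characters on a page would suggest OCR rather than keying, though a person would still need to look at the flagged records.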