[Taxacom] A new way to view taxonomic publications

Dr Brian Taylor dr.brian.taylor at ntlworld.com
Sat Jun 22 03:46:53 CDT 2013


Rod, Donat et al,

However good OCR and scanning get, there will still be problems with old
literature. For example, archaic German - see
http://hol.osu.edu/literature-viewer.html?id=4833&page=15

One of my human helpers struggled to read it, let alone make a translation
into English.

And Asian scripts?

Brian



On 22/06/2013 04:57, "Roderic Page" <r.page at bio.gla.ac.uk> wrote:

> Hi Donat,
> 
> Sent from my iPhone
> 
> On 22 Jun 2013, at 03:29, Donat Agosti <agosti at amnh.org> wrote:
> 
>> For my purpose I want to have an OCR accuracy rate between 99.9 and 99.99%
> 
> So this is the crux of the problem. You set a very high bar that BHL will
> struggle to meet in a lot of cases. This then sets limits on what you can
> achieve.
> 
> An alternative is to accept that things will be messier than that, and set
> your expectations appropriately. Plus we can think about ways to cope with
> messy text. It strikes me that there is a misplaced obsession with "clean"
> data that gets in the way of making progress. You want the world to be one
> way, but it's the other way.
> 
> Regards
> 
> Rod
> _______________________________________________
> Taxacom Mailing List
> Taxacom at mailman.nhm.ku.edu
> http://mailman.nhm.ku.edu/mailman/listinfo/taxacom
> 
> The Taxacom Archive back to 1992 may be searched with either of these methods:
> 
> (1) by visiting http://taxacom.markmail.org
> 
> (2) a Google search specified as:  site:mailman.nhm.ku.edu/pipermail/taxacom
> your search terms here
> 
> Celebrating 26 years of Taxacom in 2013.





