[Taxacom] A new way to view taxonomic publications

Richard Pyle deepreef at bishopmuseum.org
Fri Jun 21 23:17:25 CDT 2013


If it wasn't clear from my previous post, my main point is that we should
stop thinking about "either / or", because clean data and dirty data are not
mutually exclusive alternatives.  We already have efforts on both fronts;
and both fronts have demonstrable value.  What we need to focus on is more
integration between them.

Rich

> -----Original Message-----
> From: taxacom-bounces at mailman.nhm.ku.edu [mailto:taxacom-
> bounces at mailman.nhm.ku.edu] On Behalf Of Roderic Page
> Sent: Friday, June 21, 2013 5:58 PM
> To: Donat Agosti
> Cc: <taxacom at mailman.nhm.ku.edu>; David.King
> Subject: Re: [Taxacom] A new way to view taxonomic publications
> 
> Hi Donat,
> 
> Sent from my iPhone
> 
> On 22 Jun 2013, at 03:29, Donat Agosti <agosti at amnh.org> wrote:
> 
> > For my purpose I want to have an OCR accuracy rate between 99.9 and
> > 99.99%
> 
> So this is the crux of the problem. You set a very high bar that BHL will
> struggle to meet in a lot of cases. This then sets limits on what you can
> achieve.
> 
> An alternative is to accept that things will be messier than that, and set
> your expectations appropriately. Plus we can think about ways to cope with
> messy text. It strikes me that there is a misplaced obsession with "clean"
> data that gets in the way of making progress. You want the world to be one
> way, but it's the other way.
> 
> Regards
> 
> Rod




