[Taxacom] Towards a consensus higher classification of organisms (was: List of Orders of the world), misspellings, etc...

Paul Kirk p.kirk at cabi.org
Mon Jun 23 18:24:42 CDT 2008

We scan at fairly high resolution to TIFFs but batch process down to JPG at about 100k for faster rendering in the browser. TIFFs can be OCR'd later - Omnipage can batch process - and almost certainly TIFFs can also be batch processed to PDF. I suppose that some publishers might release low resolution (and thus not easily OCRable) JPGs from their digitizing efforts, which could be useful as human readable JPGs (in the context referred to below) very rapidly. Anyone like to take this idea up with JSTOR?
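
For anyone wanting to try the "down to about 100k" step, here is a minimal sketch of one way to choose a JPEG quality setting that lands under a size budget. It is an illustration only, not the workflow described above: the function names, the quality range of 10 to 95 and the exact 100,000-byte target are all assumptions, and the encoder is passed in as a callback so the search itself is independent of any particular imaging tool.

```python
from typing import Callable, Optional

# "about 100k" in the message; the exact byte figure is an assumption.
TARGET_BYTES = 100_000


def pick_quality(encode: Callable[[int], bytes],
                 target_bytes: int = TARGET_BYTES,
                 lo: int = 10, hi: int = 95) -> Optional[int]:
    """Return the highest quality in [lo, hi] whose output fits target_bytes.

    Assumes encoded size does not shrink as quality rises, which is
    typical of JPEG encoders but not guaranteed. Returns None if even
    the lowest quality produces a file that is too large.
    """
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if len(encode(mid)) <= target_bytes:
            best = mid      # fits: try a higher quality
            lo = mid + 1
        else:
            hi = mid - 1    # too big: try a lower quality
    return best


def fake_encode(quality: int) -> bytes:
    # Hypothetical stand-in encoder: 2,000 bytes per quality step.
    return b"\0" * (quality * 2000)


if __name__ == "__main__":
    print(pick_quality(fake_encode))
```

In a real batch job the `encode` callback would wrap whatever converter is in use. With the Pillow library, for instance, it could save the opened TIFF to an in-memory buffer as JPEG at the given quality and return the bytes; that choice of tool is an assumption, not something stated in this thread.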


From: taxacom-bounces at mailman.nhm.ku.edu on behalf of Jim Croft
Sent: Mon 23/06/2008 23:49
To: Jerry Cooper
Cc: Taxacom
Subject: Re: [Taxacom] Towards a consensus higher classification of organisms (was: List of Orders of the world), misspellings, etc...

> IndexFungorum (and the decades of printed indexes on which it is based)
> is a very good example of the index that Rod describes (and IPNI isn't).
> The fact that protologues are linked to IndexFungorum as jpegs of page
> scans, as opposed to OCR'd documents, is therefore largely irrelevant.
> From a nomenclatural standpoint the combination of the name index
> and the page scans satisfies most needs.

We have been experimenting with this as part of the Australian Plant
Name Index and to our surprise found it was possible to sort of OCR
the document as it was being PDFed so instead of just a graphic you
ended up with a facsimile that was sort of searchable on the text.  We
were looking for an escape route that would enable the protologue to
be parsed and endatabased some time in the future when we had the
time, the staff and the technology.  We were able to convince
ourselves that making the PDFs was not a sunk investment of time
because the text could indeed be extracted for when we needed it.

Now all we need is a bunch of slaves chained to scanners and a library
starting at 1753...


