[Taxacom] Ghost genus? (Stephen Thorpe)

Stephen Thorpe stephen_thorpe at yahoo.co.nz
Fri Jun 21 18:46:52 CDT 2013


You seem to be vacillating between human error and automation error. We can expect more of the latter from some projects ...



________________________________
From: "Tony.Rees at csiro.au" <Tony.Rees at csiro.au>
To: stephen_thorpe at yahoo.co.nz; jsohn at umd.edu; taxacom at mailman.nhm.ku.edu 
Sent: Saturday, 22 June 2013 11:38 AM
Subject: RE: [Taxacom] Ghost genus? (Stephen Thorpe)



Hi Stephen – a mix-up? Certainly; terrible? Probably not – a simple human error somewhere – you will find them in even the best data compilations. The formatting was most likely automatically applied to whatever was supplied in the “genus” field so nobody’s especial fault.
 
I cannot remember exactly where, but one taxonomic compilation I recall had a disclaimer along the lines of “this dataset was assembled not by God but by humans and therefore will almost certainly contain errors.” The important thing is that data are exposed to scrutiny (the more eyes the better) and that mechanisms are then supplied for feedback to relevant persons when errors are discovered (or as you will no doubt say in the wikiXXX case, you can fix them yourself…)
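 
The automation step described above (formatting applied to whatever was supplied in the "genus" field) could in principle be guarded by a check before display. What follows is only a minimal sketch under stated assumptions, not the code of any project mentioned in this thread: the token list, the function names and the output format are hypothetical, and only "NAG" is taken from the discussion.

```python
import re

# Assumed placeholder tokens; only "NAG" comes from the discussion in this thread.
PLACEHOLDER_TOKENS = {"NAG", "UNASSIGNED", "UNKNOWN", "INCERTAE SEDIS"}

# Shape check only: one capitalised word of two or more Latin letters.
GENUS_PATTERN = re.compile(r"^[A-Z][a-z]+$")


def classify_genus_field(raw):
    """Classify the raw content of a 'genus' field before any formatting."""
    value = raw.strip()
    if not value:
        return "empty"
    # Compare case-insensitively so "NAG", "Nag" and "nag" are all caught.
    if value.upper() in PLACEHOLDER_TOKENS:
        return "placeholder"
    if GENUS_PATTERN.match(value):
        return "candidate"
    return "malformed"


def format_name(genus_field, epithet, author):
    """Format a name for display, flagging anything that is not a candidate genus."""
    status = classify_genus_field(genus_field)
    if status != "candidate":
        # Do not dress a placeholder up as a real genus; flag it for a human.
        return f"[genus needs review: {status}] {epithet} {author}"
    return f"{genus_field.strip()} {epithet} {author}"


if __name__ == "__main__":
    # The epithet and author below are made up for illustration.
    print(format_name("Nag", "exampleus", "(Author, 1900)"))
    print(format_name("Bob", "exampleus", "(Author, 1900)"))
```

A check like this only catches known placeholder tokens and badly shaped strings. A well-formed phantom such as "Notliripora" would still pass as a candidate, so feedback from human readers remains necessary.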
 
Best regards - Tony
 
From:Stephen Thorpe [mailto:stephen_thorpe at yahoo.co.nz] 
Sent: Saturday, 22 June 2013 9:24 AM
To: Rees, Tony (CMAR, Hobart); jsohn at umd.edu; taxacom at mailman.nhm.ku.edu
Subject: Re: [Taxacom] Ghost genus? (Stephen Thorpe)
 
Yes, it is probably "No Assigned Genus", or equivalent, but very bad form (a terrible mix up of taxonomy and nomenclature). In these days of such generic names as Bob, or Do, one called Nag is quite plausible, and putting it in italics with the author of the species in parentheses is *really* misleading!
 
Stephen
 
From:"Tony.Rees at csiro.au" <Tony.Rees at csiro.au>
To: jsohn at umd.edu; taxacom at mailman.nhm.ku.edu 
Sent: Saturday, 22 June 2013 11:06 AM
Subject: Re: [Taxacom] Ghost genus? (Stephen Thorpe)

Dear all,

Jay's suggestion sounds plausible to me.

Of course there are many adventures to be had in the world of phantom genus and/or species names. One favourite(?) was chasing down the supposed genus name "Notliripora" in a well known compilation which shall be nameless (the name will still turn up via Google searches, having propagated from there in a limited manner since). It did not appear to be a misspelling for anything I could trace. The nearest was "Liripora" which got me to imagine the scenario by which this had ended up interpreted as a generic name on a specimen label which is presumably still out there somewhere...

Tony Rees
Manager, Divisional Data Centre,
CSIRO Marine and Atmospheric Research,
GPO Box 1538,
Hobart, Tasmania 7001, Australia
Ph: 0362 325318 (Int: +61 362 325318)
Fax: 0362 325000 (Int: +61 362 325000)
e-mail: Tony.Rees at csiro.au
Manager, OBIS Australia regional node, http://www.obis.org.au/
Biodiversity informatics research activities: http://www.cmar.csiro.au/datacentre/biodiversity.htm
Personal info: http://www.fishbase.org/collaborators/collaboratorsummary.cfm?id=1566
LinkedIn profile: http://www.linkedin.com/pub/tony-rees/18/770/36

> -----Original Message-----
> From: taxacom-bounces at mailman.nhm.ku.edu [mailto:taxacom-
> bounces at mailman.nhm.ku.edu] On Behalf Of Jae Cheon Sohn
> Sent: Saturday, 22 June 2013 3:16 AM
> To: taxacom at mailman.nhm.ku.edu
> Subject: Re: [Taxacom] Ghost genus? (Stephen Thorpe)
> 
> Hello. I had the same question before. I checked the name in the
> Zoological Record and found no hits.
> My guess is that this name came from an abbreviation 'NAG' meaning "no
> available genus". It was then inappropriately used as a genus name. Let
> me know if anyone finds that this genus name is available.
> 
> Regards
> Jay Sohn
> 
> 
> 
> ________________________________________
> From: taxacom-bounces at mailman.nhm.ku.edu [taxacom-
> bounces at mailman.nhm.ku.edu] on behalf of taxacom-
> request at mailman.nhm.ku.edu [taxacom-request at mailman.nhm.ku.edu]
> Sent: Friday, June 21, 2013 1:00 PM
> To: taxacom at mailman.nhm.ku.edu
> Subject: Taxacom Digest, Vol 87, Issue 25
> 
> Today's Topics:
> 
>    1. Ghost genus? (Stephen Thorpe)
>    2. Re: A new way to view taxonomic publications (Dave Roberts)
>    3. david r smith (stuartf)
>    4. Re: david r smith = still active (Chris Thompson)
>    5. Re: david r smith = still active (Sharkey, Michael J)
>    6. Re: A new way to view taxonomic publications (David.King)
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Thu, 20 Jun 2013 22:28:42 -0700 (PDT)
> From: Stephen Thorpe <stephen_thorpe at yahoo.co.nz>
> Subject: [Taxacom] Ghost genus?
> To: "taxacom at mailman.nhm.ku.edu com" <taxacom at mailman.nhm.ku.edu>
> Message-ID:
>        <1371792522.53461.YahooMailNeo at web161901.mail.bf1.yahoo.com>
> Content-Type: text/plain; charset=iso-8859-1
> 
> http://www1.ala.org.au/gallery2/main.php?g2_itemId=29796
> Does anyone know where the genus name Nag comes from? I cannot find it
> anywhere other than here ...
> 
> ------------------------------
> 
> Message: 2
> Date: Thu, 20 Jun 2013 17:17:50 +0100
> From: Dave Roberts <workpackage6 at gmail.com>
> Subject: Re: [Taxacom] A new way to view taxonomic publications
> To: Roderic Page <r.page at bio.gla.ac.uk>
> Cc: taxacom taxacom <taxacom at mailman.nhm.ku.edu>
> Message-ID: <DB41BE64-5D6E-4460-A355-4071DC6DB9F0 at gmail.com>
> Content-Type: text/plain; charset=windows-1252
> 
> Dear Rod,
> 
> I agree and I was not underestimating the chances of being lucky that
> someone has released some specific paper.  The NHM journals are all in
> BHL, for example.  I am constantly impressed with the rapid growth in
> digital copies [of old materials] that are freely available.  It's often
> worth repeating a search that came up empty before.
> 
> My point was that BHL was forced into a series of ad-hoc agreements
> with individual publishers that, in many cases, did not bring the
> economies of scale.  Quentin's proposal was about that industrial
> scale.  If you have to pick and choose your journals, or worse,
> articles, then the scale just isn't there and we're probably better off
> with small projects focussing on stuff that are priorities for that
> project.
> 
> I indicated support for Quentin's idea, but it's important to recognise
> the bottlenecks and the resources that will be needed to relieve them.
> 
> Cheers, Dave
> --.
> On 20 Jun 2013, at 13:40, Roderic Page <r.page at bio.gla.ac.uk> wrote:
> 
> > Dave,
> >
> > 1. BHL has a LOT of post 1923 content, much of it provided by member
> institutions. The bulk of the major US museum in-house publications are
> in BHL, including papers published this century. Most of the NHM's
> Bulletins are in BHL as well. There is a persistent myth that BHL has
> only "old" stuff (i.e., pre-1923) which is totally wrong.
> >
> > 2. There is a lot of literature being made available in digital
> archives across the planet (e.g., Gallica in France, CiNii in Japan,
> DSpace archives across the world).
> >
> > 3. Many "smaller" taxonomic journals are putting PDFs online.
> >
> > 4. Yes, there are massive gaps, but we're in a lot better shape than
> you might think. I estimate a user of http://bionames.org/ has about a 1
> in 4 chance of getting at least one digitised original description for
> most taxa (some of which will be copyrighted)
> >
> > 5. We can wring our hands about the gaps, or we can do something
> about it. It is easy to identify the major publishers of taxonomy (see
> http://iphylo.blogspot.co.uk/2013/06/bionames-and-where-taxonomy-is-published.html
> ), why not start talking to them about what benefits
> taxonomic indexing could bring to them?
> >
> > Regards
> >
> > Rod
> >
> >
> > On 20 Jun 2013, at 13:21, Dave Roberts wrote:
> >
> >> +1
> >>
> >> but isn't that what BHL set out to do, but ran headlong into the
> copyright wall?  OK increasing amounts of modern literature are open-
> source, but that still leaves a huge gap in coverage.
> >>
> >> Cheers, Dave
> >> --
> >> On 20 Jun 2013, at 12:40, Roderic Page <r.page at bio.gla.ac.uk> wrote:
> >>
> >>> +1
> >>>
> >>> On 20 Jun 2013, at 12:21, Quentin Groom wrote:
> >>>
> >>>> Dear Rod, Donat and others,
> >>>> This is a little off topic, but it seems to me that we need a
> project on the scale of the Human Genome Project to get accurate
> transcriptions and markup of all the legacy text. At the moment there
> are many small scale trials, but to push the cost down and cover the
> vast corpus of literature we need to scale up. I'm sure that OCR could
> be improved and that automatic markup is possible, indeed we need a big
> project to give incentive to innovations on this topic. Perhaps we need
> to start pushing our funders in this direction.
> >>>> Rod, what you're doing is useful as it, at very least, shows us
> what is and might be possible.
> >>>> Regards
> >>>> Quentin
> >>>>
> >>>> Roderic Page wrote:
> >>>>>
> >>>>> I guess I'm struggling to see what we're arguing about.
> >>>>>
> >>>>> I have no issue with publishing structured documents going
> forward, and all power to ZooKeys (and sister journals) for pioneering
> this. The taxonomic literature viewer I built last night is possible
> because of Pensoft.
> >>>>>
> >>>>> However, the vast majority of animal taxa have not been published
> in this way, and until we have described >10^6 new species in
> structured documents, this will always be the case. So, what to do
> about this legacy? How do we tackle that in a way that is scalable? How
> do we do this across all taxa, not just a few select groups (ants are
> cool, but there's a LOT more to life than ants).
> >>>>>
> >>>>> Furthermore, how do we integrate the taxonomic literature with
> the broader biodiversity literature (e.g., ecology, genomics, etc.).
> How do we make taxonomic literature as findable and as accessible? When
> people publish articles in Nature, why is the taxonomic literature
> cited not linked in the same way as the other papers? Why is most of
> the taxonomic literature effectively invisible in the digital age?
> >>>>>
> >>>>> These are the things that motivate me to build things like
> BioNames, which currently has about 3.9 million names, and 400,000
> articles (of which about a quarter I've linked to some form of digital
> identifier). Is it complete, obviously not. Is it a ridiculous thing to
> attempt, of course it is. Is it useful, I hope so.
> >>>>>
> >>>>> Regards
> >>>>>
> >>>>> Rod
> >>>>>
> >>>>> On 20 Jun 2013, at 10:52, Donat Agosti wrote:
> >>>>>
> >>>>>
> >>>>>> Maybe Chuck will know; that's how I recapitulate my exchanges
> along the history of BHL.
> >>>>>>
> >>>>>> I don't say treatment only: we have most of the names already
> done in ants (see HNS), the literature has been referenced, scanned and
> linked to the citations well before BHL existed (Smithsonian supported
> the scanning as their first grant by the Atherton Seidell foundation,
> parallel to Biologia Centrali-Americana and the mosquito group), so the
> next logical step is to dig into content. And that is treatment in the
> content of names (even Linnaeus didn't just supply binomen, but they
> all are linked to a treatment), and materials citations, which are
> children of the treatment (and even Linnaeus had them).
> >>>>>>
> >>>>>> Maybe one should turn it around. You work like crazy to
> discover articles in a massive body of legacy, and then dig out names.
> Essentially we in the ant world have that and want more, and we have
> learned a lesson: this approach is extremely inefficient.  So, the
> question I am really interested in is: how can we avoid making you work
> that much? And the only way is to promote the publishing of structured
> (e.g., semantically enhanced, linked) publications that also include the
> elements that Linnaeus foresaw in his minimalist Systema: names AND
> treatments with some structure (description, ecology/behavior,
> distribution/citations). And that's why there is TaxPub, which needs
> to be promoted by good implementations like those of Pensoft.
> >>>>>>
> >>>>>>
> >>>>>> Donat
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> From: Roderic Page [mailto:r.page at bio.gla.ac.uk]
> >>>>>> Sent: Thursday, June 20, 2013 2:03 PM
> >>>>>> To: Donat Agosti
> >>>>>> Cc: taxacom at mailman.nhm.ku.edu; 'Lyubomir Penev'
> >>>>>> Subject: Re: [Taxacom] A new way to view taxonomic publications
> >>>>>>
> >>>>>> Hi Donat,
> >>>>>>
> >>>>>> Quick comment on "BHL deliberately allows access to single
> pages, because their scientists wanted to be able to link directly to
> the page of the treatment, protologue"
> >>>>>>
> >>>>>> BHL allows access to pages because that's the physical unit they
> scan, and the obvious thing to expose on their web page. They also
> didn't have an easy way of locating articles (which has been one of the
> biggest complaints about BHL). I seriously doubt it was a decision to
> enable people to link to protologues. It's the standard way you expose
> scanned literature.
> >>>>>>
> >>>>>> I get that linking to the actual treatment/nomenclatural event
> is desirable, all I'm arguing is that article-level linking is more
> tractable, and enables a bunch of things that are bigger than simply
> providing access to taxonomic names. The prize is much bigger than
> that.
> >>>>>>
> >>>>>> Regards
> >>>>>>
> >>>>>> Rod
> >>>>>>
> >>>>>> On 20 Jun 2013, at 10:15, Donat Agosti wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> From: Roderic Page [mailto:r.page at bio.gla.ac.uk]
> >>>>>> Sent: Thursday, June 20, 2013 12:34 PM
> >>>>>> To: Donat Agosti
> >>>>>> Cc: taxacom at mailman.nhm.ku.edu com; Lyubomir Penev
> >>>>>> Subject: Re: [Taxacom] A new way to view taxonomic publications
> >>>>>>
> >>>>>> Hi Donat,
> >>>>>>
> >>>>>>
> >>>>>> In the demo you skip one of the nice elements, the treatment
> element. And in my view this is a loss of quality, which you can easily
> see when you start looking up one of the new descriptions, e.g.,
> Paedophryne dekot. If you look it up on the right hand in BioStor, you
> finally end up at the article level and not at the treatment itself, which
> is the relevant information that I need, not the publication. Similarly,
> the PDF that you show in BioStor has lost all the links that are in the
> original PDF.
> >>>>>>
> >>>>>> It would be straightforward to list treatments in the "Contents"
> section. The articles discovered in the reference are typically in PDF
> or image form, but obviously the next step is to display them in the
> same way, where possible.
> >>>>>> DA: this is the huge problem: how to get them out of the PDF
> (text-born, or images)? This right now does not scale to your 10^6.
> And it might also be questionable whether this falls under the big data
> paradigm, since there are not plenty of treatments covering the same
> thing but rather singletons, and thus errors in the conversion play a
> different role?
> >>>>>>
> >>>>>>
> >>>>>> I know, you are not interested in the treatment but names and
> publications. I as a taxonomist in the world you describe, a
> publication being a database, which happens when you convert it into
> XML, and even more so an XML that is domain specific, don't want to be
> stuck in the world of articles, just because we grew up with it. The
> goal is customized information, and that means treatment (in the
> context of a scientific publication). And with Zookeys at hand you have
> all to make this happen.
> >>>>>>
> >>>>>> Finally, most of the functionality is already in the ZooKeys
> article itself when you look at it in the HTML version instead of the
> PDF you link to
> (http://www.pensoft.net/journals/zookeys/article/1963/at-the-lower-size-limit-for-tetrapods-two-new-species-of-the-miniaturized-frog-genus-paedophryne-anura-microhylidae-),
> and even more so in their species profiles
> http://ptp.pensoft.eu/external_details.php?type=1&query=Paedophryne.
> The interesting step then is to get this done for other journals and
> see what it looks like. There, the production of clean OCR or text
> extraction from PDF, and the semantic mark-up, is not done; that is
> served on a silver platter by Pensoft, with a pretty overhead in the
> production of the article.
> >>>>>>
> >>>>>> Two points, one minor, one not so. The minor point is that, yes
> Pensoft displays marked-up HTML, but it is surrounded by lots of
> publisher-specific stuff. This is one reason why PDFs are still so
> popular, publishers can't resist surrounding HTML with junk (logos,
> links, etc.).
> >>>>>>
> >>>>>> The second point is that Pensoft has a platform for their
> journals. Great, but I want a platform that is publisher-agnostic. I
> don't care who publishes the stuff, I want a consistent way to explore
> it. I chose ZooKeys because yes, it is essentially pre-processed, so I
> can focus on just the rendering. But I want PLoS, BMC, SciELO journals
> looking like this, I want Zootaxa to look like this, etc.
> >>>>>> DA: I agree, we want to have an independent site, but we also
> need to have a business model behind (University pays, EU-pays,
> somebody pays). The consistent way is clearly the goal.
> >>>>>>
> >>>>>>
> >>>>>> Then this needs to be compared to other similar sites, like
> Species-ID http://species-id.net/wiki/Paedophryne_dekot, that allows
> the crowd to edit and add content or Plazi, the treatment repository
> http://tinyurl.com/nku2rdd, which allows to get back to the treatment
> and not end up in the article.
> >>>>>>
> >>>>>> So, the challenge has to be that BioNames does not lose
> granularity, and operates at the level we cite (not article but a page
> within an article, essentially linking to the treatment), and to show
> mechanisms to read in PDFs that have no XML, in fact no text, and
> make this fly, something you have demonstrated to find articles within
> the BHL body. This time just one level down.
> >>>>>>
> >>>>>> I realise you want treatments, and I'm not saying I don't, but I
> have to choose the level of granularity that scales across 10^6 names
> and 10^5 publications. I need an infrastructure that enables me to make
> links between names and publications, and that means working at the
> level of articles. This is also where citation networks operate, and
> where links to other kinds of data operate (e.g., links between
> sequences, phylogenies, and publications). The reality is the
> publication is the fundamental unit we keep track of.
> >>>>>>
> >>>>>> DA: I would argue, that this is an artifact. BHL deliberately
> allows access to single pages, because their scientists wanted to be
> able to link directly to the page of the treatment, protologue. All the
> references in HNS are linked to the particular page, not the article.
> Yes, we have to cite, and we measure citations, but this is just a
> historical legacy, similar to how BHL works on journals (and not
> articles, which is what you are digging out with a lot of pain) because
> of constraints outside the taxonomists' control (the libraries have them
> in the stacks this way). All the microcitations point to pages, not
> articles.
> >>>>>>
> >>>>>> Now, there are interesting developments in making publications
> more granular. PLoS has had DOIs for figures for some time, and BMC
> figures are retrospectively having DOIs assigned by figshare (e.g.,
> http://dx.doi.org/10.6084/m9.figshare.34256).
> >>>>>>
> >>>>>> I would argue that if you are serious about treatments being
> citable and discoverable, you'd give them DOIs, so that, for example,
> the treatments within a ZooKeys article would have their own DOIs. It's
> time we started playing the bigger game. It's not about treatments per
> se, it's about linking citable entities.
> >>>>>> DA: We are serious, just not as fast as you. We also decided for
> the time being to make use of stable HTTP URIs for identifiers instead
> of DOIs, not least because the goal is getting this content into the
> semantic web. This follows the route CETAF is taking by using this
> system for their specimen data (see also the discussion on stable
> identifiers https://plus.google.com/u/1/117201190352607228695/posts/RXEwpGWu18o)
> >>>>>>
> >>>>>>
> >>>>>> Cheers
> >>>>>> Donat
> >>>>>>
> >>>>>> Regards
> >>>>>>
> >>>>>> Rod
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Cheers
> >>>>>>
> >>>>>> Donat
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: taxacom-bounces at mailman.nhm.ku.edu [mailto:taxacom-
> bounces at mailman.nhm.ku.edu] On Behalf Of Roderic Page
> >>>>>> Sent: Thursday, June 20, 2013 12:50 AM
> >>>>>> To: taxacom at mailman.nhm.ku.edu com
> >>>>>> Subject: [Taxacom] A new way to view taxonomic publications
> >>>>>>
> >>>>>> I've developed a somewhat experimental viewer for articles in
> the journal ZooKeys which might be of interest. There is a blog post
> here http://iphylo.blogspot.co.uk/2013/06/a-new-way-to-view-taxonomic-publications.html
> >>>>>>
> >>>>>> You can try a live example here:
> http://bionames.org/labs/zookeys-viewer/?doi=10.3897/zookeys.154.1963
> >>>>>>
> >>>>>> This viewer is one of the motivations behind http://bionames.org/
> I'm aiming for a platform where we can embed the taxonomic literature
> and have names and publications seamlessly linked together, enabling us
> to navigate through the primary taxonomic literature in a single place.
> >>>>>>
> >>>>>> Regards
> >>>>>>
> >>>>>> Rod
> >>>>>>
> >>>>>> ---------------------------------------------------------
> >>>>>> Roderic Page
> >>>>>> Professor of Taxonomy
> >>>>>> Institute of Biodiversity, Animal Health and Comparative Medicine
> >>>>>> College of Medical, Veterinary and Life Sciences
> >>>>>> Graham Kerr Building
> >>>>>> University of Glasgow
> >>>>>> Glasgow G12 8QQ, UK
> >>>>>>
> >>>>>> Email: r.page at bio.gla.ac.uk
> >>>>>> Tel: +44 141 330 4778
> >>>>>> Fax: +44 141 330 2792
> >>>>>> Skype: rdmpage
> >>>>>> Facebook: http://www.facebook.com/rdmpage
> >>>>>> Twitter: http://twitter.com/rdmpage
> >>>>>> Blog: http://iphylo.blogspot.com/
> >>>>>> Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
> >>>>>> Wikipedia: http://en.wikipedia.org/wiki/Roderic_D._M._Page
> >>>>>> Citations:
> http://scholar.google.co.uk/citations?hl=en&user=4Z5WABAAAAAJ
> >>>>>> ORCID id: http://orcid.org/0000-0002-7101-9767
> >>>>>>
> >>>>>> _______________________________________________
> >>>>>> Taxacom Mailing List
> >>>>>> Taxacom at mailman.nhm.ku.edu
> >>>>>> http://mailman.nhm.ku.edu/mailman/listinfo/taxacom
> >>>>>>
> >>>>>> The Taxacom Archive back to 1992 may be searched with either of
> these methods:
> >>>>>>
> >>>>>> (1) by visiting http://taxacom.markmail.org/
> >>>>>>
> >>>>>> (2) a Google search specified as:
> site:mailman.nhm.ku.edu/pipermail/taxacom  your search terms here
> >>>>>>
> >>>>>> Celebrating 26 years of Taxacom in 2013.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>> --
> >>>> Dr. Quentin Groom
> >>>> (Botany and Information Technology)
> >>>>
> >>>> National Botanic Garden of Belgium
> >>>> Domein van Bouchout
> >>>> B-1860 Meise
> >>>> Belgium
> >>>>
> >>>> ORCID: 0000-0002-0596-5376
> >>>>
> >>>> Landline; +32 (0) 226 009 20 ext. 364
> >>>> FAX:      +32 (0) 226 009 45
> >>>>
> >>>> E-mail:    quentin.groom at br.fgov.be
> >>>> Skype name: qgroom
> >>>> Website:    http://www.botanicgarden.be/
> >>>
> >>
> >> --
> >> Dr D.McL. Roberts,        Tel: +44 (0)20 7942 5086
> >> ViBRANT Project Manager,
> >> Dept. Life Sciences,
> >> The Natural History Museum,
> >> Cromwell Road,
> >> London        SW7 5BD
> >> Great Britain            Email: dmr at nomencurator dot org
> >> Web page:  http://vbrant.eu/
> >> Web page:  http://scratchpads.eu/
> >> Web page:  http://www.editwebrevisions.info/
> >> --
> >> "You can't just ask customers what they want and then try and give
> it to them.  By the time you get it built, they'll want something new."
> [Steve Jobs, quoted in The Guardian, Technology Section, 25 June 09].
> >> --
> >>
> >>
> >>
> >>
> >>
> >
> >
> 
> 
> 
> 
> 
> 
> 
> 
> ------------------------------
> 
> Message: 3
> Date: Fri, 21 Jun 2013 11:17:53 +0000
> From: stuartf <stuartf at knights.ucf.edu>
> Subject: [Taxacom] david r smith
> To: "entomo-l at listserv.uoguelph.ca" <entomo-l at listserv.uoguelph.ca>,
>        "taxacom at mailman.nhm.ku.edu" <taxacom at mailman.nhm.ku.edu>,
>        "ecn-l at listserv.unl.edu" <ecn-l at listserv.unl.edu>
> Message-ID:
> 
> <E5808E8413228F4384BD4FCE18D3CB1730C47BDB at SN2PRD0710MB372.namprd07.prod
> .outlook.com>
> 
> Content-Type: text/plain; charset="iso-8859-1"
> 
> good morning folks - sorry about the cross post
> 
> my how time flies
> 
> they tell me that david r smith - symphyta - smithsonian has retired
> and i have managed to lose his e-address
> 
> if anyone can supply - please do
> 
> otherwise
> 
> i am interested in contacting folks within the usa who are working with
> tenthredinidae
> 
> thanks in advance
> 
> reply off list server to cut down on the chatter
> 
> cheers!
> 
> rof
> 
> 
> Stuart M Fullerton ROF, Research Associate Arthropod
> 
> Collection (UCFC), Dept. of Biology, University of Central Florida, PO
> 
> Box 162368, Orlando, Florida, 32816-2368, USA. stuartf at knights.ucf.edu
> 
> <mailto:stuartf at pegasus.cc.ucf.edu>(407) 823-6540 (no voice mail),
> http://bugcloset.cos.ucf.edu/
> 
> 
> ------------------------------
> 
> Message: 4
> Date: Fri, 21 Jun 2013 10:10:28 -0400
> From: "Chris Thompson" <xelaalex at cox.net>
> Subject: Re: [Taxacom] david r smith = still active
> To: "stuartf" <stuartf at knights.ucf.edu>,
>        <entomo-l at listserv.uoguelph.ca>,
> <taxacom at mailman.nhm.ku.edu>,
>        <ecn-l at listserv.unl.edu>
> Message-ID: <2AAA1D175F284660BD66660AFF651CCE at ChrisPC>
> Content-Type: text/plain; format=flowed; charset="iso-8859-1";
>        reply-type=original
> 
> David R Smith,
> 
> the long term USDA - Smithsonian specialist on sawflies is alive and
> well.
> 
> His USDA ARS Systematic Entomology Lab e-mail address should still be
> active
> 
> try david.smith at ars.usda.gov
> 
> Cheers
> 
> Chris Thompson
> from home
> 
> _______________________________________________
> Taxacom Mailing List
> Taxacom at mailman.nhm.ku.edu
> http://mailman.nhm.ku.edu/mailman/listinfo/taxacom
> 
> The Taxacom Archive back to 1992 may be searched with either of these
> methods:
> 
> (1) by visiting http://taxacom.markmail.org/
> 
> (2) a Google search specified as:
> site:mailman.nhm.ku.edu/pipermail/taxacom
> your search terms here
> 
> Celebrating 26 years of Taxacom in 2013.
> 
> 
> 
> 
> ------------------------------
> 
> Message: 5
> Date: Fri, 21 Jun 2013 15:05:00 +0000
> From: "Sharkey, Michael J" <msharkey at uky.edu>
> Subject: Re: [Taxacom] david r smith = still active
> To: Chris Thompson <xelaalex at cox.net>, stuartf
>        <stuartf at knights.ucf.edu>,      "entomo-l at listserv.uoguelph.ca"
>        <entomo-l at listserv.uoguelph.ca>,
> "taxacom at mailman.nhm.ku.edu"
>        <taxacom at mailman.nhm.ku.edu>,  "ecn-l at listserv.unl.edu"
>        <ecn-l at listserv.unl.edu>
> Cc: Dave Smith <sawfly2 at aol.com>
> Message-ID:
>        <ADE0C9CF8722084581E6E577699C41DE35803487 at ex10mb02.ad.uky.edu>
> Content-Type: text/plain; charset="us-ascii"
> 
> I correspond with Dr. Smith fairly routinely using the address: Dave
> Smith <sawfly2 at aol.com>
> 
> Dr. Michael Sharkey
> Department of Entomology
> University of Kentucky
> S-225 Ag. Sci. N.
> Lexington, KY 40546-0091
> msharkey at uky.edu
> http://www.sharkeylab.org/
> 
> 
> 
> 
> 
> ------------------------------
> 
> Message: 6
> Date: Fri, 21 Jun 2013 16:49:57 +0100
> From: David.King <David.King at open.ac.uk>
> Subject: Re: [Taxacom] A new way to view taxonomic publications
> To: "taxacom at mailman.nhm.ku.edu" <taxacom at mailman.nhm.ku.edu>
> Message-ID:
>        <60667691F130CD418BD0D4FC512EF91217B79553AF at SALCEYCMS1.open.ac.uk>
> Content-Type: text/plain; charset="us-ascii"
> 
> Hi Donat
> 
> I'm with Rod: BHL OCR can be excellent. In ViBRANT we tried to
> replicate Qin Wei's work with BHL to re-assess how 'bad' the OCR is but
> just couldn't get the same poor quality results.
> 
> In general OCR is very good with body text where there are enough cues
> for the software to work on. Errors still creep in, particularly with
> special characters. For example, English OCR software really doesn't
> like Latin ligatures and tends to mangle the end of taxon names even
> though the other characters in the name are accurately identified.
> Another common problem arises from using male and female symbols in
> text, made worse because the symbols are normally found in the middle
> of a very terse description, often full of abbreviations, and so devoid
> of cues for non-specialist software to follow.
> 
> Indeed, we gave up on some of our experimental ViBRANT work using
> parts-of-speech tagging to identify anomalies in OCR text because the
> OCR was not bad enough.
> 
> Sadly OCR does struggle in two very useful sections of a document: the
> table of contents and the index. Part of the problem lies with a page
> full of 'funny' words not in the software's dictionary ;-) Then there
> are other problems usually to do with layout such as non-aligned
> columns and leading lines which break the OCR accuracy.
> 
> Cheers
> Dauvit
> 
> ---------------------------------------------------------
> 
> > Date: Thu, 20 Jun 2013 14:21:48 +0100
> > From: Roderic Page <r.page at bio.gla.ac.uk>
> > Subject: Re: [Taxacom] A new way to view taxonomic publications
> > To: Donat Agosti <agosti at amnh.org>
> > Cc: taxacom taxacom <taxacom at mailman.nhm.ku.edu>
> > Message-ID: <3D62E585-50CF-451A-BD62-5CCAE0D779B6 at bio.gla.ac.uk>
> > Content-Type: text/plain;    charset=windows-1252
> >
> > Donat,
> >
> > Again, this is repeating myths. The OCR in BHL ranges from excellent
> > in places to crappy in places. If it was uniformly bad we couldn't have
> > indexed it for names, nor would http://biostor.org/ be possible. The
> > quality is variable, but we can quantify this. We don't control the
> > original OCR, but we can always redo bits if we need to.
> >
> > Indexing OCR documents is a well known problem, and there is a wealth
> > of literature of various techniques that can be used (see
> > http://www.mendeley.com/groups/752871/ocr-optical-character-recognition/
> > for an introduction to the literature). Why do we simply
> > say OMG the BHL OCR is bad? Why not be scientific, quantify its
> > quality, and exploit the existing technology to improve things?
> >
> > I am constantly flummoxed by our community's assumption that it knows
> > what is possible, and what the limit of the state of the art is outside
> > its domain. We have barely scratched the surface of what is possible.
> >
> > Regards
> >
> > Rod
> 
> ---------------------------------------------------------
> Roderic Page
> Professor of Taxonomy
> Institute of Biodiversity, Animal Health and Comparative Medicine
> College of Medical, Veterinary and Life Sciences Graham Kerr Building
> University of Glasgow Glasgow G12 8QQ, UK
> 
> Email: r.page at bio.gla.ac.uk
> Tel: +44 141 330 4778
> Fax: +44 141 330 2792
> Skype: rdmpage
> Facebook: http://www.facebook.com/rdmpage
> Twitter: http://twitter.com/rdmpage
> Blog: http://iphylo.blogspot.com/
> Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
> Wikipedia: http://en.wikipedia.org/wiki/Roderic_D._M._Page
> Citations:
> http://scholar.google.co.uk/citations?hl=en&user=4Z5WABAAAAAJ
> ORCID id: http://orcid.org/0000-0002-7101-9767
> 
> 
> 
> --
> The Open University is incorporated by Royal Charter (RC 000391), an
> exempt charity in England & Wales and a charity registered in Scotland
> (SC 038302).
> 
> 
> 
> 
> ------------------------------
> 
> End of Taxacom Digest, Vol 87, Issue 25
> ***************************************
> 
