[Taxacom] Towards a consensus higher classification of organisms (was: List of Orders of the world), misspellings, etc...

Roderic Page r.page at bio.gla.ac.uk
Mon Jun 23 06:45:25 CDT 2008


Dear Paul,

> So how does one exploit Elsevier, Thomson Reuters (et al.) within a
> legal framework ... it appears to me that the reverse is the case
> right now ... ;-)

I think this is the key issue, and exploring ways to do this that  
Elsevier won't feel threatened by would be one way forward. In many  
ways, what we need is not so much the full text, but an index to the  
full text. Botanists are fortunate to have developed such a service  
(IPNI) independently of a commercial enterprise -- zoologists have not  
been so lucky (Zoological Record).
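
As a rough illustration of what an "index to the full text" could mean in practice, here is a minimal sketch. It is not how IPNI or Zoological Record are implemented; the class names, record fields, example names, journals, and the example.org URL are all invented for illustration. The point is only that the index stores where a name was published, not the text itself.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Citation:
    """Where a name was published (all fields are illustrative)."""
    journal: str
    volume: str
    page: int
    year: int
    url: Optional[str] = None  # link to a scan or article, if one is known


class NameIndex:
    """Maps a taxonomic name to its citations, without holding any full text."""

    def __init__(self) -> None:
        self._entries = defaultdict(list)

    @staticmethod
    def _key(name: str) -> str:
        # Normalise case and whitespace so lookups are forgiving.
        return " ".join(name.lower().split())

    def add(self, name: str, citation: Citation) -> None:
        self._entries[self._key(name)].append(citation)

    def lookup(self, name: str) -> list:
        # .get() avoids creating an empty entry for names we have never seen.
        return list(self._entries.get(self._key(name), []))


# Hypothetical records, invented purely for illustration.
index = NameIndex()
index.add("Aus bus", Citation("Journal of Examples", "12", 345, 1901,
                              "http://example.org/joe/12/345.jpg"))
index.add("Aus  BUS", Citation("Example Annals", "3", 17, 1950))

for c in index.lookup("aus bus"):
    print(c.journal, c.volume, c.page, c.url or "(no link yet)")
```

An index like this can point at a subscription-only article just as easily as at an open page image, which is why it need not threaten the publisher holding the full text.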

Regards

Rod




On 23 Jun 2008, at 09:22, Paul Kirk wrote:

> 1 - but critical to ecosystem functions which support all life on  
> earth... ;-)
> 2 - horses for courses - readable JPGs are a virtual library not
> designed for text mining; JPGs and fully OCR'd, corrected PDFs are
> both required for what we want to do
> 3 - it is inefficient so please, someone ask/tell/make JSTOR (et  
> al.) drop the requirement for subscriptions otherwise Cyberliber and  
> BHL (et al.) will duplicate what JSTOR (Google, et al.) have already  
> done so everything is open
>
> So how does one exploit Elsevier, Thomson Reuters (et al.) within a
> legal framework ... it appears to me that the reverse is the case
> right now ... ;-)
>
> Paul
>
> From: Roderic Page [mailto:r.page at bio.gla.ac.uk]
> Sent: 21 June 2008 14:41
> To: Taxacom
> Cc: Paul Kirk
> Subject: Re: [Taxacom] Towards a consensus higher classification
> of organisms (was: List of Orders of the world), misspellings, etc...
>
> Cyberliber is a nice resource, but:
>
> 1. Fungi are a small fraction of life (see
> http://darwin.zoology.gla.ac.uk/~rpage/tm/ , which uses a log scale --
> http://darwin.zoology.gla.ac.uk/~rpage/tm/nolog.php is a better
> reflection).
>
> 2. JPG files don't permit easy text mining (to extract information  
> on localities, synonymies, literature, for example).
>
> 3. It seems inefficient to ignore the digitisation efforts underway  
> by publishers and other digital repositories.
>
> Yes, JSTOR is closed (i.e., subscription based), but given the  
> choice between JSTOR and nothing, I'd choose JSTOR. Some publishers  
> are digitising back issues of publications. The Royal Society of  
> London has digitised content back to the 18th century. Blackwell's  
> is digitising society journals.
>
> Let's try to exploit all the resources that are available to us.
>
> Regards
>
> Rod
>
>
>
> On 21 Jun 2008, at 09:12, Paul Kirk wrote:
>
>> At the risk of being accused of repeating myself ... Index Fungorum  
>> has an active policy of linking names to digitized literature in  
>> the form of simple jpg images, via resolvable URLs, which any  
>> browser can display without the need to use obtrusive 'plug-ins'.
>> Cyberliber has about 150,000 such jpg files and IF links to about  
>> 30,000 of these which are pages where names have been published. A  
>> further 250,000 names have links to page images from the scanned  
>> printed indexes which have documented names of fungi for the last  
>> 100 or so years.
>>
>> JSTOR is 'closed' because it's subscription only so not much use to  
>> most of the world; BHL has an over-engineered human interface for  
>> fingers on keyboards and no URLs which simply resolve to a jpg.
>>
>> If we keep it simple it is a rather trivial exercise to add links to
>> literature ... if we make it whizzy for the human at the keyboard
>> it becomes either a 'non trivial exercise' or it's impossible.
>>
>> Paul
>>
>> From: taxacom-bounces at mailman.nhm.ku.edu on behalf of Roderic Page
>> Sent: Fri 20/06/2008 11:27
>> To: Taxacom
>> Subject: Re: [Taxacom] Towards a consensus higher classification
>> of organisms (was: List of Orders of the world), misspellings, etc...
>>
>> One obstacle to points 1 and 2 below is the accessibility of the
>> literature. Existing efforts to catalogue names (such as the
>> Catalogue of Life) treat literature as a second class citizen. I
>> think the value of such efforts would be greatly enhanced if the
>> names were linked to literature so that anybody could check the
>> original spelling, etc.
>>
>> Of course, this depends on having that literature in digital form.
>> But it also depends on being able to find that digital version of
>> the publication. Journal publishers, archival projects such as
>> JSTOR, and institutional repositories are continually increasing the
>> number of taxonomic publications that are online (not to mention
>> BHL's activities).
>>
>> However, there is a disconnect between how many nomenclatural
>> databases treat literature (often referring not to the article, but
>> to a specific page in an article) and how digital publishers refer
>> to articles (the article itself is the smallest unit).
>>
>> This means that Anders Silfvergrip's
>> (http://markmail.org/message/27jw6g4owltpd252) attempt to find out
>> the largest number of names described in a single paper using the
>> Catalogue of Life was doomed from the start. Many references in CoL
>> are to pages within an article, not to the enclosing article itself.
>>
>> Hence, an unintended consequence of how many nomenclatural databases
>> handle literature is that linking to the burgeoning digital
>> literature is going to be a non trivial exercise.
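
The page-versus-article mismatch described in the quoted paragraphs above can be sketched roughly as follows. This assumes the publisher side exposes first and last page numbers for each article, which is an assumption, and every record here (article ids, journal, names) is invented for illustration. Given those page ranges, a page-level citation can be rolled up to its enclosing article, and names can then be counted per article.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Article:
    """Article-level record, as a publisher might expose it (hypothetical)."""
    article_id: str
    journal: str
    volume: str
    first_page: int
    last_page: int


def enclosing_article(journal, volume, page, articles) -> Optional[Article]:
    """Return the article whose page range contains the cited page, if any."""
    for art in articles:
        if (art.journal == journal and art.volume == volume
                and art.first_page <= page <= art.last_page):
            return art
    return None


# Invented example data: two articles and four page-level name citations.
articles = [
    Article("art-1", "Journal of Examples", "12", 1, 40),
    Article("art-2", "Journal of Examples", "12", 41, 60),
]
page_citations = [
    ("Aus bus", "Journal of Examples", "12", 5),
    ("Aus cus", "Journal of Examples", "12", 17),
    ("Dus eus", "Journal of Examples", "12", 44),
    ("Fus gus", "Journal of Examples", "99", 3),  # no matching article
]

names_per_article = Counter()
unresolved = []
for name, journal, volume, page in page_citations:
    art = enclosing_article(journal, volume, page, articles)
    if art is None:
        unresolved.append(name)
    else:
        names_per_article[art.article_id] += 1

print(names_per_article.most_common(1))
print(unresolved)
```

The sketch matches journal titles by exact string equality; with real data, reconciling abbreviated and variant journal titles is likely to be the harder part of the job.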
>>
>> We don't make things easy for ourselves...
>>
>> Regards
>>
>> Rod
>>
>>
>>
>>
>>
>>
>> >
>> > Why not expand those hardly attractive report-error-to-the-webmaster
>> > functions by displaying review details received from expert users? I
>> > imagine such annotations could enormously increase the credibility
>> > of online biodiversity information projects.
>> >
>> > 1) Source verification: spelling of name, authorship and date have
>> > been compared with the original publication. Show name of person(s)
>> > who did the check.
>> > 2) nomenclatural status assessment: available/ validly published or
>> > not, with annotations on details (reasons for unavailability,
>> > homonymy, etc.). Name of person(s) who did the check.
>> > 3) taxonomic status: valid/ accepted or not, subjective synonymy,
>> > hints on alternative classifications (if needed), etc. Name of
>> > expert(s) who provided the information.
>> >
>> > In this way, experts could take control and show responsibility
>> > for the data, an important step from more or less anonymous machine
>> > work to expert-controlled work, IMHO.
>> > No?
>> >
>> > Cheers,
>> > Wolfgang
>> >
>> > ------------------------------------------
>> >
>> > Wolfgang Lorenz
>> > Hoermannstr.4
>> > D-82327 Tutzing, Germany
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > -----Original Message-----
>> > From: Tony.Rees at csiro.au
>> > To: taxacom at mailman.nhm.ku.edu
>> > Sent: Sat., 14 Jun. 2008, 9:10
>> > Subject: Re: [Taxacom] Towards a consensus higher classification of
>> > organisms (was: List of Orders of the world), misspellings, etc...
>> >
>> > Brian Tindall wrote:
>> >
>> >> While it is easy to exchange information on the Internet it is
>> >> problematic to work out what is reliable. A quick search of
>> >> iSpecies, for example for "Jonesia", gives me reference to a
>> >> prokaryote name, pictures of flowers and most references listed
>> >> are to the prokaryote taxon - Jonesia Brady, 1866 is also an
>> >> ostracod, according to ION. We all know that curating even a
>> >> nomenclatural database is expensive and time consuming, and this
>> >> rises exponentially as the number of links to other data sets
>> >> increases.
>> >
>> > Inadvertently I am sure, you have hit on precisely one problem
>> > that IRMNG (and the super-mega-system of which David is dreaming -
>> > in the nicest possible way) is able to address - that of
>> > disambiguation of genus level homonyms, with associated (if
>> > sometimes imperfect) higher taxonomy information, and
>> > habitat/extant flags to boot. If you type exactly that query into
>> > IRMNG, i.e. genus = jonesia, as of now you will get a range of all
>> > (or at least most) "Jonesia" options, plus a swag of near matches
>> > for good measure:
>> >
>> > -------------------
>> >
>> > Genus name entered: jonesia
>> >
>> > (Genus   ...   Family   ...   source   ...   Kingdom-Order   ...   is synonym?)
>> >
>> > Jonesia Brady, 1866 (2)
>> > Bythocytheridae    SN2000  Animalia-Podocopida
>> >
>> > Jonesia M. Bizot, R.B. Pierrot & T. Pocs, 1974 (0)
>> >  Funariaceae   Index Nominum Genericorum   Plantae-Funariales   S
>> >
>> > Jonesia Roxburgh, 1795 (12)
>> >  Fabaceae  CoL2006 Plantae-Fabales
>> >
>> > Jonesia (1)
>> >  Jonesiaceae   SN2000/Garrity et al., 2001   Bacteria-Bacteria (unallocated)
>> >
>> > Genus nearest matches: Jonesea (Animalia-Strophomenida) , Jonesina
>> > (Animalia-Palaeocopida) , Onesia (Animalia-Diptera)
>> > Other genus near matches: Jainesia R.G. Fragoso & R. Ciferri, 1925
>> > (Fungi-Hyphomycetes (unallocated)) , Jamesia C.G.D. Nees in
>> > Wied-Neuwied, 1840 (Plantae-Asterales) , Jamesia J. Torrey & A. Gray,
>> > 1840 (Plantae-Rosales) , Jamesia Rafinesque, 1832 (Plantae-Fabales) ,
>> > Jamesia (Animalia-Coleoptera) , Janasia Rafinesque, 1838
>> > (Plantae-Scrophulariales) , Janischia Grunow in Van Heurck, 1883
>> > (Plantae-Bacillariophyceae (unallocated)) , Jansia Penzig, 1899
>> > (Fungi-Phallales) , Janusia A.H.L. Jussieu ex Endlicher, 1840
>> > (Plantae-Polygalales) , Janusia (Animalia-Araneae) , Jensia B.G.
>> > Baldwin, 1999 (Plantae-Asterales) , Joannesia Vellozo, 1798
>> > (Plantae-Euphorbiales) , Joannisia (Animalia-Diptera) , Jonesius
>> > Sankarankutty 1962 (Animalia-Decapoda) , Joosia G.K.W.H. Karsten,
>> > 1859 (Plantae-Rubiales)
>> >
>> > --------------
>> >
>> > Now some of this may have errors in the higher classification, as
>> > we have established, but probably nothing too mission-critical.
>> >
>> > From the above you can either follow links to the included species
>> > as I have been able to locate them thus far, and search on these
>> > (e.g. on iSpecies or whatever is your taste), or maybe search on
>> > your preferred instance of "Jonesia" *plus* the authority, or the
>> > genus *plus* (e.g.) Kingdom or order, any of which is better than
>> > searching on the genus name alone. Also (omitted from the above for
>> > clarity), IRMNG will tell you that of these, only the first is a
>> > marine genus, and that all are extant but that nos. 1 and 3 also
>> > have fossil representatives.
>> >
>> > The key point of all the above is that without IRMNG, or something
>> > that does an equivalent job, there is no one place where all of this
>> > information PLUS associated classifications is pulled together. In
>> > this instance, Jonesia #s 1, 3 and 4 do occur in the Catalogue of
>> > Life (we got lucky this time), but without any associated genus
>> > authors, while Jonesia #2 does not. Jonesia #s 2 and 3 occur in
>> > Index Nominum Genericorum, but #s 1 and 4 do not (since they are
>> > not within that list's scope). Jonesia #1 contains 2 species
>> > currently in IRMNG, one of which occurs in the Catalogue of Life and
>> > the other does not (it is in the NW Atlantic Marine Species Register
>> > held at VLIZ, of which the custodians graciously gave me a copy,
>> > along with 17 of their other databases).
>> >
>> > Of course the situation will be repeated again with the 18 "near
>> > match" Jonesias identified by my TAXAMATCH algorithm, which you
>> > might also have meant if you were not a very good typist.
>> >
>> > By interrogating the species level as well, an agency such as OBIS
>> > (the stimulus for the conception behind this work) will also be able
>> > to tell that "Jonesia simplex" is marine (an ostracod), therefore
>> > its distribution data fall within their remit, while "Jonesia
>> > confusa" (the higher plant) is not, from the name alone, by a simple
>> > web or machine level query.
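
The species-level check described in the quoted paragraph above might look something like this as a machine-level query. It is only a toy: the lookup table holds nothing but the two examples given in the message, and the field and function names are hypothetical and do not reflect IRMNG's or OBIS's real schema or API.

```python
from typing import Optional

# Toy lookup table built only from the two examples in the quoted message;
# the field names are hypothetical and do not reflect IRMNG's real schema.
SPECIES_FLAGS = {
    "jonesia simplex": {"group": "Animalia (ostracod)", "marine": True},
    "jonesia confusa": {"group": "Plantae (higher plant)", "marine": False},
}


def is_marine(scientific_name: str) -> Optional[bool]:
    """Return True/False if the name is known, or None if it is not listed."""
    key = " ".join(scientific_name.lower().split())
    record = SPECIES_FLAGS.get(key)
    return None if record is None else record["marine"]


def within_marine_remit(names):
    """Keep only the names flagged as marine; unknown names are set aside."""
    accepted, unknown = [], []
    for name in names:
        flag = is_marine(name)
        if flag is None:
            unknown.append(name)
        elif flag:
            accepted.append(name)
    return accepted, unknown


accepted, unknown = within_marine_remit(
    ["Jonesia simplex", "Jonesia confusa", "Jonesia"]
)
print(accepted)
print(unknown)
```

The bare genus "Jonesia" lands in the unknown list on purpose: because the genus name is a homonym, the habitat flag can only be decided once the species (or the authority) is known.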
>> >
>> > This power (for want of a better word) is what continues to make me
>> > believe in the usefulness of bringing these resources together into
>> > a seamless (real or virtual) collection, for single point of query
>> > as well as compilation of whatever ancillary info might be useful
>> > for a particular requirement. Will the EOL, or GBIF, do this for us
>> > in the future? Possibly, but not today at any rate. Can we do it
>> > better? Undoubtedly. Who should really be doing it? I'm not sure,
>> > actually...
>> >
>> > Hope this helps give a concrete illustration of where I am coming
>> > from, how far I have currently got, and (probably) also the distance
>> > still left to travel.
>> >
>> > Regards - Tony
>> >
>> >
>> > _______________________________________________
>> > Taxacom mailing list
>> > Taxacom at mailman.nhm.ku.edu
>> > http://mailman.nhm.ku.edu/mailman/listinfo/taxacom
>>

---------------------------------------------------------
Roderic Page
Professor of Taxonomy
DEEB, FBLS
Graham Kerr Building
University of Glasgow
Glasgow G12 8QQ, UK

Email: r.page at bio.gla.ac.uk
Tel: +44 141 330 4778
Fax: +44 141 330 2792
AIM: rodpage1962 at aim.com
Facebook: http://www.facebook.com/profile.php?id=1112517192

http://iphylo.blogspot.com
http://taxonomy.zoology.gla.ac.uk/rod/rod.html