paddy at eol.org
Wed Sep 16 06:49:16 CDT 2009
Before you can fix problems, you need to know what they are.
This listing serves at least two masters, the taxonomist and the informatician. The informatician must be aware of every 'string' that has been used as a name and that therefore points to some potentially useful piece of information. Within the totality of all strings lies a subset of interest to the taxonomist. Team taxonomy (the sum of all active taxonomists, past and present) has a suite of rules to follow, but within them there remains considerable latitude.
The GN thing is intended to develop into an infrastructure that will serve all users equally well, and because of this it has to be inclusive of different points of view and solutions. It can achieve that through modularity. At the core lies the GNI module, the list of all names. Additional modules (mostly in the form of web services that call upon more specialist listings and projects) can be used, for example, to apply validated taxonomic lists so that only the names that taxonomists like are shown, to provide editing environments that improve the quality of the data, or perhaps to offer parsing algorithms that break names into their component parts and reassemble them into different forms to suit different users. Through both modularity and federation, a names infrastructure can be designed to pick out only those subsets of information that are needed by particular classes of users, and then to present that information in a form that suits individual users.
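To make the parsing idea concrete, here is a minimal sketch in Python of how a module might break raw name strings into component parts and regroup them under a name without authorship. It is not the GNI parser: the regular expression, the function names (parse_name, canonical_form, group_by_canonical) and the sample strings are all assumptions made for illustration, and real names (hybrids, infraspecific ranks, authors such as "de Candolle") would need far more care.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately naive pattern (an assumption, not GNI's rules):
# a capitalised genus, an optional lower-case epithet, and whatever is left
# over is treated as authorship.
NAME_PATTERN = re.compile(
    r"^(?P<genus>[A-Z][a-z]+)"
    r"(?:\s+(?P<epithet>[a-z][a-z-]+))?"
    r"(?:\s+(?P<authorship>.+))?$"
)


def parse_name(name_string):
    """Split a raw name string into genus, epithet and authorship.

    Returns a dict with those three keys (missing parts are None),
    or None if the string does not fit the naive pattern at all.
    """
    match = NAME_PATTERN.match(name_string.strip())
    if match is None:
        return None
    return match.groupdict()


def canonical_form(parts):
    """Reassemble the parsed parts into a name without authorship."""
    return " ".join(p for p in (parts["genus"], parts["epithet"]) if p)


def group_by_canonical(name_strings):
    """Group raw strings under the name they share once authorship is removed.

    Strings that cannot be parsed are kept under their own raw text,
    so nothing is dropped from the underlying list of all strings.
    """
    groups = defaultdict(list)
    for raw in name_strings:
        parts = parse_name(raw)
        key = canonical_form(parts) if parts else raw
        groups[key].append(raw)
    return dict(groups)


if __name__ == "__main__":
    sample = [
        "Homo sapiens Linnaeus, 1758",
        "Homo sapiens L.",
        "Homo sapiens",
        "Bellis perennis L.",
    ]
    for name, strings in group_by_canonical(sample).items():
        print(name, "<-", strings)
```

The point of the sketch is the separation of concerns: the full list of strings stays untouched for the informatician, while a separate layer presents one entry per name for the taxonomist.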
The progress from raw material to a structure that meets all our needs will be a long haul and will take much time, good will, and participation. But the benefits of a biology integrated through a semantic names-based infrastructure make the walk well worthwhile, as do the conversations that accompany the promenade.
----- Original Message -----
From: dipteryx at freeler.nl
To: taxacom at mailman.nhm.ku.edu
Sent: Wednesday, September 16, 2009 4:31:03 AM GMT -05:00 US/Canada Eastern
Subject: Re: [Taxacom] globalnames?
From: Tony.Rees at csiro.au [mailto:Tony.Rees at csiro.au]
Sent: Tue 15-9-2009 22:05
>I will anticipate any response from the "real" GNA/GNI developers
>by pointing out that in their concept, this content (Global Names
>Index) is the "raw" material from which the "cleaned" lists such as
>you describe can be assembled, so that will certainly happen and be
>much more pleasant to look at (cue David Remsen and Rich Pyle...).
>In addition, the application of algorithms and later expert review
>to reconcile spelling errors is happening as we speak - the first
>large scale trials (18.5 million names) being done yesterday, with
>preliminary - if still slightly imperfect - results appearing on
>the results page as you click on any name instance...
>Onwards and upwards (or perhaps: forwards in all directions...)
>Regards - Tony
Oh yes, I am very much aware of the difficulties involved.
Still, I would have expected a project with aims such as
"GNI is a big collection of Scientific Biological Names ...
GNI is meant to collect all Scientific Names created, cited,
misspelled in one place and then make sense of them"
to at least list scientific names, by scientific name (instead
of by text string of scientific name plus authorship). There
are enough problems (see the post by Brian Tindall) without
adding yet another level of error and confusion?
David J Patterson
Senior Taxonomist, EOL
Marine Biological Laboratory
Woods Hole, Massachusetts 02543, USA.
(+) (1) 508 289 7260
dpatterson at mbl.edu