[Taxacom] human involvement (was Re: BioNames)

Roderic Page r.page at bio.gla.ac.uk
Tue Jun 4 03:52:42 CDT 2013


Again, it's the genius of 'and' - we do both.

Perhaps Google is a good example. Their search engine returns all manner of stuff, some of it wonderful, some of it crap. We accept this because our experience is that it often will find what we want, even if we have to dig a little. The web without Google would be unusable.

But Google is also building the "clean bucket", namely their knowledge graph (seeded with content from http://www.freebase.com ). Google gives us BOTH in a search result: the classical list of hits on the left and, increasingly, a card on the right summarising information about the thing Google thinks we are searching for.

So, why can't we do this? Why not strive for clean, structured data (i.e., clean names linked to relevant nomenclatural events), but at the same time (AND IN THE SAME PLACE) give people what we currently have so they have a fighting chance of coming away with some information?

Regards

Rod

On 4 Jun 2013, at 09:18, Richard Pyle wrote:

> I completely agree with Lyubo when it comes to "prospective" content.  My
> comments were strictly in relation to retrospective.
> 
> For the retrospective stuff, there is very clearly a need for a two-pronged
> approach:
> 
> 1) Automatic processes to generate a dirty bucket;
> 2) Manual processes (with some semi-automated guidance) to filter & migrate
> the "dirty bucket" into the clean bucket.
> 
> Both resources ("dirty bucket" and "clean bucket") have their value.  The
> former allows for a lot of content much more quickly, with clever algorithms
> plus human brains allowing a human to track down a specific item of
> interest; but in general cannot be relied upon as a "reference" (only a way
> for a human to locate a reference).  The latter ("clean bucket") is much
> better for serving as a reference; but is much more painstaking to populate
> for retrospective content.
> 
> In any case, as with most things in life, the best solution involves a
> mixture of tactics.  Being a clean-bucket sort of guy myself, I appreciate
> the value and reward of crowd-sourced manual clean-up of retrospective
> content.
> 
> Rich
> 
>> -----Original Message-----
>> From: taxacom-bounces at mailman.nhm.ku.edu [mailto:taxacom-
>> bounces at mailman.nhm.ku.edu] On Behalf Of Lyubomir Penev
>> Sent: Monday, June 03, 2013 9:31 PM
>> To: Doug Yanega; Taxa com
>> Subject: Re: [Taxacom] human involvement (was Re: BioNames)
>> 
>>> The answer is simple: less automated and more human involvement in the
>>> aggregation process, combined with an efficient and transparent system for
>>> feedback regarding errors, by anyone who notices them (preferably with the
>>> history all logged and publicly archived) ...
>>> 
>> 
>> I would say both automated and human-driven linking should co-exist in
>> parallel. This is the approach we have taken in the currently launched
>> Pensoft Writing Tool (PWT) <http://pwt.pensoft.net>. Automated linking
>> exists, but it does not prevent authors from adding several dedicated links
>> for a taxon name to sources they would like to cite, e.g., pages on BHL or
>> articles in BioNames, where a taxonomic or nomenclatural act has been either
>> first published or subsequently revised. Moreover, authors can comment on a
>> particular act in an additional field associated with the link.
>> 
>> Regards,
>> Lyubomir
>> 
>> 
>> On Tue, Jun 4, 2013 at 1:13 AM, Doug Yanega <dyanega at ucr.edu> wrote:
>> 
>>> On 6/3/13 1:57 PM, Stephen Thorpe wrote:
>>>> The answer is simple: less automated and more human involvement in the
>>>> aggregation process, combined with an efficient and transparent system
>>>> for feedback regarding errors, by anyone who notices them (preferably
>>>> with the history all logged and publicly archived) ...
>>>> 
>>> Many of us have been saying this for many years now.
>>> 
>>> This approach, however, faces an extreme challenge from either side of
>>> the equation: (1) you can't actually design a proposal where ALL of
>>> the necessary labor would be funded, because that would require hiring
>>> thousands of people (some of you will recall the "All-Species"
>>> initiative, which promised to do just that), and (2) if the labor is
>>> all unpaid (an approach which, in essence, has been/is being tried),
>>> then any such proposal won't offer much incentive (or guidance) to the
>>> volunteers - and good luck finding several thousand skilled volunteers
>>> who all have access to all of the necessary original literature.
>>> 
>>> Realistically, one either needs to improve the incentive for
>>> participation in the system (such as making participation effectively
>>> mandatory, as happens with GenBank), or find a way to fund a large but
>>> manageable number of experts who can then coordinate and oversee
>>> volunteers within a discipline (or do much of the work themselves,
>>> full-time). On my more cynical days, it occurs to me that no one is
>>> interested in the latter approach because, if the taxasphere was
>>> divvied up into equal parcels (e.g., having 200 experts responsible
>>> for, say, no fewer than 10,000 taxa apiece), at least 75% of the
>>> positions would go to invertebrate taxonomists.
>>> 
>>> Sincerely,
>>> 
>>> --
>>> Doug Yanega      Dept. of Entomology       Entomology Research Museum
>>> Univ. of California, Riverside, CA 92521-0314     skype: dyanega
>>> phone: (951) 827-4315 (disclaimer: opinions are mine, not UCR's)
>>>              http://cache.ucr.edu/~heraty/yanega.html
>>>   "There are some enterprises in which a careful disorderliness
>>>         is the true method" - Herman Melville, Moby Dick, Chap. 82
>>> 
>>> 
>>> _______________________________________________
>>> Taxacom Mailing List
>>> Taxacom at mailman.nhm.ku.edu
>>> http://mailman.nhm.ku.edu/mailman/listinfo/taxacom
>>> 
>>> The Taxacom Archive back to 1992 may be searched with either of these
>>> methods:
>>> 
>>> (1) by visiting http://taxacom.markmail.org
>>> 
>>> (2) a Google search specified as:
>>> site:mailman.nhm.ku.edu/pipermail/taxacom  your search terms here
>>> 
>>> Celebrating 26 years of Taxacom in 2013.
>>> 
>> 
>> 
>> 
>> --
>> Dr. Lyubomir Penev
>> Managing Director
>> Pensoft Publishers
>> 13a Geo Milev Street
>> 1111 Sofia, Bulgaria
>> Fax +359-2-8704282
>> www.pensoft.net
>> Services for scientific projects:
>> http://www.pensoft.net/services-for-scientific-projects <http://www.pensoft.net/projects>
>> Services for journals: http://www.pensoft.net/services-for-journals
> 
> 

---------------------------------------------------------
Roderic Page
Professor of Taxonomy
Institute of Biodiversity, Animal Health and Comparative Medicine
College of Medical, Veterinary and Life Sciences
Graham Kerr Building
University of Glasgow
Glasgow G12 8QQ, UK

Email: r.page at bio.gla.ac.uk
Tel: +44 141 330 4778
Fax: +44 141 330 2792
Skype: rdmpage
Facebook: http://www.facebook.com/rdmpage
Twitter: http://twitter.com/rdmpage
Blog: http://iphylo.blogspot.com
Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
Wikipedia: http://en.wikipedia.org/wiki/Roderic_D._M._Page
Citations: http://scholar.google.co.uk/citations?hl=en&user=4Z5WABAAAAAJ
ORCID id: http://orcid.org/0000-0002-7101-9767



