[Taxacom] A romp through an aggregator

Stephen Thorpe stephen_thorpe at yahoo.co.nz
Tue May 25 17:09:29 CDT 2010

Thinking more in terms of "the big picture", rather than dwelling on details, I would suggest that the biggest danger arises from an overt or implied claim by any biodiversity database (GBIF, EOL, COL, Wikispecies, Bob's millipede website, etc.) to be the one true and trustworthy source of such information (implying that the others are inferior and can be dispensed with). What makes Wikispecies different is that it doesn't try to claim reliability by authority, but instead says to the user: here is the information you want, here are the sources of that info, make of it what you will...
Experience suggests that many "serious users" (biosecurity and conservation agencies, governments, etc.) want to be "spoon fed" data that has already been verified and validated by experts, and the "propaganda" associated with GBIF, EOL, etc. seems to imply that they have this covered, but as we have seen, it ain't necessarily so, and the more automated the process, the more likely that errors will slip through, and when errors do get into a closed database, they are very difficult to fix. I put links to GBIF, EOL, etc. on Wikispecies pages so that users can see alternative views, but I don't see many links to Wikispecies on GBIF, etc. pages ...


From: David Remsen (GBIF) <dremsen at gbif.org>
To: Bob Mesibov <mesibov at southcom.com.au>
Cc: TAXACOM <taxacom at mailman.nhm.ku.edu>
Sent: Tue, 25 May, 2010 11:53:40 PM
Subject: Re: [Taxacom] A romp through an aggregator


Thank you for the prompt reply but I don't think I was interpreted  
correctly and it may be that there isn't clarity on what was being  
asked for.  A simple example is a list of all millipede species  
names, their taxonomic status, synonymy and classification in a format  
that enables this information to be re-integrated with other data.      
In terms of a pipeline from where,  I meant from your own database.  
Not the EOL-style web portal or the ALA but the source database (I  
presume yours) where the new information originates or is  
collated.    I would hope that any data that you might provide could  
be accessible in a common format so that getting correct information  
on millipedes is approximately the same as getting correct information  
on siphonophores.  I assume that a bottom-up approach is one that  
puts that sort of capability as close to you as possible and that EoL,  
ALA, AFD, GBIF, and others would come to you for that information and  
not need to pass around outdated copies whose main attraction is that  
they are in a usable format.  That, to me, would be preferred and  
it's what a common infrastructure is intended to achieve.  It might  
appear to have some sort of top down authority if you choose to see it  
that way but it's really that a third party, interested in better  
access to this valuable information has tried to identify potential  
users of it, capture their specific requirements and provide some  
framework for its delivery because they are issues that cross-cut  
taxonomy and take away from what you do best.  I had hoped it was  
complementary but clearly it doesn't come across that way.
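
As a rough illustration of the kind of "common format" described above (names, taxonomic status, synonymy and classification in a form that can be re-integrated with other data), here is a minimal Python sketch. The field names, the pipe-separated classification string and the taxon names are all hypothetical placeholders chosen for illustration; they are not taken from any GBIF, EOL or AFD schema, nor from any real millipede checklist.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class TaxonRecord:
    # Hypothetical field names, chosen only for this sketch.
    scientific_name: str
    status: str          # "accepted" or "synonym"
    accepted_name: str   # same as scientific_name when status is "accepted"
    classification: str  # higher taxa, pipe-separated, highest rank first


def to_csv(records):
    """Serialise records to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(TaxonRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buf.getvalue()


def from_csv(text):
    """Parse CSV text produced by to_csv back into records."""
    return [TaxonRecord(**row) for row in csv.DictReader(io.StringIO(text))]


def synonyms_of(records, accepted_name):
    """List the names recorded as synonyms of the given accepted name."""
    return [
        r.scientific_name
        for r in records
        if r.status == "synonym" and r.accepted_name == accepted_name
    ]


# Placeholder taxa, not real names.
records = [
    TaxonRecord("Examplegenus alpha", "accepted", "Examplegenus alpha",
                "Diplopoda|Exampleorder|Examplefamily"),
    TaxonRecord("Oldgenus alpha", "synonym", "Examplegenus alpha",
                "Diplopoda|Exampleorder|Examplefamily"),
]

text = to_csv(records)
round_tripped = from_csv(text)
```

The point of the sketch is only that the same flat structure could hold millipedes or siphonophores, so a consumer would not need group-specific parsing; a real exchange format would need agreed field definitions and identifiers.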

Putting a PDF of a paper online and telling someone to dig it out  
won't work.  Someone might go to the trouble to do it alright but  
that's how the out-of-date copies get circulated in the first  
place.    We can do better and we can make the more valuable and  
updated work that is done more accessible and useful.

Lastly, when I said I want consistent, comprehensive and fast, I'm  
talking about facilitating access to existing quality data like yours,  
not about cutting corners in the quality process.


On May 25, 2010, at 12:59 PM, Bob Mesibov wrote:

> "How would you approach providing a more direct and consistent  
> pipeline to the sort of data being discussed here in a fashion that  
> scales and actually meets others requirements?"
> Pipeline from where? An EOL-style Web portal? But in this particular  
> case there are already 3 portals: my Millipedes of Australia site,  
> Wikispecies and the Australian Faunal Directory, which will feed  
> into the Atlas of Living Australia (AFD is out of date but I'm  
> currently updating it). Or you could see Google as a portal, since  
> it links to all 3 of these. The generic nomenclator I referred to  
> (published in print in 1971) was updated to 2000 and is available as  
> a downloadable spreadsheet on an NSF-funded PEET website. My MoA  
> site has much more taxonomic info and a locality mapper built on a  
> database of specimen records I maintain. There is a great deal of  
> millipede information already available and more can be accessed by  
> asking specialists, like me, or the diligent German workers who've  
> scanned to PDF the bulk of the millipede literature and who make  
> individual works available for study purposes on request. If the  
> dream is to sit at a computer somewhere and access *all information  
> about every millipede species* at the single click of a mouse  
> - sorry, it ain't going to happen. That pipeline is never going to  
> be built. There aren't enough millipede experts, and they're already  
> far too busy.
> "How can others who are more detached from grass-roots navigate to  
> the sources we need?"
> Same way we always did: library/online searching to the roots  
> (primary sources) and ask-an-expert. But just as before, don't say  
> 'Tell me everything you know about millipedes', which is what the  
> aggregation industry seems to be doing.
> "However, if by "top-down" you mean there is no merit in trying to  
> come up with a set of agreements that allow these data to output in  
> a consistent manner, discoverable in a comprehensive manner, and  
> accessed in as near-real time as possible, then I disagree."
> You want consistent, comprehensive and fast. I want specific answers  
> and I want them to be correct, and I don't care if that takes time.  
> When you sacrifice data quality for speed and consistency you get  
> the horrible example I described.
> -- 
> Dr Robert Mesibov
> Honorary Research Associate
> Queen Victoria Museum and Art Gallery, and
> School of Zoology, University of Tasmania
> Home contact: PO Box 101, Penguin, Tasmania, Australia 7316
> 03 64371195; 61 3 64371195
> Webpage: http://www.qvmag.tas.gov.au/mesibov.html


Taxacom Mailing List
Taxacom at mailman.nhm.ku.edu

The Taxacom archive going back to 1992 may be searched with either of these methods:

(1) http://taxacom.markmail.org

Or (2) a Google search specified as:  site:mailman.nhm.ku.edu/pipermail/taxacom  your search terms here

