[Taxacom] Data query

Bob Mesibov mesibov at southcom.com.au
Mon Jun 24 20:13:46 CDT 2013

Stephen Thorpe wrote:

"You mean until we have one place where anyone can "just do it" and add missing data at their leisure with minimum fuss, to help build a comprehensive catalogue of world biota ... oh, wait! We do have such a place ... Wikispecies!"

Tony Rees wrote:

"Also, most informatics persons (biological data specialists) would probably contend that there are more appropriate data structures than wikispecies for bulk importing, internal data management and review, bulk queries (including machine-machine as well as human), and bulk export of relevant content (which is why the bulk of present taxonomic information resides in databases, with web pages as by-products, rather than in web pages as their native format)."

I'm having trouble understanding Tony's argument. Wikispecies is not primarily a data storage and management structure. It's the equivalent of the 'web pages as by-products' made from databases, a way to make the results of taxonomic activity widely known. Stephen builds web pages by hand, databasers export web pages from their databases, but in both cases the information put online comes from the taxonomic literature, yes? From the point of view of the Web user looking at the information, there's no in-principle difference. Wikispecies is less complete than some databases, but on the other hand Wikispecies is often more up to date, and sometimes more accurate.

It's not a criticism of Wikispecies to say that it's no good for 'bulk importing, internal data management and review, bulk queries (including machine-machine as well as human), and bulk export of relevant content'. That's like criticising cars because they don't fly like airplanes do. But they get you from A to B just the same.

It's also not a criticism of database managers to say that they don't allow just anyone to edit their data, the way Wikispecies does, at the Web-output stage. Database managers ask that users suggest their edits 'off-Web', so the database can be changed, then the changes exported to the Web. It's a different mechanism for editing.

Which brings me back to that first post of mine, which pushed some buttons I hadn't meant to push. What *are* appropriate data structures for storing the complex relationships involved in taxonomic, nomenclatural and bibliographic data? And are those structures capable of being marked up on a webpage? If so, then the gap between Wikispecies and Big Databases disappears. The marked-up page can be generated by a database, or built/edited by hand.
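
To make the 'marked-up page' idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `taxon-name` class, the `data-name-id` and `data-synonym-of` attributes, the record fields and the placeholder names 'Aus bus' and 'Aus cus' are all invented, and none of this is Wikispecies markup or any existing standard. It only shows that the same structured record can be recovered from a fragment exported by a database and from a fragment typed by hand.

```python
from html import escape
from html.parser import HTMLParser


def render_name(record):
    """Database route: export one record (a dict) as a marked-up HTML fragment."""
    return (
        '<span class="taxon-name" data-name-id="{id}" '
        'data-synonym-of="{syn}">{label}</span>'
    ).format(
        id=escape(record["id"], quote=True),
        syn=escape(record["synonym_of"], quote=True),
        label=escape(record["label"]),
    )


class NameExtractor(HTMLParser):
    """Collect every <span class="taxon-name"> on a page as a dict."""

    def __init__(self):
        super().__init__()
        self.records = []
        self._open = None  # the record being read, if we are inside a span

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span" and attrs.get("class") == "taxon-name":
            self._open = {
                "id": attrs.get("data-name-id"),
                "synonym_of": attrs.get("data-synonym-of"),
                "label": "",
            }

    def handle_data(self, data):
        if self._open is not None:
            self._open["label"] += data

    def handle_endtag(self, tag):
        if tag == "span" and self._open is not None:
            self.records.append(self._open)
            self._open = None


def extract(page):
    """Return the structured records found in a page of HTML."""
    parser = NameExtractor()
    parser.feed(page)
    parser.close()
    return parser.records


# Hypothetical record; 'Aus bus' and 'Aus cus' are placeholder names.
record = {"id": "name:2", "synonym_of": "name:1", "label": "Aus cus"}

# Route 1: fragment generated from the record.
generated = render_name(record)

# Route 2: fragment written by hand, with surrounding prose.
hand_written = (
    '<p>Junior synonym: <span class="taxon-name" data-name-id="name:2" '
    'data-synonym-of="name:1">Aus cus</span></p>'
)

same_data = extract(generated) == extract(hand_written) == [record]
```

Under these assumptions `same_data` is true: a harvester reading the page cannot tell which route produced it, which is the sense in which the gap would disappear.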

I asked if anyone had experience with graph databases because they seem to me to be the logical way to store and manipulate objects ('nodes' = names, authors, publications, type specimens...) and relationships ('published by', 'cited by', 'synonym of'...). Doing this with an RDBMS and joins seems to be out of the question for anything but very simple cases.
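
As a rough illustration of the nodes-and-relationships idea, here is a small in-memory sketch in Python. It is not a real graph database and does not use any graph database's API. The `TaxonGraph` class, the node identifiers, the relationship labels and the names 'Aus bus', 'Aus cus' and 'Aus dus' are all made up for the example.

```python
class TaxonGraph:
    """Toy graph: nodes with properties, plus typed, directed edges."""

    def __init__(self):
        self.nodes = {}  # node id -> dict of properties
        self.edges = {}  # node id -> list of (relationship, target id)

    def add_node(self, node_id, kind, **props):
        self.nodes[node_id] = {"kind": kind, **props}

    def add_edge(self, source, relationship, target):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both ends of an edge must be existing nodes")
        self.edges.setdefault(source, []).append((relationship, target))

    def follow(self, start, relationship):
        """Follow one relationship type as far as it goes.

        Returns the ids of the nodes reached, nearest first, and never
        visits a node twice, so a cycle in the data cannot loop forever.
        """
        reached, seen, queue = [], {start}, [start]
        while queue:
            current = queue.pop(0)
            for rel, target in self.edges.get(current, []):
                if rel == relationship and target not in seen:
                    seen.add(target)
                    reached.append(target)
                    queue.append(target)
        return reached


# Hypothetical data: three names, one publication, one author.
g = TaxonGraph()
g.add_node("name:1", "name", label="Aus bus")
g.add_node("name:2", "name", label="Aus cus")
g.add_node("name:3", "name", label="Aus dus")
g.add_node("pub:1", "publication", label="A hypothetical revision")
g.add_node("author:1", "author", label="A. Example")

g.add_edge("name:3", "synonym_of", "name:2")
g.add_edge("name:2", "synonym_of", "name:1")
g.add_edge("name:1", "published_in", "pub:1")
g.add_edge("pub:1", "published_by", "author:1")

# Walk the synonymy chain from 'Aus dus' however long it is.
chain = g.follow("name:3", "synonym_of")
chain_labels = [g.nodes[n]["label"] for n in chain]
```

Here `chain` comes out as `["name:2", "name:1"]`. The point of the sketch is that the traversal does not need to know in advance how many 'synonym of' steps there are. In a table-and-join design, each extra step would typically mean another self-join or a recursive query.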

Dr Robert Mesibov
Honorary Research Associate
Queen Victoria Museum and Art Gallery, and
School of Agricultural Science, University of Tasmania
Home contact:
PO Box 101, Penguin, Tasmania, Australia 7316
(03) 64371195; 61 3 64371195
