[Taxacom] Data query
dpwijesinghe at yahoo.com
Sun Jun 23 07:26:01 CDT 2013
Some outstanding examples of authoritative dedicated ('clean bucket') websites:
Many more included here:
I for one fail to see the point of 'mega-projects' using copied, re-copied and 'machine-read' data.
D. P. Wijesinghe
dpwijesinghe at yahoo.com
From: Bob Mesibov <mesibov at southcom.com.au>
To: TAXACOM <taxacom at mailman.nhm.ku.edu>
Sent: Sunday, June 23, 2013 4:29 AM
Subject: [Taxacom] Data query
David Campbell wrote (and I'm glad he did):
"Identification of bucket quality will tend to clash with the self-interest of promoting how great one's project is or how well the grand promises of a grant application have been fulfilled."
However, those of us building clean buckets independently (what I've been calling 'bottom-up' resources here on Taxacom) face the same 'meta' problem that the dirty-bucket acronyms do, namely how to make the data in our clean buckets easily retrievable and manipulable, in our case by interested people other than dirty-bucket-builders.
Plain, garden-variety webpages are excellent outlets for clean buckets, especially taxonomic information. However, webpages have been justifiably called 'silos' by those in the acronym industry, because the taxonomic/nomenclatural/bibliographic data on the page are typically related only by text structure or formatting. This means the relationships between data items on the page are almost entirely *interpreted* in the mind of the reader; they're not *explicitly* on the page.
One way to improve clean-bucket webpages would be to mark them up with XML or microformats. The markup is invisible in the Web browser, but structured data is harvestable from the underlying text file, the 'real' webpage. Problem is, there isn't now and is unlikely to ever be a simple 'DarwinCore' scheme for taxonomic/nomenclatural/bibliographic data that will make all of us clean-bucket builders comfortable. [I've got my own XML scheme for millipede information (not yet on my website), but it's idiosyncratic and I can't imagine the spider people would like it - not an <allotype> in sight. And there isn't yet a librarian-style 'authority file' that all us clean-bucket-builders can refer to when citing a particular taxonomic reference - wouldn't that be nice!]
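As a rough illustration of the 'invisible markup, harvestable data' idea, here is a minimal Python sketch using only the standard library. The class names (taxon, name, author, year) and the placeholder taxon are invented for the example and are not an existing microformat or community standard; the sketch also assumes well-formed markup without void elements such as <br>.

```python
from html.parser import HTMLParser

# Hypothetical page fragment. The class names are assumptions made up for
# this sketch; a browser would render this as ordinary formatted text.
PAGE = """
<p class="taxon">
  <i class="name">Aus bus</i>
  <span class="author">Smith</span>,
  <span class="year">1900</span>
</p>
"""


class TaxonHarvester(HTMLParser):
    """Collect the text of elements whose class is one of FIELDS."""

    FIELDS = {"name", "author", "year"}

    def __init__(self):
        super().__init__()
        self.records = []   # one dict per class="taxon" element
        self._stack = []    # class attribute of each currently open element

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class") or ""
        self._stack.append(cls)
        if cls == "taxon":
            self.records.append({})

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Only keep text sitting directly inside a marked-up field.
        if self._stack and self._stack[-1] in self.FIELDS and self.records:
            self.records[-1][self._stack[-1]] = data.strip()


def harvest(html_text):
    parser = TaxonHarvester()
    parser.feed(html_text)
    parser.close()
    return parser.records


if __name__ == "__main__":
    print(harvest(PAGE))
```

The reader of the page sees only an italicised name followed by author and year, while a harvester can pull out the same items as labelled fields. What the sketch cannot supply is the hard part discussed above: an agreed vocabulary for the labels.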
Please note that I'm not talking about species occurrence data. These can always be simply represented and manipulated in tables. The structure of taxonomic/nomenclatural/bibliographic data OTOH is 'un-table-able'.
Are we any closer to structuring relationship data in ways that could be made explicit in webpage markup? Again, I don't mean reporting structured data *to* a webpage for display, from a behind-the-scenes database. I mean including in the webpage the structuring used in the database. I see that graph databases are becoming more amenable (import from CSV, for example). Anyone have experience with graph DBs and their outputs?
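To make the graph idea concrete, the following is a small sketch, not tied to any particular graph database product, of how nomenclatural relationships might be written as a CSV edge list (the kind of file a graph DB could import) and then traversed. The relation names (placed_in, synonym_of, published_in) and the placeholder names and references are assumptions chosen purely for illustration.

```python
import csv
import io

# Hypothetical edge list: each row is one explicit relationship.
EDGES_CSV = """source,relation,target
Aus bus,placed_in,Aus
Aus cus,synonym_of,Aus bus
Xus cus,synonym_of,Aus cus
Aus bus,published_in,Reference 1
Aus cus,published_in,Reference 2
"""


def load_edges(text):
    """Build an adjacency map: source -> list of (relation, target)."""
    graph = {}
    for row in csv.DictReader(io.StringIO(text)):
        graph.setdefault(row["source"], []).append(
            (row["relation"], row["target"])
        )
    return graph


def resolve(graph, name):
    """Follow synonym_of edges until a name with no such edge is reached."""
    seen = set()
    while name not in seen:
        seen.add(name)
        targets = [t for rel, t in graph.get(name, []) if rel == "synonym_of"]
        if not targets:
            return name
        name = targets[0]
    raise ValueError("cycle in synonym_of edges")


if __name__ == "__main__":
    g = load_edges(EDGES_CSV)
    print(resolve(g, "Xus cus"))
```

Each row states one relationship explicitly, so a chain of synonymy of any length fits the same three columns, which is awkward to express in a fixed-width table of names.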
Dr Robert Mesibov
Honorary Research Associate
Queen Victoria Museum and Art Gallery, and
School of Agricultural Science, University of Tasmania
PO Box 101, Penguin, Tasmania, Australia 7316
(03) 64371195; 61 3 64371195
Taxacom Mailing List
Taxacom at mailman.nhm.ku.edu
The Taxacom Archive back to 1992 may be searched with either of these methods:
(1) by visiting http://taxacom.markmail.org
(2) a Google search specified as: site:mailman.nhm.ku.edu/pipermail/taxacom your search terms here
Celebrating 26 years of Taxacom in 2013.