[Taxacom] Data query

Tony.Rees at csiro.au
Mon Jun 24 23:50:59 CDT 2013


Bob - I take your point(s), especially with regard to the minimal latency with which both you and Stephen are able to update web pages I have called "static" (meaning directly coded, as opposed to dynamically created from an underlying data store). Of course latency, whether minimal or large, is not an inherent feature of either system but a by-product of the workflows that have been created around it. I can also edit my database in real time, resulting in minimal latency for the derived web pages - in fact I do this all the time (probably 500+ edits in the past week). Or I could refresh some portions of the content at a much longer interval (for example, I have not re-crawled the Catalogue of Life for updates since 2006 - my choice, but to be rectified at some point), which means that, yes, that portion of my system has gone without updates for a possibly over-long time.

Paul Kirk, Chris Thompson and others who maintain database-driven web information systems will all concur that it is not the system, but the editor (and the processes s/he has set in place), that controls its latency... but of course I agree, anything which serves to reduce such latency - including, for example, permitting multi-user entry, as with Wikispecies (static pages in my previous sense) and WoRMS (dynamic ones) - is good.

You also said:

> I started this thread by saying that clean-bucket resource builders
> could use a webpage markup system that allowed people like Tony to
> harvest webpage contents more easily, for uses other than those for
> which the clean buckets were designed.

My comment on this would be that clean data is usable data; why differentiate between who might use it in the future and for what purpose?

Perhaps it is time to move the discussion on to related matters, such as how these systems might be chained together in the future so that updates can flow more readily from the point of entry to multiple points of re-use...
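To make that concrete, here is a minimal, purely illustrative sketch of one link in such a chain: a downstream re-user diffing two dated CSV snapshots (in the spirit of the dated MoA downloads Bob describes) to pick up only the records added since its last harvest. The column name `catalogue_number` and the sample rows are assumptions for illustration, not taken from any real dataset.

```python
import csv
import io

def new_records(old_csv, new_csv, key="catalogue_number"):
    """Return rows present in new_csv but absent from old_csv,
    matched on an assumed stable identifier column."""
    old_keys = {row[key] for row in csv.DictReader(io.StringIO(old_csv))}
    return [row for row in csv.DictReader(io.StringIO(new_csv))
            if row[key] not in old_keys]

# Two hypothetical dated snapshots of an occurrence-records download.
snapshot_2013_06_01 = """catalogue_number,species
QVM-1,Atrophotergum bonhami
"""
snapshot_2013_06_25 = """catalogue_number,species
QVM-1,Atrophotergum bonhami
QVM-2,Atrophotergum bonhami
"""

added = new_records(snapshot_2013_06_01, snapshot_2013_06_25)
print([row["catalogue_number"] for row in added])  # prints ['QVM-2']
```

The point of the sketch is the design choice: keying on a stable record identifier lets updates propagate incrementally from the point of entry, instead of each re-user re-harvesting (and re-filtering) the whole store on every refresh.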

Regards - Tony


> -----Original Message-----
> From: Bob Mesibov [mailto:mesibov at southcom.com.au]
> Sent: Tuesday, 25 June 2013 2:38 PM
> To: Rees, Tony (CMAR, Hobart)
> Cc: stephen_thorpe at yahoo.co.nz; taxacom at mailman.nhm.ku.edu;
> deepreef at bishopmuseum.org
> Subject: Re: [Taxacom] Data query
> 
> Tony Rees wrote:
> 
> "Basically we are talking about the generic area of computerized
> compilations of taxonomic data and associated attributes, and the
> benefits of storing these in static web pages as opposed to generating
> dynamic ones on demand from a data store of some sort."
> 
> Well, I'd restate that:
> 
> '...and the benefits of storing these in frequently edited webpages
> (confusingly called 'static') tended by interested people, as opposed
> to generating rarely edited webpages (confusingly called 'dynamic') on
> request from data stores whose page-related content hasn't changed
> since the last machine harvest of some other data store, and for whose
> content there are limited and difficult editing opportunities.'
> 
> Like Stephen, I could give you a 'f'rinstance' or two. When I find a
> new publication mentioning a named Australian millipede, the 'static'
> pages on the Millipedes of Australia (MoA) website get correctly
> updated with synonymy and bibliographic info within a few hours. I
> don't know any large data stores that do synonymies correctly; I do
> know that MoA is always ahead with regard to new species.
> 
> Similarly, on Sunday I ID'ed new specimens of the rare species
> Atrophotergum bonhami Mesibov, 2004. It took me less than a minute to
> add the collecting event and the ID and other info to my data tables.
> It took a few seconds each for my shell scripts to generate updated
> Atrophotergum CSV records files and KML files, and another minute to
> FTP those updates to the MoA server. (The pre-registered specimens
> themselves will be on a museum shelf in a few weeks; if I worked at a
> museum the delay would be minutes.) Those 'static' CSV and KML
> downloads, each dated in the file name, have records that won't appear
> for years in some online records databases, and when they do the
> occurrence information will be truncated and may be plain wrong (see my
> audit paper in ZooKeys), having passed through formatting filters
> insisted upon by the data storers.
> 
> I started this thread by saying that clean-bucket resource builders
> could use a webpage markup system that allowed people like Tony to
> harvest webpage contents more easily, for uses other than those for
> which the clean buckets were designed. I'm sure Tony would be happy to
> see that happen. An alternative is for clean-bucketers to add all their
> edits and novelties to the dirty buckets, where they'll get lost among
> the rubbish. Not me, thanks, but please remember that Wikispecies and
> MoA are open-access and CC. If you really want the data, you can get
> it.
> --
> Dr Robert Mesibov
> Honorary Research Associate
> Queen Victoria Museum and Art Gallery, and
> School of Agricultural Science, University of Tasmania
> Home contact:
> PO Box 101, Penguin, Tasmania, Australia 7316
> (03) 64371195; 61 3 64371195
