[Taxacom] Data query

Stephen Thorpe stephen_thorpe at yahoo.co.nz
Mon Jun 24 23:17:44 CDT 2013


Tony, part of what you are saying is rather misleading: you make it sound as if there are no data structures on Wikispecies! This is wrong! For example, I now routinely create and use reference templates. This means that when you are on the template page, you can click on 'what pages link here' and get a list of all the pages on which that template is used (i.e. all pages/articles which cite that reference). It is the same with authors of taxa, etc.
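
As a rough illustration, the same list can in principle be requested by machine rather than by clicking. The sketch below only builds the request URL for the MediaWiki API's `embeddedin` list, which returns pages that transclude a given title. The endpoint shown is assumed to be the standard Wikispecies one, and the template name is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: the standard MediaWiki API endpoint for Wikispecies.
API_URL = "https://species.wikimedia.org/w/api.php"


def embeddedin_url(template_title, limit=50):
    """Build a query URL listing pages that transclude the given template."""
    params = {
        "action": "query",
        "list": "embeddedin",    # pages that embed (transclude) the title
        "eititle": template_title,
        "eilimit": limit,
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


# Hypothetical template name, for illustration only.
url = embeddedin_url("Template:Example reference")
print(url)
```

Note that `embeddedin` covers transclusions only, so it matches template use but not ordinary links to the template page.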
 
Stephen


________________________________
From: "Tony.Rees at csiro.au" <Tony.Rees at csiro.au>
To: mesibov at southcom.com.au 
Cc: stephen_thorpe at yahoo.co.nz; r.page at bio.gla.ac.uk; taxacom at mailman.nhm.ku.edu; deepreef at bishopmuseum.org 
Sent: Tuesday, 25 June 2013 4:08 PM
Subject: RE: [Taxacom] Data query


Hi Bob...

You wrote:
> If you don't think Wikispecies is valuable, then please say so without
> hanging that opinion on the difficulty of machine harvesting its
> content, which is a red herring.

Nowhere did I say that Wikispecies was not valuable; however, its value would be increased if its content were more easily machine harvestable, and its internal rigour (and quite possibly, ease of data entry) would be improved by basing it on an "enter once, use many times" model as per a relational database. In addition, it would be possible to vastly extend the present content of Wikispecies with relatively little effort if this system had an easy way to import already structured, trusted data from other sources, as opposed to re-entering everything by hand as is required at present.
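
A minimal sketch of what "enter once, use many times" could look like in a relational store, using Python's built-in `sqlite3`. The table names, the citation text and the placeholder taxon names ("Aus bus", "Aus cus") are all invented for illustration and do not describe any existing database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reference (id INTEGER PRIMARY KEY, citation TEXT NOT NULL);
CREATE TABLE taxon (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE taxon_reference (
    taxon_id INTEGER REFERENCES taxon(id),
    reference_id INTEGER REFERENCES reference(id)
);
""")

# The citation is entered once; both taxa point to it by id.
conn.execute(
    "INSERT INTO reference VALUES (1, 'Author, A. 2000. Placeholder title.')"
)
conn.executemany("INSERT INTO taxon VALUES (?, ?)", [(1, "Aus bus"), (2, "Aus cus")])
conn.executemany("INSERT INTO taxon_reference VALUES (?, ?)", [(1, 1), (2, 1)])

# Correct the citation in one place; every taxon citing it sees the change.
conn.execute(
    "UPDATE reference SET citation = 'Author, A. 2001. Placeholder title.' WHERE id = 1"
)

rows = conn.execute("""
SELECT t.name, r.citation
FROM taxon t
JOIN taxon_reference tr ON tr.taxon_id = t.id
JOIN reference r ON r.id = tr.reference_id
ORDER BY t.name
""").fetchall()
print(rows)
```

The point of the design is that a correction to the reference is made in a single row rather than on every page that cites it.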

Rather than "want[ing] Wikispecies contributors to instead (or also) contribute to each of half a dozen different taxonomic databasing projects" it would be good if someone at Wikispecies could also figure out how to export the data in bulk in a machine-readable form suitable for ingestion by other systems to maximise the value of the work already existing in the Wikispecies entries (my comment that this was not supported now does not mean it will never be available). Actually someone has already done this via a process of their own construction (see http://people.umass.edu/nconstan/LifeTree/) though they encountered problems to do with the fact that the data structure has internal inconsistencies (see discussion on that page). Nevertheless it shows what may be possible, either now or in the future, to leverage present and future Wikispecies data compilation activities for other uses, not all necessarily specified or even envisaged at this time. At that point it might then be possible to see a role for Wikispecies e.g. as a feeder point for other, larger compilations such as ZooBank (to name but one) thereby reinforcing the value of the "enter once, use many times" maxim.

So I am by no means bagging Wikispecies, even though as yet I have not attempted to use it as a data source, and may do so at some point (especially if it is better set up to provide data exports, preferably in a standard format e.g. one of the TDWG-supported ones rather than one of its own devising). However, in the debate about how best to store taxonomic information, it represents one end of a spectrum of static web pages vs. atomized data stores, so it is on that level that I feel it is legitimate to point out certain weaknesses (we are also aware of its strengths via Stephen's posts).

Regards - Tony


> -----Original Message-----
> From: Bob Mesibov [mailto:mesibov at southcom.com.au]
> Sent: Tuesday, 25 June 2013 1:32 PM
> To: Rees, Tony (CMAR, Hobart)
> Cc: stephen_thorpe at yahoo.co.nz; r.page at bio.gla.ac.uk;
> taxacom at mailman.nhm.ku.edu; deepreef at bishopmuseum.org
> Subject: Re: [Taxacom] Data query
> 
> Hi, Tony.
> 
> You made it very clear in your earlier post what you wanted the data
> for, and I'm not by any means saying those uses aren't important. But,
> again, Wikispecies wasn't designed to (easily) supply you with machine-
> harvested information. It's a human-readable Web resource, full stop.
> 
> So - do you want Wikispecies contributors to instead (or also)
> contribute to each of half a dozen different taxonomic databasing
> projects, most of which don't have 'sandboxes' for doing this? Do you
> want the Wikispecies page designers to make it much easier for machine
> harvesting of information that can then be machine-checked against your
> favourite database? (Not that this would improve everyone's data - look
> at the differences between existing databases and the difficulties in
> synchronising them.)
> 
> If you think what Wikispecies contributors are doing is valuable and
> worth encouraging, then one of those solutions (or another) is worth
> pursuing. At the moment, Wikispecies is a valuable alternative to other
> taxonomic 'endpoint' Web resources. It's often more up to date, and
> although it's not as pretty or as subdivided as the EoL interface (for
> example), it has the huge advantage that all of its page content is put
> together by humans and publicly discussed by humans.
> 
> If you don't think Wikispecies is valuable, then please say so without
> hanging that opinion on the difficulty of machine harvesting its
> content, which is a red herring.
> --
> Dr Robert Mesibov
> Honorary Research Associate
> Queen Victoria Museum and Art Gallery, and
> School of Agricultural Science, University of Tasmania
> Home contact:
> PO Box 101, Penguin, Tasmania, Australia 7316
> (03) 64371195; 61 3 64371195

