[Taxacom] WTaxa and data harvesting via CoL, EoL, etc.

Chris Lyal C.lyal at nhm.ac.uk
Fri Jul 29 08:04:34 CDT 2011

Apologies for the issues with names in WTaxa.  We are still in the
process of completing the database, so many of the names are not those of
valid species.  The first pass in the project was to enter as many names
as possible, from the secondary literature; we received funds from GBIF
to help us with that.  The second pass is to check the original papers
and correct entries, working from oldest to newest, checking
availability and validity as we go, and this is underway.  We have also
had a problem with the database in displaying links between original
and subsequent combinations, which is the issue that Stephen highlights,
and which is fixed in WTaxa but will not transmit through to CoL until
the next data upload.  We are lucky that we have been able to obtain
some funds through partnership with Species 2000 in an EU project, and
later this year we will be able to use some of those funds to improve
the harvesting from WTaxa to Species 2000-CoL.  The fundamental problem
still pertains: a small number of taxonomists are working to
complete a large task with insufficient time and resources.  However,
without the 'acronyms' we would not have been able to achieve anything
at all.  

Aside from natural disappointment that despite the rather intensive
efforts of a number of people to capture data and disseminate them the
data are not yet perfect, we might consider several serious questions.

First, how do we develop opportunities for funding data population on a
large scale?  Given the amount of data currently available on the web
and the relatively low investment there has been in data population
(leading many of us to work in 'spare' time on this activity), how do we
press the arguments to finish the job?  There are global-level policy
agreements through the CBD that this work should be done, so what are
people's experiences of successful arguments for funding?

Secondly, should we (as taxonomists) expose incomplete information?  (It
was a condition of the first grant that we received for WTaxa that we do
so.)  I have been in meetings where users were appalled
that nomenclators were freely available, since they were using them as
if they listed only valid names (actually a similar situation to WTaxa
as it currently is), but I guess we would generally agree that
nomenclators are a useful tool. 

Finally, a related point: should we develop a standard means in metadata
of indicating fitness for use of any record or item of data?  Perhaps
TDWG might consider this.

This is not an invitation to debate (again) the relative merits of
different means of putting information on the web - we've really done
that to death.  Suffice to say that I know very few idle taxonomists (or
people in CoL, GBIF etc., come to that) - we are all trying to populate
systems with data in the ways we see fit.  Nor is it an invitation to
argue (again) that money obtained by initiatives exploring and
catalysing dissemination techniques should have been spent in a
different way - it wouldn't have been, and our project for one has
benefitted - and we're not alone.   

