[Taxacom] The economics of biodiversity database initiatives

Stephen Thorpe stephen_thorpe at yahoo.co.nz
Mon Oct 28 17:29:10 CDT 2013

Hi Alastair,
But the problem with CoL is precisely that it is NOT consistent! The same taxa (e.g. species) reappear under different guises (e.g. in different genera), and there is no way of knowing how complete CoL is for any group. It is NOT "the best available single resource ..."

From: Alastair Culham <a.culham at reading.ac.uk>
To: Stephen Thorpe <stephen_thorpe at yahoo.co.nz>; Taxacom <taxacom at mailman.nhm.ku.edu> 
Sent: Tuesday, 29 October 2013 11:26 AM
Subject: RE: [Taxacom] The economics of biodiversity database initiatives

Hi Stephen,

I can see that we differ on this. To me the value of CoL is not whether or not I agree with the taxonomy used for any individual group but that it is a taxonomic system now used to interrelate data from a range of different biodiversity portals in a consistent way and so make data more discoverable. It makes no claim to deliver the one correct taxonomic system (should such a thing exist) but it does endeavour to offer a consistent system that others can reference (e.g. annual checklists). The cross-mapping tool developed helps to reference one taxonomic system against another and this helps non-expert (and even some expert) users to cross-relate data more efficiently. Its coverage of life is not complete, and is unlikely ever to be, because new species are being named every day; however, it is close to 1.5 million accepted named species of about 2 million known and accepted, and covers enough of the world's biota to allow useful large-scale research projects.
I, as an individual taxonomist, would not claim expert knowledge on 1.5 million species, or even the ca. 300,000 flowering plant species, but I can use the CoL to help me gather data and conduct analyses on groups outside my specialist area because I have sufficient trust in the rating system for data used, and in the panel of experts that deliver their taxonomic opinion largely free of charge for free use by others. There is an explanation of the rating system at http://www.catalogueoflife.org/col/info/databases. I'm aware of no other initiative that has encompassed such a large body of taxonomic opinion and that is at least partially updated so often.
Both the 4D4Life and i4Life grants included some money to spend on actually doing some of the underlying alpha taxonomic research, so I know exactly how difficult it is to get such funding. I'd argue that all of the money was spent on taxonomy, whether delivering the infrastructure or revising species groups. However, for much of the development of Catalogue of Life the underlying funding has been like that for many taxonomists: it relied on the goodwill of its host institution to support it between grants.
So CoL is not perfect, makes no claim to be, and is not complete. However, it remains the best available single resource to find taxonomic opinion on a large variety of living species. Those records link back to source databases that are much richer in detail than CoL and all entries have a source database for attribution, so you can always check this. It would be good to make the infrastructure better and it would be good to fund more of the underlying taxonomy. If there is money to work on the infrastructure I will not refuse that because I don't have money to work on the taxonomy, nor, if there were money for taxonomic revision, would I refuse that because I did not have current money for infrastructure. My preference is to develop the two together but money follows trends and the present trend is to mobilize data that are already available.
CoL is there for those that choose to use it. I, for one, would not want to rebuild that from scratch and doubt I could build a system I had more trust in over any reasonable amount of time. It's a shame you find none of it useful, as that suggests you have no trust in the several hundred taxonomists who contribute data. I would hope that each of them believes in their own work, and most have their expertise acknowledged through publication of peer-reviewed papers in their specialist groups. That's the difference between an edited database and a digital aggregator. However, we each make our choices on what to trust, and questioning of the work of others is what drives science forward, so doubt is good.
It would be great to have the resource to mark up all the data to be discoverable and automatically cross-referenced (species to DNA sequence to distribution to conservation status to morphometrics to cytology etc.) and make it all open to verification in an automated way by others. But this might look like money for more infrastructure...


Dr Alastair Culham
Centre for Plant Diversity and Systematics
Harborne Building, School of Biological Sciences
University of Reading, Whiteknights, Reading, RG6 6AS
Associate Professor of Botany

Curator, Reading University Herbarium (RNG)
Associate Editor, Botanical Journal of the Linnean Society
Programme Director, MSc Plant Diversity
i4Life Coordinator

