[Taxacom] a looming data conflict crisis in bioinformatics?

Stephen Thorpe stephen_thorpe at yahoo.co.nz
Fri Nov 19 20:39:10 CST 2010

the more I do of this stuff, the more I think that a data conflict crisis is 
looming in bioinformatics. I know that being concerned about data quality 
threatens to make me rather "unpopular", but what the heck ...

one contributing factor to the potential crisis is, as I have recently pointed 
out, the lack of verifiability associated with many checklists, catalogues, and 
specialist databases ...

in the absence of verifiability against the published literature, only 
taxonomists can sort out data conflicts, but since they are busy doing primary 
taxonomy, I doubt that they will be able to keep up with the growing cascade of 
data conflicts generated by secondary sources such as biodiversity databases.

some of the data conflicts arise in the following way:

there is no rule saying that the most recently published opinion must be 
followed, and indeed such a rule could lead to great instability, given the lack 
of a well-defined distinction between taxonomic and "grey" literature. It would 
be absurd to change an almost universally accepted classification just because 
some silly sod misinterprets some primary taxonomic literature and publishes 
that misinterpretation in his "checklist of XXXs from YYY". The problem is that 
bioinformatics seems to have difficulty distinguishing the silly sod from a bona 
fide taxonomist voicing actual taxonomic opinions. Occasionally, a really good 
taxonomist can at the same time be a silly sod when it comes to checklists and 
bioinformatics, which complicates matters further.

I am increasingly worried about the amount of data conflict that I see between 
the growing number of secondary sources of biodiversity information ...

One thing seems clear: the only hope of solving the problem is to allow everyone 
to have a say on each specific issue, rather than resorting to factionalization 
and exclusion. This is one advantage of the wiki system ...



