[Taxacom] GenBank (was The economics of biodiversity database initiatives)

Adam Cotton adamcot at cscoms.com
Mon Oct 28 06:56:52 CDT 2013

----- Original Message ----- 
From: "Rafaël Govaerts" <R.Govaerts at kew.org>
To: "'Roderic Page'" <r.page at bio.gla.ac.uk>; "Taxacom" 
<taxacom at mailman.nhm.ku.edu>
Sent: Monday, October 28, 2013 6:08 PM
Subject: Re: [Taxacom] The economics of biodiversity database initiatives

Dear Rod,
The difference is that GenBank provides real objects, namely the sequence, 
while GBIF only provides an interpretation of an underlying real object (the 
specimen or observed organism). If images of the specimen or observed 
organism were attached to each record the data would become of real use.
With types there has been a lot of effort towards that, as everyone realised 
that knowing the type is Smith 3672 is of little use; you need an image, as 
now provided by JSTOR for many taxa.

My major gripe with GenBank is the unverifiability of the taxa that have 
sequences listed on the website.

GenBank provides neither accurate locality data (only country, which is a 
political, not a natural, entity) nor determination beyond species, nor even 
photographs which could be used to verify the origin of the sequence.

Here in Thailand (and many other countries) we often find different species 
and subspecies in the various parts of the country which correspond to 
different zoogeographical regions. If I access a sequence in GenBank from 
'Thailand' it is impossible to verify which part of the country the specimen 
came from, and thus which actual taxon has been analysed. This is important, 
since taxa are regularly split or lumped as taxonomic research progresses, 
and the sequence in GenBank will thus become redundant if the original taxon 
is not verifiable.

There is absolutely no way to verify that a sequence belongs to taxon A 
rather than taxon B, other than the say-so of the researcher submitting the 
sequence to GenBank. I know of at least one DNA analysis paper in my own 
field where some taxa were misidentified at species level, and these are 
relatively easy to identify from photographs compared with many organisms. I 
spotted one error just by looking at a black and white photocopy of a plate 
illustrating some of the specimens analysed in that paper, and most papers 
do not actually illustrate every specimen analysed.

If GenBank included locality data and photographic evidence of the taxon, it 
would be much more useful in the long term. Not so many years from now it 
will no longer be possible to verify the identity of sequences by e-mailing 
the author to ask them!

