"Can you bank on GenBank?"

B.J.Tindall bti at DSMZ.DE
Fri Sep 3 08:03:29 CDT 2004

Nico addresses two problems:
a) the problem of the reliability of sequences is something which also
affects bacteriology. There are a number of publications highlighting the
problem. There are two issues here. The first is how on Earth GenBank,
EMBL/EBI etc. can check the correctness of sequences (the accuracy of the
sequencing reaction and the identity of the organism used) unless they
repeat exactly the same work. The accuracy of such information is the
responsibility of the scientists depositing the data. The second issue is
how to handle data deposited in these databases which are known to be
faulty. In my exchanges with GenBank I was told that the deposited
sequences are the "property" of the depositor and changes can only be made
by the depositor. However, one could also argue that GenBank should respect
the original sequence data, but for curatorial purposes I see no reason why
notes cannot be added indicating errors or other changes. It is a question
of what policy to adopt.

b) the NCBI taxonomy - well I suppose the question is where are the NCBI
going to get the authoritative list of names (and synonyms etc.) together
with taxonomic interpretation? Earlier posts on "registration" and
nomenclatural lists being turned into taxonomic lists illustrate some of
the problems. My viewpoint on this topic is that systematics and
nomenclature have been pushed into a corner over the last decades and
science has been caught with its trousers down because databases such as
GenBank are now fully dependent on indexing their data via nomenclature and
the resulting taxonomy. Strictly speaking I would advocate contacting the
appropriate "authorities" such as those responsible for the Codes of
Nomenclature to seek expert advice at the nomenclatural level and then to
try to work out the resulting taxonomy.

I appreciate your intent, since Bergey's Manual Trust has already
undertaken such a task in bacteriology. I can pass on the e-mail address of
the chief editor of the Manual (off list) if you wish. I could
also ask why officers of the International Committee on the Systematics of
Prokaryotes are not directly involved with TDWG, since bacteriology has
already trodden this rocky road... sorry, just couldn't resist ;-)

At 19:38 2.9.2004 -0400, Nico Mario Franz wrote:
>That's the title of an article by D.J. Harris (TREE 18: 317-319, 2003) in
>which he reviews the extent of "erroneous submissions" of sequences to
>GenBank. I'm interested in finding out what Taxacomers' experiences are
>concerning the *taxonomy* adopted by GenBank in order to structure the
>GenBank offers some insights here:
>but otherwise often states: "The NCBI taxonomy database is not an
>authoritative source for nomenclature or classification - please consult
>the relevant scientific literature for the most reliable information."
>Is the somewhat rudimentary taxonomy of GenBank starting to become a
>practical problem? What about outdated names, or relationships we now
>"know" to be different? Is there a GenBank taxonomic advisory committee
>trying to address the issues? Are there any documents on this subject?
>Anyone I could contact for an inside scoop?
>Anything from personal rants or laudatios to guides to other sources would
>be highly appreciated. FYI this ties into developments of unique digital
>identifiers for so-called "taxonomic concepts" (TDWG, etc.).
>Nico Franz
