ITIS (an explanation of GBIF's data integration activities)

Roderic D. M. Page at BIO.GLA.AC.UK
Fri Jun 25 00:40:33 CDT 2004

>There is one thing wrong with your illustration using an albatross and an
>ITIS number.  If the listing under that ITIS number gives only NODC as a
>source you have a dead end.  There is no published work to which one can
>I think that using an albatross as an example was very appropriate.

I guess from my point of view, this isn't such a problem. If I use
ITIS as a source to check a name, then what matters is whether the
name exists in ITIS. Where ITIS got it from is less important -- I
have to trust somebody. I have no desire to track every name back to
its original publication (I have some 37,000 names to deal with).

Now of course I can be reasonably careful. ITIS assigns a "Record
Credibility Rating" to each name. I could choose to accept only names
which are verified (Diomedea Linnaeus, 1758 is verified, and there
are publications listed that I could go and read). I could also look
up the same name in other databases and see if the details agree.
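
A minimal sketch of that kind of check, in Python. The record layout,
the rating strings "verified"/"unverified", and the second database
"OtherDB" are all assumptions made for illustration; none of this is
ITIS's actual schema or API. A name is accepted only if its rating is
"verified" and no other database gives a different authorship.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NameRecord:
    name: str      # scientific name, e.g. "Diomedea"
    author: str    # authorship string, e.g. "Linnaeus, 1758"
    source: str    # database the record came from
    rating: str    # hypothetical credibility label


def accept(record, others):
    """Accept a record only if it is verified and no record for the
    same name in another database disagrees on authorship."""
    if record.rating != "verified":
        return False
    return all(o.author == record.author
               for o in others if o.name == record.name)


itis = NameRecord("Diomedea", "Linnaeus, 1758", "ITIS", "verified")
# "OtherDB" is a made-up second source used only for the cross-check.
other = NameRecord("Diomedea", "Linnaeus, 1758", "OtherDB", "unverified")

print(accept(itis, [other]))
```

How strict to be (authorship only, or rank and parent taxon as well)
is a choice each user would have to make for their own data.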

In an ideal world, a name database would have links to publications
relevant to a name (e.g., its original description), and these could
be readily accessed if a user wants to check the sources. Some
databases make an effort to do this (e.g., the EMBL reptile
database). But in the real world, this isn't going to happen anytime
soon. There are choices to be made.

Much of my motivation comes from trying to deal with the taxonomic
mess that is TreeBASE (an otherwise wonderful database of
phylogenies). When the designers of TreeBASE started out they put
taxonomy into the "too hard" box (and in the mid 1990s there were
precious few databases around, so this was a reasonable thing to do).
As a result, the names in TreeBASE are a hideous mix of scientific
names, informal names, erroneous names, synonyms, names with GenBank
accession numbers tacked on, etc. This means TreeBASE is crippled --
even trivial questions such as "how many bird phylogenies are in
TreeBASE" can't be answered because TreeBASE has no concept of what a
bird is. Compare this to ITIS, which knows exactly what taxa are

Now, I want a phylogenetic database that incorporates taxonomy, and
the longer I wait the worse TreeBASE is going to get. So, I want to
map as many names in TreeBASE as possible onto names in taxonomic
databases (the more databases the better) in an effort to figure out
what taxa are actually in the database. Hence, for my purposes I need
to be able to map lots of
names, and as a starting point all I need is a name, a source
database, an id from the database, and any information on synonyms. I
find it almost unbelievable that in this day and age this should be a
hard thing to do.
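
To make the requirement concrete, here is a rough sketch in Python of
the mapping record described above (name, source database, id in that
database, synonym information) and a lookup that first strips a
tacked-on accession number. Everything in it is hypothetical: the
databases "DatabaseA" and "DatabaseB", the ids, the placeholder names
"Aus bus" and "Aus cus", and the accession pattern (one or two capital
letters followed by five or six digits), which is only a guess at what
the tacked-on strings look like.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mapping:
    name: str                          # name as held by the source database
    source: str                        # source database
    source_id: str                     # identifier within that database
    synonym_of: Optional[str] = None   # accepted name, if this is a synonym


def normalize(raw):
    """Turn underscores into spaces, drop a trailing accession-like
    token, and collapse runs of whitespace."""
    cleaned = raw.replace("_", " ").strip()
    cleaned = re.sub(r"\s+[A-Z]{1,2}\d{5,6}$", "", cleaned)
    return " ".join(cleaned.split())


def build_index(mappings):
    """Index mappings by lower-cased name."""
    index = {}
    for m in mappings:
        index.setdefault(m.name.lower(), []).append(m)
    return index


def map_name(raw, index):
    """Return every mapping whose name matches the cleaned-up input."""
    return index.get(normalize(raw).lower(), [])


# Placeholder data only; these are not real database entries.
index = build_index([
    Mapping("Aus bus", "DatabaseA", "A-0001"),
    Mapping("Aus bus", "DatabaseB", "B-0042"),
    Mapping("Aus cus", "DatabaseA", "A-0002", synonym_of="Aus bus"),
])

for hit in map_name("Aus_bus U12345", index):
    print(hit.source, hit.source_id, hit.synonym_of)
```

Exact matching after clean-up will miss misspellings and informal
names, so this is only the starting point; names that come back with
no hits are the ones that need a closer look.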



Professor Roderic D. M. Page
Editor Elect, Systematic Biology
Graham Kerr Building
University of Glasgow
Glasgow G12 8QP
United Kingdom

Phone:    +44 141 330 4778
Fax:      +44 141 330 2792

Subscribe to Systematic Biology through the Society of Systematic
Biologists Website:
