[Taxacom] Fwd: complete list of all species

Chris Frazier cfrazie at unm.edu
Thu Feb 14 12:37:28 CST 2008

Let's take a stab at merging the threads on checklists and on a complete list of all species.

First off, Roger was right to bring up the notion of the "intention" of a checklist, although I disagree with his conclusion that it cannot be defined.  According to the OED, a checklist is "a (complete) list for reference and verification."  I would argue that a checklist is inherently derivative; it is based on a catalog, an uncataloged body of work, or derived from some other source(s) tied to more information about the objects on the list.  A checklist is designed to further investigation, observation, or exploration of the objects on the list.  BTW, a catalog, also according to the OED, is a "list, register or complete enumeration; now specifically one systematically or methodologically arranged, often with brief particulars or descriptions aiding identification, etc."  The two are similar, but the purpose of a catalog is complete enumeration, with or without details about the objects enumerated, while the checklist is not necessarily complete; it is a hypothesis put out there for verification, typically by observation.

When we are talking about species lists, there are three other similar concepts out there that are often confused.  A database is simply a collection of information.  You can database a catalog, you can even database a checklist, but the purpose of the database is to store and provide access to the information.  A database differs from a catalog in that it gives access to more information than a bare enumeration does.  A database of species accounts is more than a catalog of species.  An index is an extraction that points back to more information about the terms that are indexed.  The indices of nomenclators typically point to the literature source for that name and/or to its placement in a taxonomic hierarchy.  A taxonomic hierarchy is a classification, really a hypothesis, that can be databased, that can be used as the source for a checklist, or that can be extracted for one or more indices.  A taxonomic hierarchy can also be thought of as a structured enumeration, i.e. a catalog, of life itself.

Back to Remsen's original "checklist" post and some of the discussion found in the complete list of all species thread.  The central issue, I believe, is not what constitutes a checklist, but rather what a name on such a list represents and how to appropriately combine these lists.  The true nomenclator indices are lists of scientific names applied to organisms – "without any assertion of systematics or inclusion of auxiliary biological information" (Report on GBIF Nomenclator Workshop, http://www.gbif.org/prog/ecat/docu/reports/report_nomenclators.htm).  Relatively speaking, the easiest task in collecting "a complete list of names" is to literally collect the names themselves.

When the "index" ties names to a taxonomic hierarchy, the "name" is now both a label and a placeholder for a hypothesis concerning taxonomic position.  Such a "catalog" of life cannot be complete without a completed hierarchy.  Good luck with that.  It gets a lot worse when you want a list of names including synonyms.  The idea of the "right" name means that the name is tied to an assertion of what the real biological entity is or should be, as well as to the rules of nomenclature.  We all know that there's a huge mess of work involved here.  We are making inroads toward such a catalog; it's just that we in the business know how imperfect the catalog is going to be for a long while.

The standard geographic "checklist" gives a list of taxa that exist in a region.  Now the names contain an implicit reference to a taxonomic concept as above and an assertion that a population of that entity exists within the region (or at least one individual now, previously, or potentially).  Putting all such checklists together gives us a catalog of life and where all those critters and such are (or could be).  To put this type of catalog together, each name has to be considered with respect to three different aspects: the name itself (think of this as how a name on a given regional list matches up to the ultimate index of taxon names), a taxonomic concept (how the purported entry on a list relates to entries on other lists and from various different sources with the same name or potentially synonymized names), and a label for the population in that region (is it really there, for instance, is it common or rare, and other attributes ascribed to the named population).

A fourth area of interest is how the linkage between that name and that population is documented (anecdote, documented observation, or vouchered).  Put all of that together and we get the ultimate ultimate catalog of life: a list of all taxa in the world, the relationships among them, other alternative or discarded taxonomic concepts and how they apply to the list, the distribution and status (i.e. rarity) of all taxa (and populations) in the world, and clear pointers to the evidence we have for their distributions and status.  Oh, and while we are at it, we really do have to throw in the temporal component: where they are now versus in the past (or where they were in the past for extinct species).  While I believe we have to consider this catalog as a goal, and one certainly worthy of producing, we should also be clear that you are not going to find this uber-catalog on Google anytime soon.
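
For anyone who thinks better in data structures, here is a rough Python sketch of what a single entry on a regional checklist would have to carry under the four aspects above (name, taxonomic concept, regional population, and documentation), plus one possible way of combining lists.  Everything in it is an assumption made for illustration: the class and field names, the `concept_id` identifier, the ranking of evidence types, and the rule of keeping the best-documented record per region are all hypothetical, and the temporal component is left out entirely.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Evidence(Enum):
    """How the name-to-population link is documented (ranking is an assumption)."""
    ANECDOTE = 1
    OBSERVATION = 2
    VOUCHER = 3


@dataclass(frozen=True)
class ChecklistEntry:
    name: str                 # the name string as it appears on the regional list
    concept_id: str           # hypothetical identifier for the taxonomic concept
    region: str               # region whose checklist asserts the population
    evidence: Evidence        # documentation behind that assertion
    status: str = "unknown"   # e.g. "common" or "rare"


def merge_checklists(entries):
    """Group entries by taxonomic concept, then by region.

    Where several entries cover the same concept in the same region
    (for instance under synonymous names), keep the best-documented one.
    """
    catalog = defaultdict(dict)
    for entry in entries:
        current = catalog[entry.concept_id].get(entry.region)
        if current is None or entry.evidence.value > current.evidence.value:
            catalog[entry.concept_id][entry.region] = entry
    return dict(catalog)


# Placeholder names and regions; "Aus bus" and "Aus cus" stand in for synonyms.
entries = [
    ChecklistEntry("Aus bus", "concept-1", "Region A", Evidence.ANECDOTE),
    ChecklistEntry("Aus cus", "concept-1", "Region A", Evidence.VOUCHER, "rare"),
    ChecklistEntry("Aus bus", "concept-1", "Region B", Evidence.OBSERVATION),
]
catalog = merge_checklists(entries)
for concept, regions in catalog.items():
    for region, entry in sorted(regions.items()):
        print(concept, region, entry.name, entry.evidence.name, entry.status)
```

Even this toy version assumes the hard part is already done: somebody has to decide that two names on two lists refer to the same concept before the merge means anything.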



