[Taxacom] the hurdle for all biodiv informatics initiatives
deepreef at bishopmuseum.org
Thu Feb 18 03:46:04 CST 2010
> Taxonomy, and taxon names are the SOLUTION to unlocking the
> world's biodiversity. Any worthwhile indexing effort will
> utilize this solution and build on it. Instead,
> 'bioinformatics' is regarding it as the PROBLEM
Ummmm...who, exactly, is saying that taxonomy is the Problem? I guess if
you mean "bioinformatics" sensu stricto (i.e., in the sense that the
molecular/DNA people have co-opted that term as their own), then yes -- some
of those people think that taxonomy can be replaced by genetic markers (like
DNA barcodes). And in that sense, I am fully in agreement with you. But my
understanding is that this conversation was about "bioinformatics" sensu
lato (i.e., what we now refer to as "biodiversity informatics"). This is
the space where we find the "Axis of Evil" (GBIF, ALA, EoL, CoL, etc.) --
and the things they are doing are highly supportive of traditional taxonomy,
and they all see taxon names as the SOLUTION to integrating the information.
> Actually, I do not see a real difference between indexing
> "text strings" and indexing the fonts used to print the text
> string (or for that matter, indexing the type of ink used to
> print the text string). There are endless possible "text strings"
> out there (an ever growing number) and these can be indexed
> till hell freezes over, without necessarily achieving anything.
...unless, of course, we can build an infrastructure that goes beyond the
text strings and cross-links data through GUIDs (which humans never see).
As I explained in our off-list exchange, the text-string indexers do not see
the text strings as the "ends"; they see them as the only thing we currently
have to build the connections. Once those connections (via GUID links) are
established, then the text strings become consistent, stop multiplying
needlessly, and (ultimately) the text-string indexers will no longer have a
service to perform.
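
To make the idea concrete, here is a minimal sketch of what "cross-linking through GUIDs" could look like. It does not reflect how GBIF, CoL, or any existing indexer is actually implemented; the NameIndex class, its methods, and the placeholder names "Aus bus" and "Smith, 1900" are all hypothetical, chosen only for illustration.

```python
import uuid


class NameIndex:
    """Hypothetical index linking variant name strings to opaque GUIDs."""

    def __init__(self):
        self._by_string = {}  # normalized text string -> set of GUIDs
        self._variants = {}   # GUID -> set of raw strings as they were found

    @staticmethod
    def _normalize(text):
        # Assumption: case and whitespace differences carry no meaning.
        return " ".join(text.lower().split())

    def register(self, text, guid=None):
        """Link a text string to a GUID, minting a new GUID if none is given."""
        guid = guid or str(uuid.uuid4())
        self._by_string.setdefault(self._normalize(text), set()).add(guid)
        self._variants.setdefault(guid, set()).add(text)
        return guid

    def resolve(self, text):
        """Return every GUID a string is linked to.

        A set is returned because one string may be linked to more than
        one entity; the index does not collapse them into one.
        """
        return set(self._by_string.get(self._normalize(text), set()))

    def variants(self, guid):
        """Return every raw text string recorded for a GUID."""
        return set(self._variants.get(guid, set()))


# Three spellings found "in the wild", all linked to one identifier.
index = NameIndex()
g = index.register("Aus bus")
index.register("Aus bus Smith, 1900", guid=g)
index.register("AUS  BUS", guid=g)
```

Once the links exist, downstream systems exchange only the GUID, and the variant strings are kept purely as a way back to the literature and data where they were found.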
> I agree with the reasoning: computer-readable output for computers.
> What does not make sense is to convert "identical text
> strings" that refer to different entities into identical
> computer output. This is just a way to multiply confusion.
> Text strings are a red herring. Why not index what actually matters?
Indeed -- that's *exactly* what we're trying to do (see my previous post).
Like I said, whether we like it or not, the myriad text strings already
exist, and are our only link in some cases to important information about
biodiversity. To link that information to the "clean" names, we need the
text-string indexers to help us get there.