[Taxacom] encyclopedia of life

Richard Pyle deepreef at bishopmuseum.org
Tue May 15 00:12:58 CDT 2007

I sent my previous post before reading Doug's: 

> Yes, I see that the EoL can deal with this, in part, by the 
> use of a "novice view" interface versus an "expert view" 
> interface, but I must stress that it is indeed only in part.

Why only in part?  And I don't see the solution as a simple approach with two
alternate views.  I'm imagining something much more sophisticated -- along
the lines of a clever Google-like "PageRank" sort of algorithm, where each
name has some sort of confidence rating emerging from a robust (and
open-source) algorithm based on usage patterns (in this case "usage" refers
to automated weighted assessments of treatments in peer-reviewed and
non-peer-reviewed publications, databases, and other sources). The
"consensus" view would be akin to Google's "I'm Feeling Lucky" feature, with
some sort of color-coded indicator of each name's stability, and little
"+" expanders for successively more information about alternative views.
And, of course, there would be the option to select a "meta-authority"
(e.g., ITIS, Species2000, FishBase, some recent monograph, whatever) as your
preferred view, and instantly switch between alternate views. Algorithms
could canvass these varied meta-authorities and produce classifications akin
to consensus trees in phylogenetic analyses; and could even "PageRank" the
meta-authorities for any given taxon according to the extent to which
different commentators (each with their own algorithmically derived
credibility rating) have agreed or disagreed.  Do such algorithms exist? I
doubt it.  Could they be developed? Almost certainly.
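
As a rough illustration of the simplest form such a confidence rating could
take, here is a minimal sketch in Python. It is a plain weighted vote, not
PageRank itself, and everything in it is an assumption made for illustration:
the source categories, the weight values, the function names, and the
placeholder concept and name labels are all hypothetical.

```python
from collections import defaultdict

# Hypothetical weights per source type; the numbers are illustrative
# assumptions, not values proposed anywhere in this thread.
SOURCE_WEIGHTS = {
    "peer_reviewed": 3.0,
    "database": 2.0,
    "non_peer_reviewed": 1.0,
}


def name_confidence(usages):
    """Return {concept: {name: confidence}} from usage records.

    `usages` is an iterable of (concept, name, source_type) tuples.
    Confidence is each name's share of the total weighted usage
    recorded for its concept, so the values for one concept sum to 1.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for concept, name, source_type in usages:
        totals[concept][name] += SOURCE_WEIGHTS[source_type]

    confidences = {}
    for concept, names in totals.items():
        total = sum(names.values())
        confidences[concept] = {n: w / total for n, w in names.items()}
    return confidences


def consensus_name(confidences, concept):
    """Pick the highest-confidence name (the "I'm Feeling Lucky" view)."""
    names = confidences[concept]
    return max(names, key=names.get)


# Placeholder records: two sources use "Name A", one uses "Name B".
usages = [
    ("concept-1", "Name A", "peer_reviewed"),
    ("concept-1", "Name A", "database"),
    ("concept-1", "Name B", "non_peer_reviewed"),
]
confidences = name_confidence(usages)
best = consensus_name(confidences, "concept-1")
```

A real system along the lines described above would presumably also weight
each source by its own derived credibility and iterate, which this sketch
does not attempt.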

> Further, what happens given that most fritillaries have at least 2 or
> 3 subspecies? Does the EoL give a separate page for each subspecies?

Don't think in terms of static "pages" -- think in terms of a dynamically (and
automatically) assembled "table of contents" for each taxon, accessible
through robust notification services to experts with the capability of
correcting mistakes and fleshing out content with just a few mouse clicks
and keystrokes. I don't know that it will ever get to that stage; but then
again, I don't know any reason why it shouldn't.

> What we NEED is a mechanism to 
> resolve taxonomic disputes! 

Well, with our nifty algorithms in place, the disputes can be not only
identified, but quantified, and all digitized information relevant to the
disputed groups can be collectively accessed and efficiently exchanged
among interested taxonomists.  I doubt this will lead to the resolution of
all disputes, but I bet it would help.
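
One way a dispute might be quantified, sketched in Python under stated
assumptions: treat the weighted usage of each competing name as a share and
take the normalized Shannon entropy of those shares. The choice of entropy as
the measure, the function name, and the example weights are hypothetical;
nothing here is an existing EoL or Taxacom tool.

```python
import math


def dispute_score(weights):
    """Normalized Shannon entropy of the weights of competing names.

    `weights` maps each competing name to its weighted usage.
    Returns 0.0 when only one name is in use (no dispute) and 1.0
    when usage is split evenly among all competing names.
    """
    total = sum(weights.values())
    shares = [w / total for w in weights.values() if w > 0]
    if len(shares) < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in shares)
    return entropy / math.log(len(shares))


# Placeholder weights for three hypothetical taxa.
settled = dispute_score({"Name A": 4.0})
lopsided = dispute_score({"Name A": 5.0, "Name B": 1.0})
contested = dispute_score({"Name A": 2.0, "Name B": 2.0})
```

Sorting taxa by such a score would give a ranked list of the groups where
interested taxonomists most disagree.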

> Is it reasonable to expect that you are going to be able to 
> "mine" enough original literature and automate the process 
> enough to build pages for all the taxa for which there are no 
> participating authorities? 

I think so, yes.

> And if you mine and automate, who 
> is going to check to make sure that the mined data are not in 
> error? 

All of us, while using new data management tools designed to make our
day-to-day jobs easier.

> Or is the idea that 90% of the 
> pages in the EoL *will* be generated automatically, without 
> any human input? 

It's all human input -- even the scanned literature.  I suspect that the
rate of new error introduced by the ever-improving (through iterative
learning) processing techniques will ultimately prove to be lower than the
error rate already extant (through human error) in available literature.

> Ultimately, then, I hope that the EoL can attract enough 
> money to start paying for content delivery. 

My hope as well, for sure.

