[Taxacom] iSpecies with Wikipedia

Thompson, Chris Chris.Thompson at ARS.USDA.GOV
Thu Mar 27 10:43:59 CDT 2008


The problem Brian is trying to convey is that there needs to be one good
answer for end users IF end users are going to make good decisions.

Consider what the Bush team is already doing about endangered species,
global warming, etc. If you allow them any classification they like,
then there will be no endangered species, no global warming, etc. Just
like Mister Bush's mission accomplished four and a half years ago and our
booming economy now.

Sorry, Rich, scientists may argue among themselves but they need to
deliver good, well-supported consensus-based answers. And for us, that is
a single classification.

From Washington

F. Christian Thompson
Systematic Entomology Lab., ARS, USDA
c/o Smithsonian Institution MRC-0169
PO Box 37012
Washington, D. C. 20013-7012
(202) 382-1800 voice
(202) 786-9422 fax
www.diptera.org Diptera Website

-----Original Message-----
From: taxacom-bounces at mailman.nhm.ku.edu
[mailto:taxacom-bounces at mailman.nhm.ku.edu] On Behalf Of Richard Pyle
Sent: Thursday, March 27, 2008 10:42 AM
To: TAXACOM at mailman.nhm.ku.edu
Subject: Re: [Taxacom] iSpecies with Wikipedia

Brian Tindall wrote: 

> Many end users would like one answer.

And if they're "Feeling Lucky", then there's no reason they can't have one
answer.  An algorithm to assess the existing set of classifications and
allow some sort of consensus to emerge is relatively trivial, compared to
existing ranking algorithms used by search engines and such.  Such an
algorithm would track things like where different classifications are
congruent, and to what extent; weighting based on how many different
knowledgeable users have individually ranked the different classifications;
weighting based on how many publications have emulated which classifications
(and where those were published, and when); and a bunch of other various
factors that shouldn't be too hard for a group of clever algorithms and
bioinformatics folks to hash out.  And, of course, the algorithm could be
tweaked iteratively over time.  With such an algorithm, you could not only
provide an "I'm Feeling Lucky" classification, but could also provide a
confidence metric for each node based on homogeneity/heterogeneity of the
various existing classifications.
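
A minimal sketch of the weighted-vote idea described above, offered only as
an illustration and not as anyone's actual implementation.  It assumes each
asserted classification is reduced to a taxon-to-parent mapping with a single
numeric weight standing in for "expertness"; the function name, the weights,
and the taxon names (Aus, Cus, Aidae) are all hypothetical.  Each node is
voted on independently, so a real system would still have to check that the
result forms a consistent tree.

```python
from collections import defaultdict


def consensus_classification(classifications):
    """Pick a consensus parent and a confidence score for every taxon.

    classifications: list of (weight, {taxon: parent}) pairs, where the
    weight is an assumed stand-in for how much a source is trusted.
    Returns {taxon: (consensus_parent, confidence)}, with confidence in (0, 1].
    """
    # votes[taxon][parent] = total weight of sources asserting that placement
    votes = defaultdict(lambda: defaultdict(float))
    for weight, parent_of in classifications:
        for taxon, parent in parent_of.items():
            votes[taxon][parent] += weight

    result = {}
    for taxon, tally in votes.items():
        total = sum(tally.values())
        # Highest weight wins; ties are broken alphabetically so the
        # outcome does not depend on input order.
        parent, top = min(tally.items(), key=lambda kv: (-kv[1], kv[0]))
        # Confidence is the share of weight agreeing with the winner:
        # 1.0 means all sources placing this taxon agree.
        result[taxon] = (parent, top / total)
    return result


# Three hypothetical sources; the first is trusted twice as much.
sources = [
    (2.0, {"Aus bus": "Aus", "Aus": "Aidae"}),
    (1.0, {"Aus bus": "Cus", "Aus": "Aidae"}),
    (1.0, {"Aus bus": "Aus"}),
]
result = consensus_classification(sources)
for taxon, (parent, confidence) in sorted(result.items()):
    print(f"{taxon}: parent={parent}, confidence={confidence:.2f}")
```

Here the confidence is simply the fraction of total weight behind the winning
placement, which is one crude way to express the homogeneity/heterogeneity of
the sources at a node; publication counts, dates, or user rankings could feed
into the weights in the same way.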

On the scale of obstacles separating us from bioinformatic utopia, I think
that developing and implementing such an algorithmic approach is down near
the "solvable" end of the spectrum.

> But what is that consensus based upon and what happens if the 
> experts generally agree that "the preferred classification" 
> proposed by these providers is misleading? 

Hmmm...who are the "experts", if not the asserters of classifications?  I
only listed the "big" classification asserters because they are broad in
scope, and familiar to most Taxacom readers.  As I said, I don't see why
there can't be as many classifications as there are
individuals/organizations willing to assert them.  Experts are individuals,
and if they are willing to assert classifications, then they are all part of
the mix.  And their "expertness" would be reflected in the weighting
algorithm for generating the "IFL" classification (by whatever metric one
deems appropriate for representing "expertness").

> Harvesting and 
> indexing across multiple websites doesn't serve to correct 
> such problems. It just multiplies them and gives the 
> impression that the majority is correct and the minority (in 
> this case the experts) is wrong!

That's why the algorithm isn't at the "super easy" end of the spectrum --
just near the "solvable" end.  Besides, there is no reason why any
individual expert's assertions of classifications couldn't be made
through the internet, provided an appropriate platform and the
software -- exactly the sort of platform and software that (I hope) EoL,
GBIF, and others are planning to develop (based on TDWG standards and
protocols that are already in development).

> I just wonder what the ratio is of the figures "knowledgeable 
> users" to "less-knowledgeable users" to "average users who 
> are looking to the web to provide an answer"? Is it 1:1:1, 
> 100:10:1 or 1:10:100?

I don't think it matters what the ratio is -- as long as tools to
accommodate the different needs are in place.

> I share some of Mary Barkworth's reservations.

So do I -- which is exactly why I advocate the approach that I just
described.  Such a system helps tear down the "authoritarian wall" that
inevitably forms by traditional approaches to establishing "accepted"
classifications.


Richard L. Pyle, PhD
Database Coordinator for Natural Sciences
  and Associate Zoologist in Ichthyology
Department of Natural Sciences, Bishop Museum
1525 Bernice St., Honolulu, HI 96817
Ph: (808)848-4115, Fax: (808)847-8252
email: deepreef at bishopmuseum.org
