[Taxacom] data quality vs. data security: a survey
deepreef at bishopmuseum.org
Sat Feb 13 15:50:31 CST 2010
I guess we just have different perspectives of where the "serious"
duplication-of-effort problems are in our community.
Speaking as a hard-working, dedicated taxonomist myself (when I can find the
time...), I want to make sure that any contribution I make to taxonomy is
maximally available to all current and future interested parties. You and I
just seem to have different ideas about how best to achieve that goal.
> -----Original Message-----
> From: Stephen Thorpe [mailto:s.thorpe at auckland.ac.nz]
> Sent: Saturday, February 13, 2010 11:45 AM
> To: Richard Pyle; 'TAXACOM'
> Subject: RE: [Taxacom] data quality vs. data security: a survey
> Hi Rich,
> No, I don't buy it!
> >Every time information about a species, a taxonomic publication
> >citation, etc., etc. is typed by humans on a keyboard (whether it be
> >typed into a manuscript, a database, a wikispecies page, or
> >wherever), that's duplication of effort. Individually, it seems
> >trivial -- but in aggregate it is most certainly *not* trivial
> First off, if someone types a citation into a wikispecies
> page, it may in some sense be a duplication of effort if
> someone else has already typed it into something else, or an
> "acronym" or ten have already "harvested" it, but since it
> was typed into wikispecies free of charge, it isn't a SERIOUS
> duplication of effort (on the part of the wikispecies
> contributor). What is a SERIOUS duplication of effort is when
> science funding goes individually to several different
> aggregators to each put the citation in their own particular
> database, and even worse when all they are in fact doing is
> "harvesting" the information from an existing taxon specific
> database. The aggregators are merely parasites ...
> >While there is certainly some overlap among them, the duplication
> >is by no means "massive". To say so reveals a poor understanding
> >about what these different initiatives actually do
> I may not know what they do (behind the scenes), but I know
> what they give the end user, in terms of content, and it just
> isn't very much at all, at least for GBIF, EOL, COL, and the
> like. All they do is "harvest" names and create stubs. I
> don't want a nice looking map of the world on a species page
> if there are no points plotted on it, or if there are so few
> points plotted compared to the actual distribution. How
> "massive" is "massive", in terms of overlap?
> >You seem to be confusing "Aggregation" with "Integration". Google
> >is an aggregator (an indexer, really -- like GBIF)
> OK, so why do we need GBIF, when we already have Google? I am
> NOT, obviously, saying that Google is sufficient for all our
> needs - far from it! I am saying that an expensive entity
> like GBIF is not much better than Google.
> This seems to be what is going on: dedicated taxonomists
> (like Bob, for example) work darn hard for relatively little
> reward, creating new taxonomic knowledge. Then, if you are
> lucky, that knowledge gets integrated into either a taxon
> specific database, and/or (if I have anything to do with it)
> Wikispecies. So far, so good. It is what happens next that is
> the problem! Increasing numbers of "parasites" then make far
> more money and have a far easier life than Bob by
> "harvesting" the names from the taxon specific databases, and
> creating skeleton pages on some site that promises so much,
> but never seems to end up delivering much in terms of actual
> content! If you could get actual useful content out of these
> sites, then fine, but all too often you just find a map
> devoid of points, and a page devoid of content!
> From: taxacom-bounces at mailman.nhm.ku.edu
> [taxacom-bounces at mailman.nhm.ku.edu] On Behalf Of Richard
> Pyle [deepreef at bishopmuseum.org]
> Sent: Sunday, 14 February 2010 7:47 a.m.
> To: 'TAXACOM'
> Subject: Re: [Taxacom] data quality vs. data security: a survey
> Hi Stephen,
> > OMG! Did you really just say that! How is a massive duplication of
> > effort increasingly allowing a massive reduction of
> > redundant/duplicate effort????????
> It appears you didn't understand my post. As you say,
> "communication is a very difficult thing, particularly on
> topics as complex as this", so I'll try again. You seem to
> characterize all the various large-scale data aggregators
> (GBIF, EOL, COL, ALA, etc.) as "massive duplication of effort".
> While there is certainly some overlap among them, the
> duplication is by no means "massive". To say so reveals a
> poor understanding about what these different initiatives actually do.
> Every time information about a species, a taxonomic
> publication citation, etc., etc. is typed by humans on a
> keyboard (whether it be typed into a manuscript, a database,
> a wikispecies page, or wherever), that's duplication of
> effort. Individually, it seems trivial -- but in aggregate it
> is most certainly *not* trivial.
> > INTEGRATION is one thing, but MULTIPLE INTEGRATION INITIATIVES
> > leading to numerous clone or near clone integrated databases is
> > completely self-defeating!
> You seem to be confusing "Aggregation" with "Integration".
> Google is an aggregator (an indexer, really -- like GBIF).
> The DNS system is an architecture for integration. The
> equivalent of DNS for biodiversity information is what I mean
> by integration.
> Taxacom Mailing List
> Taxacom at mailman.nhm.ku.edu
> The Taxacom archive going back to 1992 may be searched with
> either of these methods:
> (1) http://taxacom.markmail.org
> Or (2) a Google search specified as:
> site:mailman.nhm.ku.edu/pipermail/taxacom your search terms here