[Taxacom] Chameleons, GBIF, and the Red List

Stephen Thorpe stephen_thorpe at yahoo.co.nz
Mon Aug 25 02:39:51 CDT 2014

But Rich, you are saying that from the perspective of a sophisticated user, who can see and deal with problems in data quality (at least in your fairly narrow area of expertise/interest). I'm thinking more about the situation that naive users face, and they are certainly in the majority.


On Mon, 25/8/14, Richard Pyle <deepreef at bishopmuseum.org> wrote:

 Subject: RE: [Taxacom] Chameleons, GBIF, and the Red List
 To: "'Stephen Thorpe'" <stephen_thorpe at yahoo.co.nz>, "'TAXACOM'" <taxacom at mailman.nhm.ku.edu>, "'Donat Agosti'" <agosti at amnh.org>
 Cc: "'Bob Mesibov'" <mesibov at southcom.com.au>, "'quentin groom'" <quentin at br.fgov.be>
 Received: Monday, 25 August, 2014, 7:35 PM
 Hi Stephen,
 Let's say there are half a million bogus records in GBIF.  That's 1% -- which is MUCH better than most datasets!  The truth is, there could be fifty million bogus records, and it would STILL be cleaner than a lot of datasets.
 A lot of this discussion is very silly.  The answer is easy:  If you find the data in GBIF useful, then use it.  If not, then don't.  As Donat pointed out, the amount of money GBIF uses is little more than a rounding error compared to the global biodiversity research budget; so this nonsense about "money could be better spent" blah, blah, blah, and "GBIF was built by bureaucrats for bureaucrats" blah, blah, blah is just plain looney.
 Having said that, I think this thread has been very useful and interesting overall.  It's really about clearing up misunderstandings, and thinking through plausible solutions to real-world biodiversity data problems.
 > -----Original Message-----
 > From: Taxacom [mailto:taxacom-bounces at mailman.nhm.ku.edu] On Behalf Of Stephen Thorpe
 > Sent: Sunday, August 24, 2014 9:11 PM
 > To: TAXACOM; Donat Agosti
 > Cc: Bob Mesibov; quentin groom
 > Subject: Re: [Taxacom] Chameleons, GBIF, and the Red List
 >What is the geographic resolution? What is the taxonomic
 > >This then would define data quality much more accurately<
 > I find it hard to take such fine grained data quality issues seriously, when GBIF has records like the bark louse Peripsocus maoricus classified as a virus (and attributed to a data provider, NZOR, which doesn't say that!) How many more bogus records are there in GBIF? Tens, hundreds,
 > On Mon, 25/8/14, Donat Agosti <agosti at amnh.org>
 >  Subject: RE: [Taxacom] Chameleons, GBIF, and the Red List
 >  To: "TAXACOM" <taxacom at mailman.nhm.ku.edu>
 >  Cc: "quentin groom" <quentin at br.fgov.be>, "Bob Mesibov" <mesibov at southcom.com.au>, "Stephen Thorpe" <stephen_thorpe at yahoo.co.nz>
 >  Received: Monday, 25 August, 2014, 6:53
 >  There are three points that appear again and again and need some thoughts.
 >  Money spent on a useless institution
 >  I think it is a myth that GBIF is siphoning off a lot of funding for biodiversity (taxonomy). I rather argue, if there is no GBIF, a lot of money would not be spoken for the biodiversity community at all. GBIF established itself, as a rare kind, from
 >  with the OECD, a long tedious process that people like Jim Edwards, Meredith Lane and Ebbe Nielsen initiated and went through long gruesome meetings. It is also partially enabled through the fact that the Convention on Biological Diversity asks the parties (countries) to build biodiversity observation systems (see e.g. two recently EU-funded projects, EU-BON and pro-iBiosphere, the former being requested from the EU to build the European leg of the Global Biodiversity Observation Network; the latter got
 >  funded in competition with all sciences and thus should be regarded as a resource that has been successfully created from
 >  But taxonomists do a very bad job in creating such opportunities: This current discussion just confirms it. We complain about bad data - data we create and should
 >  At the same time we are not able to provide a reference list of all the species in the world. We are not able to provide a response to the need to build the basis for a global biodiversity monitoring system as a basis for biodiversity conservation. This goal would define the quality of the data we need, and will have to
 >  Even if GBIF did compete for resources: Compete against whom? Stephen Thorpe? An institution? GBIF is a global institution that has members (States) and thus competes at a global level. The global level, though, comprises more than 3400 herbaria worldwide, an unknown number of natural history museums, an unknown number of
 >  awarded in the area of biodiversity. Is this money that GBIF gets really that big, or rather a small fraction? It is rather the latter.
 >  In a sense, the discussion we are having regarding the use of GBIF is very similar to the discussion on the usefulness of the United Nations. There are plenty of
 >  out there that consider this a waste of money. At the same time, if there is no UN, there is no system that allows States to meet, to bring up such obviously unrelated issues and discussions at global level, from Regulation of Small Firearms to Biodiversity (e.g. Rio Earth Summit). We are global, we cannot deny it even if many of us want to, such as the Climate Change deniers; we need global solutions that cannot be found and created on an ad hoc basis.
 >  Look at scientific names. Here we do not have a GBIF that we can complain about. We have many one-man shows of a sort that do not talk to each other, and thus a critical assessment like the one on occurrences is NOT even possible. And this is pretty much reflected in, and the cause of, the complaint about taxonomic correctness in GBIF data. We do not try to get our acts together, like Ebbe, Meredith and Jim many years ago, to build such a system. For example, we do not take BHL as a basis to build a global catalogue of life that brings together the names, bibliographies and the published record, which would allow understanding each taxonomic name usage, to link the TNU to the original observation record cited and thus build a much more powerful GBIF. With few exceptions (i.e. Pensoft) we rather willingly defend publishing in a very obstructive way (PDF, not Open Access) that inhibits building such a system because of its prohibitive price tag to extract this information later, and even to discover this bit of information.
 >  We really need to put this discussion about the value of GBIF into a bigger context, and use the same criteria over our entire field of biodiversity information for other kinds of data.
 >  IUCN does a better job
 >  IUCN with its Red List has a much more direct effect on policy making, especially in conservation. Therefore, the data they use should be much more scrutinized, and in fact all the raw data should be available to challenge the red listing. Can you do this? Why aren't you questioning their data and results as you do with GBIF? Are you happy that you have to be for access to their data one by one? Is it legitimation enough that you have a monopoly that nobody can challenge?
 >  Are you happy with a polygon or with observation data? Can you get all the data the UNEP/WCMC is using for their analyses? No, you cannot in many cases for various reasons, and thus you cannot challenge them in the same
 >  I think, I prefer GBIF-data with all its weaknesses as opposed to analyses I just have to
 >  because of an "inner circle". I can challenge GBIF data, complain about it, and hopefully use the lessons learned to create a better system for the future that delivers what I want.
 >  Data need be cleaned up
 >  Yes, data need be cleaned up, BUT the best cleaning of data doesn't help to create the data standards that you envision: Properly identified with spatially high and accurate resolution data. You only get out what you put in. You can't create a highly precise GPS read from most of the label data we have. Also, as much as we complain about data quality, we should equally stress getting metadata that defines the source of the data: Why has it been collected? What is the geographic resolution? What is the taxonomic authority? This then would define data quality much more accurately.
 >  Donat
 >  -----Original Message-----
 >  From: Stephen Thorpe [mailto:stephen_thorpe at yahoo.co.nz]
 >  Sent: Sunday, August 24, 2014 11:46 PM
 >  To: Donat Agosti; Bob Mesibov
 >  Cc: TAXACOM; quentin groom
 >  Subject: Re: [Taxacom] Chameleons, GBIF, and the Red List
 >  Let's take a deep breath, stand back, and look at this again, shall we? The key points are getting obscured somewhat by ranting.
 >  GBIF harvests data from various data providers, many of which are already freely available online. In these cases, we have a dilemma: on the one hand, it is convenient to have one place to go for data from various providers. But the data providers are typically kept more up-to-date than GBIF (which only harvests data occasionally). One solution would be for GBIF to simply point to data provider sites (so you could look up a taxon on GBIF, and it would tell you where to go for data). But, then GBIF would be very simple, and not much better than Google! So instead, GBIF offers some sort of "standard data output" for the various data providers, and some sort of overall analysis of all the data, though the details are a bit "vague", and the overall analysis may break down for multiple data providers of varying quality.
 >  One crucial point is that GBIF in no way "validates/confirms/annotates" data from data providers. There is no "quality filter" (or even "quality
 >  which can be standardly applied to all data harvested by GBIF.
 >  I guess the obvious question now is "so what"? Well, given the amount of funding that GBIF chews up, we really must ask ourselves if it is money well spent? Who uses GBIF and why? I have already offered an answer to that one (i.e. it is effectively a great big bureaucratic
 >  Stephen
 >  On Mon, 25/8/14, Bob Mesibov <mesibov at southcom.com.au> wrote:
 >   Subject: Re: [Taxacom] Chameleons, GBIF, and the Red List
 >   To: "Donat Agosti" <agosti at amnh.org>
 >   Cc: "TAXACOM" <taxacom at mailman.nhm.ku.edu>, "quentin groom" <quentin at br.fgov.be>
 >   Received: Monday, 25 August, 2014, 9:27 AM
 >   Donat Agosti wrote:
 >   "I feel, the discussion is too much centered on
 >   that has not the information content needed, like studying a Landsat image at 30 meter resolution and discussing what tree species is shown"
 >   Excellent metaphor! For most scientific uses, you need much more data than is provided by any available database. Can you get everything you need online? No. Do existing aggregators like GBIF offer a helpful starting point? For some people and some uses,
 >   But now the important question: when you have all the information you need, and clean it and enrich it, do you publish it online in a usable form? I don't know what Quentin Groom's project was about, nor do I know if he published his final data.
 >   In my own case, every one of my 12123 locality records for Australian Millipedes is freely available in CSV format (and in abbreviated form in KML) from the 'Millipedes of Australia' website. This store is larger and more up to date and contains fewer errors than any aggregator store, or even the combined data providers' stores (because certain providers have been slow to add my edits to particular records, or to upload them to their own or aggregator stores).
 >   But if
 >   like me and Quentin publish data freely to the Web and aggregators don't use this improved/extended data, aggregation looks less and less useful.
 >  --
 >   Dr Robert Mesibov
 >  Honorary Research
 >   Queen
 >   Museum and Art Gallery, and School of Land and Food, University of Tasmania
 >   Home contact: PO Box 101, Penguin, Tasmania,
 >  7316
 >   (03) 64371195; 61 3
 >   Taxacom Mailing List
 >   Taxacom at mailman.nhm.ku.edu
 >   http://mailman.nhm.ku.edu/cgi-bin/mailman/listinfo/taxacom
 >   The Taxacom Archive back to 1992 may be searched at: http://taxacom.markmail.org
 >   Celebrating 27 years of
 >   Taxacom in 2014.
