Let's take a deep breath, stand back, and look at this again, shall we? The key points are getting obscured somewhat by ranting.

GBIF harvests data from various data providers, many of whose data are already freely available online. In these cases, we have a dilemma: on the one hand, it is convenient to have one place to go for data from various providers. On the other hand, the data providers' own stores are typically kept more up to date than GBIF (which only harvests data occasionally). One solution would be for GBIF to simply point to data provider sites (so you could look up a taxon on GBIF, and it would tell you where to go for data). But then GBIF would be very simple, and not much better than Google! So instead, GBIF offers some sort of "standard data output" for the various data providers, and some sort of overall analysis of all the data, though the details are a bit "vague", and the overall analysis may break down for multiple data providers of varying quality. One crucial point is that GBIF in no way "validates/confirms/annotates" data from data providers. There is no "quality filter" (or even "quality assessor") which can be applied as standard to all data harvested by GBIF.

I guess the obvious question now is "so what?" Well, given the amount of funding that GBIF chews up, we really must ask ourselves whether it is money well spent. Who uses GBIF and why? I have already offered an answer to that one (i.e. it is effectively a great big bureaucratic "placebo").


 Donat Agosti wrote:
 "I feel, the discussion
 is too much centered on data that has not the information
 content needed, like studying a Landsat image at 30 meter
 resolution and discussing what tree species is
 metaphor! For most scientific uses, you need much more data
 than is provided by any available database. Can you get
 everything you need online? No. Do existing aggregators like
 GBIF offer a helpful starting point? For some people and
 some uses, yes.
 But now the
 important question: when you have all the information you
 need, and clean it and enrich it, do you publish it online
 in a usable form? I don't know what Quentin Groom's
 project was about, nor do I know if he published his final
 In my own case, every
 one of my 12123 locality records for Australian Millipedes
 is freely available in CSV format (and in abbreviated form
 in KML) from the 'Millipedes of Australia' website.
 This store is larger and more up to date and contains fewer
 errors than any aggregator store, or even, the combined data
 providers' stores (because certain providers have been
 slow to add my edits to particular records, or to upload
 them to their own or aggregator stores).
 But if people like me and Quentin publish data
 freely to the Web and aggregators don't use this
 improved/extended data, aggregation looks less and less
