[Taxacom] Data quality in aggregated datasets

Donald Hobern [GBIF] dhobern at gbif.org
Wed Apr 24 03:51:58 CDT 2013

Thanks to Lee Belbin for his comments on this thread.  I believe we all
recognise the issues and what needs to be fixed.  The real question is how
we work together to deliver fully-connected solutions.  


We need an interconnected set of processes and tools that enable everyone to
contribute the knowledge and expertise that they have.  We need to ensure
that every such contribution is preserved and adds to a persistent
interconnected resource.  Achieving this will definitely depend on
aggregator-level solutions and consistent data management that applies
across all taxonomic groups and geographic regions.  It will also involve a
commitment from us all to work collaboratively to manage digital knowledge
of biodiversity.  All relevant data should be stored and preserved for
posterity without the uncertainty that comes from short-term project funding
and isolated databases.  Everyone should be able to refer reliably and
stably to every piece of data (every nomenclatural record, every phylogeny
or classification, every specimen record, every image, every sequence, etc.)
and we should ensure that we capture every contribution to our understanding
of each data item.  This means that every time an expert such as Bob finds a
problem in any data, it is flagged immediately and, wherever possible, fixed
right away, so that all other users benefit without delay.


What the ALA and GBIF and other aggregators have done so far is clearly
still a long way from this level of interconnectedness, but we are
definitely making significant progress.  Getting there will involve a great
deal more work, and greater buy-in from relevant agencies, institutions,
researchers and research infrastructures.  The prize will be a digital
knowledgebase for biodiversity that supports much more sophisticated
interrogation and uses than our historical (primarily paper-based)
knowledgebase ever could.


Bob’s paper reminds us of many of the things we still have to do.
Speaking for GBIF, I can say that we are far from complacent over
this.  Our goal is to automate what we can, but also to connect
efficiently with everyone who can contribute the additional expertise that
comes from direct knowledge of each taxonomic group.





Donald Hobern - GBIF Director - dhobern at gbif.org 

Global Biodiversity Information Facility http://www.gbif.org/ 

GBIF Secretariat, Universitetsparken 15, DK-2100 Copenhagen Ø, Denmark

Tel: +45 3532 1471  Mob: +45 2875 1471  Fax: +45 2875 1480


