[Taxacom] Chameleons, GBIF, and the Red List
Chuck.Miller at mobot.org
Mon Aug 25 10:47:01 CDT 2014
Re: "I prefer a different model, where data is considered to be "social" and we can all annotate it (in effect, the museums are themselves simply one annotator)."
Are you talking about a "TripAdvisor" or "Yelp" kind of application for biodiversity data records?
From: Roderic Page [mailto:Roderic.Page at glasgow.ac.uk]
Sent: Thursday, August 21, 2014 5:52 PM
Cc: Bob Mesibov
Subject: Re: [Taxacom] Chameleons, GBIF, and the Red List
A couple of quick comments.
Regarding expertise, I agree that there is lots that non-experts can do, but also take Doug's point about the value of taxonomic input. I once saw a talk by Charles Godfray where he was describing the role taxonomic expertise played in building maps of mosquitoes that transmitted malaria (see, e.g. http://dx.doi.org/10.1371/journal.pmed.1000209 ). He said the role of the taxonomist wasn't the oft-assumed one of identifying specimens, instead it was to interpret distributional data from the literature in the light of changing taxonomies.
Where I differ from James is that I'm not really a fan of an annotation model where the focus is on annotating data and pushing those annotations back to the "primary providers". Given the scale of the problem, and that evidence is likely to be widely distributed (problems are often only uncovered when data is aggregated from different sources) I prefer a different model, where data is considered to be "social" and we can all annotate it (in effect, the museums are themselves simply one annotator). There's a bit more about this here: http://iphylo.blogspot.co.uk/2014/04/more-on-annotating-biodiversity-data.html Note that I'm not disputing that it would be nice to feed annotations back to collections, but that this isn't the main goal (and I think that it's pretty clear that there is going to be a huge bottleneck involving this process).
On Thu, Aug 21, 2014 at 11:52 AM -0700, "James Macklin" <james.macklin at gmail.com> wrote:
Sorry, a little slow... I also think it is important to stress the data quality life cycle here. What we still do not do well is connect the expert work done on these specimens or their digital derivatives (or observations, I guess), which is not done by the source/owner, back to them so the source/owner can clean/update the record and provide it to GBIF and/or other aggregators. The literature is one path where there is reference to the specimens used but as we know not everything ends up published this way. Further, extracting the information from the literature can be challenging even today. Lyubomir and Pensoft make this easy (thanks!) but we are still a long way from convincing other publishers to include the specimen data in a readily accessible form (or even mandating its presence as evidence). Another way to get expert knowledge back to the source is through annotation. Those of you who know me realize that my colleagues and I have spent a fair bit of time studying this problem and coming up with solutions (FilteredPush). I would say that in general there are now reasonable solutions for achieving distributed annotation at various levels of complexity but there is still a challenge/bottleneck in pushing these annotations back to the source and into their collection management system. The bottleneck is potentially at the source that must process the annotations. If we automate (or even semi-automate) the annotation process through curation workflows, something my colleagues and I are now focusing on, we could potentially flood the "curators" of the specimens/data. Then the question becomes how much the owners are committed to processing potentially valuable modifications/additions and adding them to their database. Certainly data curation and positions to support it are in their infancy. The annotations that are not processed by the source still have value and can inform the aggregators but have to be dealt with in a slightly different manner.
So, this returns to the issue of when GBIF takes in a record update (or a new record), what metadata follows it to say it has been changed (created) based on some form of expertise...
I think we also need to be careful of the use of the term "expert." I think it is reasonable to assume that a taxonomist is not going to be any better at georeferencing a specimen based on the collecting event data (assuming this person was not associated with the collecting event) than a geographer, historian or even a citizen that happens to live near where the event took place. So, in the case of the Chameleon paper, and others like it, the issue really relates to taxonomic expertise and thus the name that appears associated with the record and not the entire record necessarily.
Papers like the Chameleon paper are quick to judge the end product but do not take into consideration what an achievement it is to simply have a GBIF resource and the challenges the greater "we" have overcome just to get this far! Let's stop highlighting the problem yet again and get to work on solving it and making the GBIF resource more valuable to all ;-)
James Macklin, Ph.D.
Botany and Biodiversity Informatics
Associate Curator of the AAFC National Vascular Plant Collection (DAO) Agriculture and Agri-Food Canada Ottawa, Ontario, Canada
On Thu, Aug 21, 2014 at 6:20 AM, Roderic Page <Roderic.Page at glasgow.ac.uk> wrote:
Just to follow up on this discussion:
Stephen, I think I often come across as grumpy, but your cynicism makes me look like a fanboy, so thank you for that ;) Can we maybe assume that GBIF's primary goal isn't to keep bureaucrats happy, that it's genuinely trying to provide access to basic biodiversity information in one place because that seems like a worthwhile goal - leaving aside whether GBIF is the best way to tackle that goal?
Bob, if I understand your argument correctly, it's that access to mostly unvetted biodiversity data isn't much use, and in your view that's mostly what GBIF is serving up. Assuming that it would be nice to have access to good-quality distributional data in one place, what if GBIF provided, say, distributions of species that had been cleaned and had some degree of expert scrutiny? In other words, say a researcher publishes an evidence-based distribution map, what if that was stored on GBIF in a citable form (e.g., had a DOI), and others could download that distribution and make use of it?
I guess this was the thinking behind the now abandoned SDR project (see https://code.google.com/p/gbif-sdr/wiki/PortalIntegration ), and is perhaps where the Map of Life http://mol.org is headed (although at the moment it's simply showing you a bunch of distributions from different sources).
Lyubo, I couldn't agree more, having links to literature related to a record would be great. Many of our online biodiversity databases are devoid of links to the evidence for a particular assertion, but as more and more literature comes online we can do something to fix that. +1 for extracting from the literature, especially if we can automate this at scale (although that will give Bob nightmares).
Professor of Taxonomy
Institute of Biodiversity, Animal Health and Comparative Medicine College of Medical, Veterinary and Life Sciences Graham Kerr Building University of Glasgow Glasgow G12 8QQ, UK
Email: Roderic.Page at glasgow.ac.uk
Tel: +44 141 330 4778
Taxacom Mailing List
Taxacom at mailman.nhm.ku.edu
The Taxacom Archive back to 1992 may be searched at: http://taxacom.markmail.org
Celebrating 27 years of Taxacom in 2014.