[Taxacom] Privacy laws and Science [SEC=UNCLASSIFIED] [ Scanned for viruses ]
p.kirk at cabi.org
Mon Jun 19 01:48:16 CDT 2006
There is another 'element' in the debate about data quality (and I fully support the post by Jim regarding explicit taxonomic concepts as a critical part of any identification) and that is the difficulty factor. Accepting that cryptic species can screw it up for everyone except molecular biologists ... ;-) ... every taxonomist and his dog can identify the giant panda, giant redwood, duck-billed platypus, aardvark, etc., but most species are not as easy as this, so ...
The quality of an identification can be assigned a value based on the competence of the identifier at the point the identification was made and the difficulty factor for the taxon identified.
Users outside taxonomy who want taxonomic data must realize that providing data that is fit for purpose is not a simple task. Would these users, having been accused of a serious crime, think that they can just hire a lawyer and instantly march them into court to defend them? NO, there would be weeks or months of preparation time, researching circumstances, case law etc. And so it is with some questions of taxonomy - there is no quick fix; they require time for research to make sure the data set is well briefed to answer the relevant questions.
As you can see I'm in a sort of grumpy and adversarial mood today so I guess my identification rating will be down a notch or two ... ;-)
From: taxacom-bounces at mailman.nhm.ku.edu
[mailto:taxacom-bounces at mailman.nhm.ku.edu]On Behalf Of
taxacom2 at achapman.org
Sent: 19 June 2006 02:06
To: taxacom at mailman.nhm.ku.edu
Subject: Re: [Taxacom] Privacy laws and Science [SEC=UNCLASSIFIED] [
Scanned for viruses ]
I wouldn't disagree that users "need to be aware of the limitations of their data" and it is for this reason that we need to provide documentation on the specimens that helps them to identify the limitations and to select the data that is of value to their particular use.
If I am running a species model where I may be using hundreds (if not thousands) of specimens - I can't individually check the identification of every specimen - I have to rely on the data I am being given and on the metadata associated with those data. The quality of the taxonomic identification is just one part - albeit perhaps the most difficult to document. For the spatial part we are coming up with some good standards (not that everyone is using or will use them) that will give a Maximum Uncertainty radius. I can then say - OK - anything that has a Maximum Uncertainty of >5km is no good for my analysis - I can discard those. But without the metadata I won't know whether a record that says USA and has an x,y coordinate in the geographic centre of the USA (yes - there are such points) is really an accurate point in the middle of the USA - when in reality I am dealing with a coastal fish species.
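As a rough illustration of that spatial filter, here is a minimal Python sketch. The record layout, the field name max_uncertainty_m and the sample values are all made up for the example and do not come from any particular standard; the 5 km cut-off is the one mentioned above. Dropping records with no documented uncertainty is a choice made for this sketch, not a rule.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    # Hypothetical record layout, for illustration only.
    specimen_id: str
    latitude: float
    longitude: float
    max_uncertainty_m: Optional[float]  # None means the uncertainty was never documented


def usable_for_analysis(records: List[Record], threshold_m: float = 5000.0) -> List[Record]:
    """Keep only records whose documented uncertainty is within the threshold.

    Records without any uncertainty metadata are dropped as well, because
    there is no way to tell a precise point from a country centroid.
    """
    return [
        r for r in records
        if r.max_uncertainty_m is not None and r.max_uncertainty_m <= threshold_m
    ]


# Made-up sample data.
records = [
    Record("A1", -27.56, 151.95, 100.0),        # well-georeferenced locality
    Record("A2", 39.83, -98.58, 2_500_000.0),   # centroid-style point with a huge radius
    Record("A3", -33.86, 151.21, None),         # no uncertainty documented
]

kept_ids = [r.specimen_id for r in usable_for_analysis(records)]
print(kept_ids)
```

With the metadata present, the centroid-style record is excluded by the radius alone; without it, the same point would look as good as any other.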
Taxonomic identification is no different - using my suggested list - I might discard anything that didn't have 'high certainty' from a regional or world expert for my particular analysis. That is my judgement to make based on the metadata and documentation supplied. I can do that easily with one short SQL statement. But if that information is not available - I have to make an individual judgement in each case as to whether 'Fred Bloggs' knew what he was talking about. For the general user without documentation on quality - the data becomes of less value and may not be used at all. I am sure that we all want our data to be used to the maximum extent possible for the benefit of the environment and mankind generally. We have to cater for the non-taxonomist - it will pay off in the long run.
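To show the kind of short SQL statement meant here, the following is a sketch using Python's built-in sqlite3 module. The table name specimen, its columns and the sample rows are hypothetical; the codes A-1 and B-1 are the 'high certainty' entries for world and regional experts in the qualifier list quoted further down.

```python
import sqlite3

# In-memory database with a hypothetical specimen table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE specimen ("
    "id TEXT PRIMARY KEY, determiner TEXT, verification TEXT)"
)
conn.executemany(
    "INSERT INTO specimen VALUES (?, ?, ?)",
    [
        ("S1", "World Expert", "A-1"),     # world expert, high certainty
        ("S2", "Fred Bloggs", "C-2"),      # non-expert, reasonable certainty
        ("S3", "Regional Expert", "B-1"),  # regional expert, high certainty
        ("S4", None, "U"),                 # unknown
    ],
)

# The one short statement: keep only high-certainty determinations
# made by a world or regional expert.
rows = conn.execute(
    "SELECT id FROM specimen WHERE verification IN ('A-1', 'B-1') ORDER BY id"
).fetchall()

selected = [row[0] for row in rows]
print(selected)
conn.close()
```

The user never has to recognise a determiner's name; the qualifier alone carries enough to make the cut.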
From Mary Barkworth <Mary at biology.usu.edu> on 18 Jun 2006:
> Such a scheme is comforting, perhaps, to the non-taxonomist - but it
> would also be misleading in many instances. Perhaps we should face up to
> explaining that identification is not a simple task - in addition to the
> fact that taxonomists are humans. Users of data, all data, not just
> specimen data, need to be aware of the limitations of their data. It is
> an essential part of any analysis.
> A side issue: this discussion has made me realize that some of our
> specimens will show that they were identified by "Synonymy", an expert
> in many taxa. This is because our database reflects how we file
> specimens, not necessarily the most recent annotation. In general, we
> accept the annotations - but not necessarily the taxonomy its name
> reflects. Has anyone addressed this issue in posting information from
> their specimens?
> From: taxacom-bounces at mailman.nhm.ku.edu on behalf of
> taxacom2 at achapman.org
> Sent: Sun 6/18/2006 6:17 PM
> To: taxacom at mailman.nhm.ku.edu
> Subject: Re: [Taxacom] Privacy laws and Science [SEC=UNCLASSIFIED]
> One of the reasons I am proposing a simple taxonomic verification
> qualifier is not to replace the determiner and date (and ideally the
> extras that Jim Croft suggests), but to be additional to them. An
> executive summary if you like!
> Most of the respondents to the suggestions on this listserver are
> taxonomists, and they can perhaps make some judgement as to the value of
> the determination using the name of the determiner and date. In many
> cases, they will know the person personally, or at least know of them.
> But increasingly, users of our taxonomic information are not
> taxonomists, and the names of the determiner may mean absolutely nothing
> to them. For our data to be of most value to the broader community, we
> need some level of quality determination that the non-taxonomist users
> can use to decide quickly if the data is likely to be fit for their
> purpose.
> Also, regardless of whether the privacy laws may or may not apply to our
> data and the determiner's names, there is a perception out there by some
> collection institutions (as evidenced by the responses to the GBIF
> Survey alluded to earlier) that it does apply to them and they are thus
> NOT currently making the names of living individuals available.
> There are a lot of data now being made available via the GBIF Portal,
> and we have very little indication of the quality of many of those data.
> We need to start documenting our data to make it of value to the users,
> otherwise there is little value in making it available at all. Most of
> the users are not taxonomists and determiner 'Fred Bloggs' means nothing
> to them - important though that information is to other users.
> Taxacom seemed to strip off my earlier attachment - so for those that
> have not accessed the GBIF documents, below is a brief summary of my
> suggested qualifiers
> A-1 identified by World expert in the taxa with high certainty
> A-2 identified by World expert in the taxa with reasonable certainty
> A-3 identified by World expert in the taxa with some doubts
> B-1 identified by regional expert with high certainty
> B-2 identified by regional expert with reasonable certainty
> B-3 identified by regional expert with some doubts
> C-1 identified by non-expert with high certainty
> C-2 identified by non-expert with reasonable certainty
> C-3 identified by non-expert with some doubts
> D-1 identified by the collector with high certainty
> D-2 identified by the collector with reasonable certainty
> D-3 identified by the collector with some doubts
> U unknown
> I have also taken on board some suggestions, especially as to date.
> There was also the suggestion that the World Expert may have made a
> determination 10 years ago, and since then, the species has been
> transferred to a different genus, however the specimen has not since
> been re-determined. How should we handle these cases, etc.? There may
> be an implied re-determination here.
> Keep up the discussion - I think it is a valuable one to have
> Arthur D. Chapman
> Australian Biodiversity Information Services
> Toowoomba, Australia
> From Jim Croft <jrc at anbg.gov.au> on 18 Jun 2006:
> > Karen et al.
> > I fully agree with these sentiments, but they really do not go far
> > enough to
=== message truncated ==