[Taxacom] [tdwg] ESA 2008 Talk Using Semantic Web technologies to tie together disparate data about species .. Unique URI for each taxon

Kevin Richards RichardsK at landcareresearch.co.nz
Sun Jan 25 14:22:49 CST 2009


Pete

I think a good portion of your comments and questions here will take TDWGers and TAXACOMers back to old and thoroughly thrashed-out arguments, but it is always good to recap.

I think you will find the TAG and GUID subgroups of TDWG very interesting.  Over the past 5 years or so, it has become well recognised that GUIDs (Globally Unique Identifiers - UUIDs, URIs etc.) are a key component of working with global data, and of integrating and reusing those data.  This is explained on the TAG wiki site (http://wiki.tdwg.org/twiki/bin/view/TAG/WebHome), where one of the main points is the three legs of a stool that provide the foundation for a good architecture - GUIDs, Protocols and Ontologies.

One of the results of these working groups has been the investigation and adoption of a particular GUID type called the LSID (Life Science Identifier).  One of the driving reasons for preferring LSIDs is that they have a degree of separation from "hard coded" URLs, which can all too easily be broken.  The recommended metadata format for the response when resolving an LSID is RDF.  As RDF has shown itself to be a more elegant approach to modeling data and data standards, it has been adopted to some degree as the modeling format for some of the TDWG standards.  The major data components of interest to TDWG, including taxon names, have been developed into RDF ontologies and can be seen at http://wiki.tdwg.org/twiki/bin/view/TAG/LsidVocs.  So this will be of interest to you.
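
For anyone new to LSIDs, they are URNs of the form urn:lsid:authority:namespace:object, with an optional trailing revision.  A minimal sketch of pulling one apart is below; the function name, the Lsid type and the example.org identifier are made up for illustration and are not part of any TDWG software.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> Lsid:
    """Split urn:lsid:authority:namespace:object[:revision] into its parts."""
    parts = text.split(":")
    # Expect the two fixed prefixes plus three or four components.
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    authority, namespace, object_id = parts[2:5]
    if not (authority and namespace and object_id):
        raise ValueError(f"empty component in LSID: {text!r}")
    revision = parts[5] if len(parts) == 6 else None
    return Lsid(authority, namespace, object_id, revision)


# Hypothetical identifier, for illustration only.
example = parse_lsid("urn:lsid:example.org:names:12345")
```

Note that nothing in the identifier is a web address; the authority is looked up at resolution time, which is the "degree of separation" from hard coded URLs.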

I think you are touching on two very different areas of discussion below.  One of them is really "what is a species and how do we attach GUIDs to them?", which I think you are covering fairly well in your other thread.  The other area, and probably the easier one to discuss, is purely what GUID and data representation technologies to use for delivering the data we are interested in, and how to handle changes to that data.  Being from an IT background, I always view this problem from a data perspective and tend to think of a database record as the object being identified.  If you change the database record, and people were relying on the essential details of that record, then we could easily end up with invalidated data.  I therefore suggest that if any essential part of a data record is modified, the record should get a new GUID (and with my limited understanding of taxonomic intricacies, I would really think that if you are changing the "concept"/understanding/meaning of a species record, then it should be given a new GUID).
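
To make the "new GUID when an essential part changes" rule concrete, here is a rough sketch.  Everything in it is an assumption for illustration: the TaxonRecord fields, the choice of which fields count as essential, and the update_record helper.  Deciding what is "essential" is of course the hard, taxonomic part.

```python
import uuid
from dataclasses import dataclass, field, replace

# Assumption: changing either of these changes what the record means.
ESSENTIAL_FIELDS = {"name", "concept"}


@dataclass(frozen=True)
class TaxonRecord:
    name: str
    concept: str
    notes: str = ""
    guid: str = field(default_factory=lambda: str(uuid.uuid4()))


def update_record(record: TaxonRecord, **changes: str) -> TaxonRecord:
    """Return an updated copy; mint a new GUID only if an essential field changed."""
    essential_changed = any(
        key in ESSENTIAL_FIELDS and getattr(record, key) != value
        for key, value in changes.items()
    )
    if essential_changed:
        changes["guid"] = str(uuid.uuid4())
    # The original record is left untouched, so existing references stay valid.
    return replace(record, **changes)
```

With this, correcting a note keeps the GUID, while changing the concept yields a record with a fresh GUID and leaves the old record as it was for anyone who relied on it.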

So I think the answer to your question must be yes, we do have quite an interest in the topics you covered.

I hope this helps.

Kevin


PS I attempted to view your talk using the link you gave, but I was not sure what to do next to see the talk itself.  I attempted to browse to "Recorded Presentation", but that needed a login, which I do not have.


From: tdwg-bounces at lists.tdwg.org [mailto:tdwg-bounces at lists.tdwg.org] On Behalf Of Peter DeVries
Sent: Friday, 23 January 2009 3:58 p.m.
To: tdwg at lists.tdwg.org; Taxacom at mailman.nhm.ku.edu
Cc: lyricb at iastate.edu; Young at entomology.wisc.edu; raffa at entomology.wisc.edu; sbcarrol at wisc.edu
Subject: [tdwg] ESA 2008 Talk Using Semantic Web technologies to tie together disparate data about species .. Unique URI for each taxon

I don't know how many of you are working with Semantic Web technologies, but I did a talk at the
2008 Meeting of the Entomological Society of America in Reno.

You can see it here : http://esa.confex.com/esa/2008/webprogram/Paper39190.html

One of the issues that I have been struggling with is the need for a unique identifier for a species
concept that stays the same despite changes in taxonomic hypotheses. When I brought this up
earlier some mentioned that many considered the taxonomic hypothesis to be the species concept.

I thought that this approach was wrong because it unnecessarily required me to convince one of my
collaborators to adopt my taxonomic hypothesis when it was clear we were talking about the same
species.

To me a species is a real thing to which different taxonomic hypotheses are applied.

With this in mind I implemented a system where each species has a unique URI which resolves according
to the recommendations of the linked data community.

You can see this service at:  http://species.geospecies.org/

Here is the page for Culicidae http://species.geospecies.org/families/Culicidae

Here is an example of the URI for Aedes vexans

http://species.geospecies.org/spec_concept_uuid/0fcb5b7e-bcfc-4b56-b565-e1e38768badd/

For a web browser this will resolve to the human readable page; for a semantic web crawler
it will respond with RDF data.
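
Serving a page to browsers and RDF to crawlers from the same URI is usually done with HTTP content negotiation on the Accept header. The sketch below shows the general idea only; it is not a description of how the geospecies server is implemented, the function names are hypothetical, and it deliberately ignores q-values.

```python
from urllib.request import Request

RDF_XML = "application/rdf+xml"
HTML = "text/html"


def choose_representation(accept_header: str) -> str:
    """Pick the media type to serve for a given HTTP Accept header.

    Simplification: q-values are ignored, and RDF is served only when
    the client lists application/rdf+xml explicitly.
    """
    accepted = {
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    }
    return RDF_XML if RDF_XML in accepted else HTML


def build_rdf_request(uri: str) -> Request:
    """Build (but do not send) a request that asks for RDF rather than HTML."""
    return Request(uri, headers={"Accept": RDF_XML})
```

A typical browser Accept header does not list application/rdf+xml, so it falls through to the HTML page, while a crawler that sends the header built by build_rdf_request gets the RDF.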

Here is a journal reference marked up using the two major bibliography ontologies; it lists
the species concepts covered in the article. This allows publications to retain links to
species concepts despite changes in nomenclature.

http://rdf.geospecies.org/refs/sucaet2008wbr/index.rdf

Here is an RDF file of species observation records (the web version is currently password protected)

http://rdf.geospecies.org/observations/index.rdf

I have chosen the UUID as the mechanism to make unique URIs not because of their inherent beauty,
but because they allow anyone on almost any platform to create a unique ID without worrying about two
identical IDs being created.
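
As a small illustration of why UUIDs are convenient here, a random (version 4) UUID can be generated locally with no central registry. The sketch below mints a URI in the same shape as the example above, but under a placeholder example.org base; the base URL and function name are assumptions for illustration.

```python
import uuid

# Placeholder base; a real service would use its own domain.
BASE = "http://example.org/spec_concept_uuid/"


def mint_concept_uri(base: str = BASE) -> str:
    """Return a new URI ending in a freshly generated random UUID."""
    return f"{base}{uuid.uuid4()}/"


new_uri = mint_concept_uri()
```

Collisions between independently generated version 4 UUIDs are possible in principle but so unlikely that they can be ignored in practice.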

For those interested, I can send you a PDF of my talk slides and text.

I would also appreciate any feedback or suggestions :-)

- Pete

---------------------------------------------------------------
Pete DeVries
Department of Entomology
University of Wisconsin - Madison
445 Russell Laboratories
1630 Linden Drive
Madison, WI 53706
Email: pdevries at wisc.edu
Insects of Wisconsin: http://insects.entomology.wisc.edu/
Spiders of Wisconsin: http://spiders.entomology.wisc.edu/
------------------------------------------------------------

