[Taxacom] taxonomic names databases

Nico Franz nico.franz at asu.edu
Tue Sep 6 10:21:33 CDT 2016


Thank you, Rod.

   The poor communication is on me, not you. Also, I am taking advantage of
this being a forum where one can perhaps be a bit jarring to test out ideas
and move forward based on reactions.

   My ultimate purpose is not to single out persons. It's just that lately
I am fighting inside myself the notion that certain aspects of
"biodiversity informatics" do not serve the individual, career-building
taxonomist as well as they can and should. For instance, the notion that
taxonomy and phylogenetics are "separate things". Or that we need a single
"synthesis" of our current biodiversity knowledge in order to navigate
through data. I think both notions are false, and also unnecessarily
detrimental to the taxonomic agenda. I think science funding panels, in
some of which I personally partake on occasion, are a bit drunk on the
whole notion of "synthesis". More so than synthesis, we need provenance of
conflicting views, and assessment tools for robustness on inferences given
these conflicting views. But synthesizing conflict away is short-sighted;
instead we need to embrace it. If "users demand" one tree, well then we can
argue why that demand is not a sound reflection of our science and
scientific business plan, and provide other means for users to relevantly
meet goals.

   I also seem to observe a bit of an unfortunate separation between the
TDWG community and, say, the folks who more frequently attend Evolution
Meetings. And to me, not enough biodiversity informatics services seem to
put individual taxonomists at the very top of their hierarchy of
contributors to serve. (Certain services clearly do: ScratchPads, Pensoft,
name/nomenclatural services, etc.) Maybe it's just that my feelings are
hurt, but I also suspect there are things to learn here, more soberly.

   Your summary is very reasonable, thanks. Some clarifications.

   With "deflationary" I mean: saying that a certain practice or technology
is not that meaningful or impactful. "We are just organizing things." "It
is just a tool for navigation". "We are just synthesizing the data that are
currently out there; we take what people give us". "We are just doing what
the users want and need". "Classification does not matter very much".

   I am trying to causally connect this kind of thinking to other design
features of aggregating biodiversity data services. And ultimately to a
common but perhaps unfair or mistaken perception regarding trust in data
provided by these services.

   Further clarifications.

   No, GBIF does not build one classification. GBIF (and many other
services; I don't mean to be so specific) continues to build a chain of
multiple versions, except the chain's versions are not well connected
semantically. Each version is being used at the time to do other science
with, of course.

   Clearly, in providing "the synthesis", GBIF (and others) does create
some taxonomies whose structure is best and only accreditable to these
"sources". An example that I am deeply familiar with is this:
https://tree.opentreeoflife.org/taxonomy/browse?id=211889

   After working up about 20 recent belid trees and classifications in the
Euler/X tool, and aligning them all, this is the one tree that stumped me.
In the primary literature, there are at least two largely
non-intersecting/self-propagating lineages of belid tree making (a bit of an
east versus west story). This one is/was different, and not (by me)
attributable (even in parts) to any actual author. That is cherry picking,
but I claim that it is also a design feature.

   So, yes to your summary of my current thoughts. In biodiversity
informatics we (often, by no means always) seem to have bought into certain
design rationales and paradigms that affect trust in data. This design
package tends to include a purported need for synthesis, which is a
misnomer when viewed over time and often also tends to counteract good
provenance tracking between versions. And the package tends to eliminate
exposure of conflict and uncertainty which in turn are pillars of the
taxonomic enterprise. And it comes packaged with a notion that these are
"just technical, operational needs" to meet the demands of users at scale.

   Maybe this package needs to be looked into more. Personally, I want us
taxonomists to be aware, and to play the best role we can in getting good
services. I am trying to point in all directions.

Best, Nico


On Tue, Sep 6, 2016 at 4:25 AM, Roderic Page <Roderic.Page at glasgow.ac.uk>
wrote:

> Hi Nico,
>
> So, I’m trying to parse this paragraph into something I can act upon. Phrases
> like “deflationary stance”, “exclusion of that heterogenous community”,
> and “honor the notion that expertise is personalized” are *cough*, perhaps
> less than crystal clear (or I’m being lazy).
>
> GBIF consumes a bunch of mutually inconsistent classifications and/or
> lists of names; these classifications and lists are rarely connected to
> evidence (for example, few cite the taxonomic literature supporting each
> name, hardly any provide something useful such as a DOI for an article).
>
> GBIF then applies a bunch of techniques to try and synthesise a single
> classification from this input, so that users (the majority of whom don’t
> care at all about taxonomic niceties) can navigate the data. These
> techniques have author(s) (mostly Markus Döring at GBIF), the code is open,
> and its development is public for all to see. It is, however, often hard
> for an outsider to work out how conflicts are resolved, or how some obvious
> errors have come about (e.g., http://dev.gbif.org/issues/browse/PF-2600 ).
>
> If I understand your concerns correctly, they are:
>
> 1. GBIF builds a classification that may create new relationships not
> explicitly mentioned in the taxonomic literature (“novel theory making”).
> If GBIF were to claim that it simply takes what people give it and the
> synthesis doesn’t, of itself, create anything new, this would be a
> “deflationary stance”. To my knowledge GBIF doesn’t claim this, indeed, one
> of the goals of synthesis is to generate something more than a simple
> aggregation of things.
>
> 2. GBIF builds ONE classification (albeit one that evolves over time). Not
> everybody may agree with that classification (the “heterogenous
> community”). Note that GBIF links to all the input classifications, so you
> can still browse them. But yes, there is one “GBIF” viewpoint.
>
> 3. It is hard to go from the GBIF classification to the expertise that
> generated the names, lists, and classifications that are ultimately
> incorporated into that classification. If it were possible to do this, that
> could increase the level of trust people might have, and the willingness of
> experts to engage with the process of assembling the GBIF classification.
>
> Is this a reasonable summary?
>
> Regards,
>
> Rod
>
>
> On 2 Sep 2016, at 15:48, Nico Franz <nico.franz at asu.edu> wrote:
>
>   Of course not all will agree with this view. But I think it is a
> plausible position *for a taxonomist* to adopt. And that may mean that,
> regardless of how certain aggregators prefer to perceive their activities
> as merely this or that, for a good section of the expert community there
> *is* a perception of novel theory making, and of novel theory making under
> a design paradigm that can work to the exclusion of that heterogenous
> community. A deflationary stance is not an effective way to work against
> that perception. Acknowledgement does not negate the great value of
> syntheses to some; instead I think it ultimately helps bring contributors,
> users, and quality/trust issues closer together.
>
>
> ---------------------------------------------------------
> Roderic Page
> Professor of Taxonomy
> Institute of Biodiversity, Animal Health and Comparative Medicine
> College of Medical, Veterinary and Life Sciences
> Graham Kerr Building
> University of Glasgow
> Glasgow G12 8QQ, UK
>
> Email:  Roderic.Page at glasgow.ac.uk
> Tel:  +44 141 330 4778
> Skype:  rdmpage
> Facebook:  http://www.facebook.com/rdmpage
> LinkedIn:  http://uk.linkedin.com/in/rdmpage
> Twitter:  http://twitter.com/rdmpage
> Blog:  http://iphylo.blogspot.com
> ORCID:  http://orcid.org/0000-0002-7101-9767
> Citations:  http://scholar.google.co.uk/citations?hl=en&user=4Z5WABAAAAAJ
> ResearchGate https://www.researchgate.net/profile/Roderic_Page
>
>
>


