[Taxacom] Data query
deepreef at bishopmuseum.org
Wed Jun 26 06:23:25 CDT 2013
Thanks, Tony --
[Note to non-data-nerds -- you might want to delete now.]
> One thing relevant to your explanation below: the species epithet is
> treated as a unique entity in zoology, yes, but in botany (most likely also in
> bacteriology, not sure) it is the original combination which is unique
> species plus its original genus placement, not the epithet alone.
Yes, but from the TNU perspective, this difference turns out to be
irrelevant for data modeling purposes. A subset of all TNUs represent
Nomenclatural Acts. Because the Codes are different, there are different
rules to determine which TNUs represent Code-governed nomenclatural acts.
Under the botanical Code, TNUs representing the first combination of an
epithet and a genus are included within the subset of TNUs that are
nomenclatural acts; whereas under the zoological Code, they are not. This
is just one of many differences in the application of nomenclatural acts to
TNUs; but the important point is that the TNUs themselves are the same
regardless. Several years and many NOMINA meetings -- with zoology nerds,
botanical nerds, a few bacterial nerds, and yes, even one particularly
outspoken mycological nerd -- have led to a better understanding of how to
model data from a TNU perspective in a way that is Code-agnostic (or,
rather, accommodates the needs of all the Codes). One nice thing about this
approach is that it allows the same TNU to represent different nomenclatural
acts under different Codes in the case of ambiregnal names.
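
A minimal sketch of that idea in Python. Every name here (Tnu, is_nomenclatural_act, the flag fields) is hypothetical and the rule is heavily simplified; it is not the actual GNUB schema. It only illustrates that the TNU records stay the same while each Code selects a different subset of them as nomenclatural acts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tnu:
    """One Taxon Name Usage; the fields are illustrative assumptions."""
    tnu_id: int
    genus: str
    epithet: str
    is_original_description: bool  # first establishment of the epithet
    is_first_combination: bool     # first pairing of this epithet with this genus


def is_nomenclatural_act(tnu: Tnu, code: str) -> bool:
    """Simplified, assumed rule: original descriptions count under both Codes;
    a first combination in a new genus counts only under ICNafp."""
    if tnu.is_original_description:
        return True
    if code == "ICNafp":
        return tnu.is_first_combination
    return False


# The same three TNUs, regardless of which Code is looking at them.
tnus = [
    Tnu(1, "Aus", "bus", True, True),     # original description in Aus
    Tnu(2, "Xus", "bus", False, True),    # first transfer to Xus
    Tnu(3, "Xus", "bus", False, False),   # a later usage of the same combination
]

# Only the subset flagged as acts differs from Code to Code.
acts = {
    code: [t.tnu_id for t in tnus if is_nomenclatural_act(t, code)]
    for code in ("ICZN", "ICNafp")
}
```

In this sketch TNU 2 is an act under ICNafp but not under ICZN, which is the situation an ambiregnal name would produce, and no TNU needed a second identifier.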
> If it is moved
> to a different genus a new combination is established with a new
> (unlike in zoology), with the old one in brackets. In other words now we
> have a new "thingy" (needing a new ID) which has a relationship to the original
> "thingy" but is not the same.
No new ID necessary. All of it happens via TNUs, and each TNU has its own
unique identifier. All that differs is which TNUs represent Code-governed
acts under which Codes. The issue with comb. nov. under ICNafp is only one
example. In other words, all of the "thingies" are TNUs. The issue is
about which subset of those thingies carry Code-governed nomenclatural
acts, and which do not. The rules differ between the Codes, but the
fundamental TNUs remain the same in any case.
> Forgive me if I have this wrong but I think you
> may need to encode slightly different rules for botanical and zoological
> in GNUB at least (not required in ZooBank for obvious reasons).
Yes, absolutely! But the new GNUB model (which was hammered out over
several NOMINA meetings) puts a nomenclatural layer on top of the TNUs. We
call this layer the "Nomenclatural Event" layer, and as I described
above, while each Code may have its own set of Nomenclatural Events, this
layer applies to the same underlying TNU structure for all Codes. Nothing
in my previous email
touched on that layer -- that would require a MUCH longer email to explain.
The way it's implemented, you can define a virtually unlimited number of
rules through an unlimited number of different nomenclatural codes. In
other words, it's open-ended in its design, so if a new Code comes along, or
if an existing Code changes, you just need to define new Nomenclatural
Events (i.e., no new tables or fields required; only new records in existing
tables and fields).
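
The "new records, not new tables" point can be sketched with an in-memory SQLite database. The table and column names below (code, event_type, nomenclatural_event) and the event labels are assumptions made up for illustration; the real GNUB tables may look quite different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE code (code_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE event_type (
    event_type_id INTEGER PRIMARY KEY,
    code_id INTEGER REFERENCES code(code_id),
    label TEXT
);
CREATE TABLE nomenclatural_event (
    event_id INTEGER PRIMARY KEY,
    tnu_id INTEGER,   -- points at a TNU; the TNU table itself is omitted here
    event_type_id INTEGER REFERENCES event_type(event_type_id)
);
""")


def add_code(name, event_labels):
    """Register a Code and its event types purely by inserting rows."""
    code_id = conn.execute("INSERT INTO code (name) VALUES (?)", (name,)).lastrowid
    conn.executemany(
        "INSERT INTO event_type (code_id, label) VALUES (?, ?)",
        [(code_id, label) for label in event_labels],
    )
    return code_id


def table_names():
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


add_code("ICZN", ["original description"])
add_code("ICNafp", ["original description", "new combination"])
tables_before = table_names()

# A new Code comes along: only rows are added, the schema is untouched.
add_code("HypotheticalCode", ["some new kind of act"])
tables_after = table_names()

event_type_count = conn.execute("SELECT COUNT(*) FROM event_type").fetchone()[0]
```

Comparing tables_before with tables_after shows the schema is identical after the third Code is registered; the only change is one more row in code and one more in event_type.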
> I imagine
> those on the GNUB-constructing board / advisory panel / whatever from the
> botanical sphere will be across this, but it is not apparent from my
> what you have written below (I think).
What I wrote in the previous post is a data model for TNUs. As I said, the
TNUs are the same across Codes. The differences between the Codes are
applied at the Nomenclatural Event layer. Simple algorithms take care of
formatting the bits together in a Code-appropriate way when presenting to a
human. For example, names seen through the zoological Nomenclatural Event
layer are formatted as "Aus bus (Linnaeus, 1758)", whereas an analogous name
under ICNafp governance would be rendered as "Aus bus (L.) Jones". The
underlying data are the same in both cases -- it's just an issue for the
algorithm operating at the presentation layer.
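
A sketch of what such a presentation-layer routine might look like. The NameData fields and the format_name function are hypothetical, and the sketch assumes a name that has been moved out of its original genus (hence the parentheses in both renderings); it is not the actual GNUB formatting code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NameData:
    """The same underlying data, whichever Code is used to render it."""
    genus: str
    epithet: str
    original_author: str          # author of the epithet
    original_author_abbrev: str   # assumed to be stored for botanical display
    original_year: int
    combining_author: str         # author who made the current combination


def format_name(name: NameData, code: str) -> str:
    """Render one name according to the conventions of the given Code."""
    binomial = f"{name.genus} {name.epithet}"
    if code == "ICZN":
        # Zoology: original author and year in parentheses.
        return f"{binomial} ({name.original_author}, {name.original_year})"
    if code == "ICNafp":
        # Botany: original author in parentheses, then the combining author.
        return f"{binomial} ({name.original_author_abbrev}) {name.combining_author}"
    raise ValueError(f"no formatting rule defined for Code {code!r}")


name = NameData("Aus", "bus", "Linnaeus", "L.", 1758, "Jones")
zoological = format_name(name, "ICZN")    # "Aus bus (Linnaeus, 1758)"
botanical = format_name(name, "ICNafp")   # "Aus bus (L.) Jones"
```

Both strings come from the one NameData record; only the Code passed to format_name changes.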
> Best - Tony (only mildly confused at this point, may get worse).
Don't feel bad -- it took me the better part of my PhD years to get my head
around it enough to come up with the Taxonomer data model, and then it took
a number of years and NOMINA meetings to generalize it across Codes. The
good news is that, the more we implement it with real world data, the more
confident I am that we're on the right track.