[Taxacom] Language tags for scientific names

Gregor Hagedorn g.m.hagedorn at gmail.com
Fri Jun 27 20:49:46 CDT 2008

Donald writes:
> 1. What do we (as the interested community) really want to represent by
> such codes?  When we want to give additional information about a
> particular scientific name (including e.g. code), why don't we just use
> proprietary tags (like the TDWG RDF TaxonName properties)?

a) Because current RDF technology does not seem to support markup of
text in publications (XHTML, PDF, etc.) very well. This issue is not
primarily about cases where the scientific name is an atomized piece
of data.

b) Because RDF is limited to RDF-based applications like XMP, and
cannot be extended to more general XML Schema-based data exchange,
including EML, SDD or TaxonX.

Please correct me if I am wrong; I may not be up to date!

> 2. Is this compatible with the intended use of the language tag?  If
> something is a proper name with no translation into different languages,
> what would ISO expect?  Is the language code in any way appropriate in
> such a case?

zxx is a standard ISO and IETF code for this case; nothing new is being
proposed here.

zxx is especially important for binary media objects: many images are
zxx, but some may be language specific.

About processing: since xml:lang is part of the XML specification, any
XML processor knows how to handle it at a low level.
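
As an illustration of that low-level handling, here is a minimal sketch
using Python's standard xml.etree.ElementTree module. It assumes a small
made-up document (the <doc> wrapper and its content are hypothetical) and
shows that xml:lang arrives as an ordinary namespaced attribute whose
value is inherited by child elements unless they override it.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def effective_langs(elem, inherited=None):
    """Yield (tag, language) pairs, applying xml:lang inheritance."""
    lang = elem.get(XML_LANG, inherited)
    yield elem.tag, lang
    for child in elem:
        yield from effective_langs(child, lang)

# Hypothetical sample document, not taken from any real publication.
doc = ET.fromstring(
    '<doc xml:lang="de">'
    '<p>Text <span xml:lang="zxx">Rhododendron maximum</span></p>'
    '<p xml:lang="la">in Germania</p>'
    '</doc>'
)
result = list(effective_langs(doc))
```

The first <p> carries no tag of its own and so reports the "de" of its
parent, while the <span> and the second <p> report their own values.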

> I believe it is an abuse of the language code to use it to identify the
> nomenclatural code, and something which may alienate non-taxonomic
> users.  A single code, (null, "zxx", "tax" or something) would encourage
> consistent interpretation better, but I still think we are
> misunderstanding the point of the language tag.  Surely its primary role
> is in situations in which there may be different versions of some
> content in different languages and software may choose the most
> appropriate for a given user.  This is not the case here and I suspect
> the ISO recommendation would be to use "zxx" or nothing at all.

I have no problem if a consensus emerges that the nomenclatural codes
should not be considered here.

I disagree with your last point. Language tags are important for many
purposes, including presentation on screen or for screen readers, and
spell checking of documents with mixed language use (most people have to
use several languages in parallel rather than choosing one). When
scientific organism names are used in Chinese, Arabic, or Hebrew texts,
appropriate language tags would inform software about script, reading
direction, etc.
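
To make the spell-checking case concrete, here is a rough sketch (again
Python with xml.etree.ElementTree; the function name text_by_language and
the sample paragraph are my own assumptions). It groups the text runs of
a mixed-language paragraph by their effective xml:lang, so that each run
could be handed to the matching dictionary and anything tagged zxx could
be skipped.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def text_by_language(elem, inherited=None, out=None):
    """Group text runs by effective xml:lang (hypothetical helper)."""
    if out is None:
        out = defaultdict(list)
    lang = elem.get(XML_LANG, inherited)
    if elem.text and elem.text.strip():
        out[lang].append(elem.text.strip())
    for child in elem:
        text_by_language(child, lang, out)
        # Text following a child's closing tag is in the parent's language.
        if child.tail and child.tail.strip():
            out[lang].append(child.tail.strip())
    return out

para = ET.fromstring(
    '<p xml:lang="de">isoliert aus '
    '<span xml:lang="zxx">Rhododendron maximum</span>'
    ' in Deutschland</p>'
)
runs = text_by_language(para)
# Runs tagged zxx are left out of spell checking altogether.
to_spellcheck = {lang: parts for lang, parts in runs.items() if lang != "zxx"}
```

Only the German runs remain in to_spellcheck; the scientific name is kept
separately under "zxx" and never reaches a German dictionary.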


Examples of a free-form text description of a species as it might be published:

<p xml:lang="la">conidia singula, terminalia ... ex <span
xml:lang="zxx-taxon">Rhododendron maximum</span> in Germania</p>
<p xml:lang="de">Konidien einzeln, endständig, isoliert aus <span
xml:lang="zxx-taxon">Rhododendron maximum</span> in Deutschland</p>
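
A sketch of how such markup could be consumed, assuming Python's
xml.etree.ElementTree and a hypothetical <body> wrapper around the two
paragraphs above (the wrapper and the function name taxon_names are not
part of the proposal): it collects every span tagged zxx-taxon together
with the language of the paragraph it appears in.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# The two example paragraphs, wrapped in an assumed <body> root element.
SAMPLE = """<body>
<p xml:lang="la">conidia singula, terminalia ... ex <span
xml:lang="zxx-taxon">Rhododendron maximum</span> in Germania</p>
<p xml:lang="de">Konidien einzeln, endständig, isoliert aus <span
xml:lang="zxx-taxon">Rhododendron maximum</span> in Deutschland</p>
</body>"""

def taxon_names(root):
    """Return (paragraph language, name) for each span tagged zxx-taxon."""
    found = []
    for para in root.iter("p"):
        for span in para.iter("span"):
            if span.get(XML_LANG) == "zxx-taxon":
                found.append((para.get(XML_LANG), span.text))
    return found

names = taxon_names(ET.fromstring(SAMPLE))
```

The same name is found in both the Latin and the German paragraph,
independent of the language of the surrounding text.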


