[Taxacom] Language tags for scientific names
andy at pigsonthewing.org.uk
Fri Jun 27 18:46:58 CDT 2008
In message <5ebbead70806271530o4f73358do204a7179445ec51e at mail.gmail.com>, Gregor
Hagedorn <g.m.hagedorn at gmail.com> writes
>The point of the proposal to use xml:lang is to offer a general way to
>denote certain passages of free-form text or structured text elements
>as general as possible.
xml:lang is one use for an IETF-language tag; but the same tags can also
be used in other cases, outside XML, such as the "lang" attribute in
>It is possible to alternatively agree on a microformat (using the class
>attribute) for xhtml,
For clarity, microformats can be used in HTML, not just XHTML.
>to quote from rfc 4646:
> de-CH-1901 (German as used in Switzerland using the 1901 variant
> [orthography])
> sl-IT-nedis (Slovenian as used in Italy, Nadiza dialect)
The difference between those two language tags and recent proposals for
including ICBN and the like is that they "degrade gracefully", through a
hierarchy. For instance, if a parser does not understand "de-CH-1901",
it will fall back to using "de-CH", and if it doesn't understand that,
This will work for, say, tx-QQ-ICBN ("tx" instead of "tax"; whatever
"QQ" might be), and for zxx-TX-ICBN, but only to a very limited degree
for zxx-x-ICBN/ zxx-x-TAX or zxx-ICBN/ zxx-TAX.
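To make the fallback behaviour concrete, here is a rough sketch in Python of the truncation-style lookup described in RFC 4647 (the matching companion to RFC 4646). The function names are my own invention, "tx" and "QQ" are only the hypothetical subtags discussed above, and real parsers may well behave differently; treat it as an illustration, not a reference implementation.

```python
def fallback_chain(tag):
    """Return the tag followed by progressively shorter prefixes.

    Hypothetical helper: a simplified take on RFC 4647 "lookup",
    which removes subtags from the end one at a time.
    """
    subtags = tag.split("-")
    chain = []
    while subtags:
        chain.append("-".join(subtags))
        subtags.pop()
        # The lookup rules also drop a trailing single-character subtag
        # (such as the private-use "x") left behind by truncation.
        while subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return chain


def best_match(tag, understood):
    """Return the first entry in the chain that the parser understands."""
    known = {u.lower() for u in understood}
    for candidate in fallback_chain(tag):
        if candidate.lower() in known:
            return candidate
    return None


print(fallback_chain("de-CH-1901"))
print(fallback_chain("tx-QQ-ICBN"))
print(fallback_chain("zxx-x-ICBN"))
```

Under these assumptions the private-use form drops straight from "zxx-x-ICBN" to "zxx", with no intermediate level that says "this is a taxonomic name", whereas "tx-QQ-ICBN" still passes through "tx-QQ" and "tx".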
>It would be a community decision to use language tags as one of several
Indeed - isn't this the community debate about which tag(s) should be
used? Though the IETF-languages community is also part of the wider
community, which will make the final choice.
>Using the private use areas (everything after an "-x-" subtag) we could
>start without even registering. But it would be painful if everyone
>uses different private notations.
Quite - that's exactly why I decided against using -x-, after due
deliberation. I think taxonomic names are used too widely, and by too
many publishers, to take that road.
Perhaps we are now ready to start to draw up a list of requirements,
which will no doubt need further tweaking:
1 Valid according to RFC 4646
2 Acceptable to IETF-languages community
3 Hierarchical (there must be a parent, generic "taxonomy" level
above any levels for specific codes)
4 Sub-tags for individual codes (if deemed appropriate)
5 Using -x- only as a last resort
My personal view is that (4) is unnecessary, and unlikely to succeed in
meeting (2), but I have worded this list as neutrally as possible. Would
anyone like to add to, or change, it?
[Apologies if I've misconstrued any of your points; it would be easier
to avoid doing so if you would kindly quote some of the post to which
you are replying, to give context. Thank you.]