[Taxacom] Language tags for scientific names
Andy Mabbett
andy at pigsonthewing.org.uk
Sat Jun 28 11:07:10 CDT 2008
In message <486648E4.2020703 at earthlink.net>, Curtis Clark
<jcclark-lists at earthlink.net> writes
>On 2008-06-27 18:18, Donald.Hobern at csiro.au wrote:
>> 3. Will other software (outside our community) be able to do something
>> sensible with what we put in the tag?
>
>Besides the translation issue raised above, I've encountered two other
>real-life examples recently:
>
>1. There's a mini-controversy on Wikipedia (isn't there always?)
>concerning an editor who language-tags scientific names as la. He may
>have begun doing this when I pointed out that his autocorrection of
>Cornus florida to Cornus Florida was an error. A number of other
>editors have made the same point as people on the Taxacom thread, that
>scientific names are better regarded as elements within the language of
>the text that contains them, and are resistant to the idea of language
>tagging. Some of the proposals made here would address that.
There is a case - a weak one - that "la" is the best
/currently available/ language tag for taxonomic names, and the best
fall-back if we are to use a multi-part pattern such as la-x-taxon (i.e.
while not Latin, they are closer to Latin than to English, or German, or
any other language [*]).
I'd prefer to avoid that argument by having a more specific "pseudo-"
language code at the highest level ;-)
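To illustrate the fall-back idea for a pattern such as la-x-taxon, here
is a minimal sketch of how software that knows nothing about a "taxon"
subtag might still end up treating the text as "la". It truncates the
tag from the right, in the spirit of the lookup scheme in RFC 4647. The
function names and the sets of supported tags are made up for this
example; they are not taken from any real screen reader or library.

```python
def fallback_chain(tag):
    """List a tag followed by its progressively truncated forms."""
    subtags = tag.lower().split("-")
    chain = []
    while subtags:
        chain.append("-".join(subtags))
        subtags.pop()
        # A trailing single-character subtag such as "x" only introduces
        # what follows it, so drop it along with the subtag just removed.
        while subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return chain


def pick_supported(tag, supported):
    """Return the first form of the tag the software supports, or None."""
    for candidate in fallback_chain(tag):
        if candidate in supported:
            return candidate
    return None


# Hypothetical software that supports only English and Latin:
print(fallback_chain("la-x-taxon"))
print(pick_supported("la-x-taxon", {"en", "la"}))
```

Software that did understand the full tag would simply list
"la-x-taxon" among its supported tags and match it before falling back.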
>2. I have been working with a student who uses a screen reader, as part
>of our accessible technology program. As has been pointed out, screen
>readers shift voices and pronunciations when they encounter language
>tags. This student is not a biologist, and doesn't often encounter
>scientific names, but a simple thought experiment shows the issues.
>
>Consider the plant species Hebe trisepala. Pronounced by a screen
>reader as English, it might come across as HEEB trice-PAY-luh. Tagged
>as Latin, and assuming the screen reader had the ability to read Latin
>(probably not common), it would be HAY-bay tree-SEH-pah-lah. Neither of
>those is a good representation of the pronunciation in either American
>or Commonwealth English.
You may be right, but that will depend entirely on the software in
question, and how it is configured.
>A set of requirements for tagging scientific names needs to address
>these real-world examples.
More so the former example; once an adequate and appropriate tag is
available, it is for the purveyors of software to provide dictionaries
and "voices" to make sense of them; that's not something we're likely to
achieve here.
At some point in the future, your student's software, reading a page
otherwise in English, might change voice if it encounters some Latin
text ("caveat emptor!"), but keep the same voice if it encounters a
taxon, while using taxon-specific rules for pronunciation.
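As a rough sketch of that behaviour, assuming taxa were marked with a
tag ending in "-x-taxon" (the pattern floated in this thread, not an
agreed convention), the decision might look something like this. The
function name and the idea of returning a (voice, rules) pair are
invented for illustration; no existing screen reader is known to work
this way.

```python
TAXON_SUFFIX = "-x-taxon"  # hypothetical marker, as discussed in this thread


def speech_settings(page_lang, span_lang):
    """Return (voice, pronunciation_rules) for a tagged span of text."""
    span = span_lang.lower()
    if span.endswith(TAXON_SUFFIX):
        # A taxon: keep the voice of the surrounding page, but switch
        # to taxon-specific pronunciation rules.
        return (page_lang.lower(), "taxon")
    # Any other tagged span: change both voice and rules to that language.
    return (span, span)


print(speech_settings("en", "la"))          # Latin text such as "caveat emptor!"
print(speech_settings("en", "la-x-taxon"))  # a taxon such as Hebe trisepala
```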
[*] As a group, that is; there are individual exceptions, which are
valid in Latin, or English, or other languages, or which have arguably
been added retrospectively to a language, such as the many garden
plants (like "Acer palmatum") lacking vernacular names, and so commonly
referred to by their scientific names, in everyday English, by people
who wouldn't recognise them as biological taxa.
--
Andy Mabbett