[Taxacom] Language tags for scientific names

Andy Mabbett andy at pigsonthewing.org.uk
Sat Jun 28 11:07:10 CDT 2008


In message <486648E4.2020703 at earthlink.net>, Curtis Clark 
<jcclark-lists at earthlink.net> writes

>On 2008-06-27 18:18, Donald.Hobern at csiro.au wrote:
>> 3. Will other software (outside our community) be able to do something
>> sensible with what we put in the tag?
>
>Besides the translation issue raised above, I've encountered two other 
>real-life examples recently:
>
>1. There's a mini-controversy on Wikipedia (isn't there always?) 
>concerning an editor who language-tags scientific names as la. He may 
>have begun doing this when I pointed out that his autocorrection of 
>Cornus florida to Cornus Florida was an error. A number of other 
>editors have made the same point as people on the Taxacom thread, that 
>scientific names are better regarded as elements within the language of 
>the text that contains them, and are resistant to the idea of language 
>tagging. Some of the proposals made here would address that.

There is a case - a weak one - that "la" is the best 
/currently available/ language tag for taxonomic names; and the best 
fall-back if we are to use a multi-part pattern such as la-x-taxon (i.e. 
while not Latin, they are closer to Latin than to English, or German, or 
any other language [*]).
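
To make the fall-back idea concrete, here is a minimal sketch in Python 
of how software might read a tag such as la-x-taxon. It assumes the 
usual convention that subtags after a singleton "x" are private-use; 
the function names and the "taxon" subtag are illustrative only, not 
anything registered or agreed on this list.

```python
def split_language_tag(tag):
    """Split a tag such as 'la-x-taxon' into (primary, private-use subtags)."""
    subtags = tag.lower().split("-")
    primary = subtags[0]
    private = []
    # Everything after a singleton 'x' is treated as private-use (assumption).
    if "x" in subtags[1:]:
        x_index = subtags.index("x", 1)
        private = subtags[x_index + 1:]
    return primary, private


def is_taxon(tag):
    """True if the tag carries the hypothetical private-use 'taxon' subtag."""
    _, private = split_language_tag(tag)
    return "taxon" in private


# Software that knows nothing about 'taxon' can still fall back to 'la'.
primary, private = split_language_tag("la-x-taxon")
print(primary, private)
```

Software that ignores the private-use part would simply treat the name 
as Latin, which is the fall-back behaviour described above.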

I'd prefer to avoid that argument by having a more specific "pseudo-" 
language code at the highest level ;-)

>2. I have been working with a student who uses a screen reader, as part 
>of our accessible technology program. As has been pointed out, screen 
>readers shift voices and pronunciations when they encounter language 
>tags. This student is not a biologist, and doesn't often encounter 
>scientific names, but a simple thought experiment shows the issues.
>
>Consider the plant species Hebe trisepala. Pronounced by a screen 
>reader as English, it might come across as HEEB trice-PAY-luh. Tagged 
>as Latin, and assuming the screen reader had the ability to read Latin 
>(probably not common), it would be HAY-bay tree-SEH-pah-lah. Neither of 
>those is a good representation of the pronunciation in either American 
>or Commonwealth English.

You may be right, but that will depend entirely on the software in 
question, and how it is configured.

>A set of requirements for tagging scientific names needs to address 
>these real-world examples.

More so the former example; once an adequate and appropriate tag is 
available, it is for the purveyors of software to provide dictionaries 
and "voices" to make sense of the names so tagged; that's not something 
we're likely to achieve here.

At some point in the future, your student's software, reading a page 
otherwise in English, might change voice if it encounters some Latin 
text ("caveat emptor!"), but keep the same voice if it encounters a 
taxon, while using taxon-specific rules for pronunciation.
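
That behaviour could be sketched roughly as follows, in Python. This is 
only an illustration under stated assumptions: the function name, the 
"taxon" private-use subtag and the idea of returning a voice plus a 
rule set are all hypothetical, and no existing screen reader is claimed 
to work this way.

```python
def choose_speech_settings(page_lang, span_lang):
    """Return (voice, pronunciation_rules) for a tagged span within a page."""
    subtags = span_lang.lower().split("-")
    # Hypothetical convention: a private-use 'taxon' subtag marks a taxon name.
    if "x" in subtags and "taxon" in subtags[subtags.index("x") + 1:]:
        # Keep the page's voice, but apply taxon-specific pronunciation rules.
        return page_lang, "taxon"
    # Ordinary foreign-language text: switch voice and rules to that language.
    return subtags[0], subtags[0]


# Latin phrase in an English page: the voice changes.
print(choose_speech_settings("en", "la"))
# Taxon in an English page: same voice, different rules.
print(choose_speech_settings("en", "la-x-taxon"))
```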


[*] As a group, that is; there are individual exceptions, which are 
valid in Latin, or English, or other languages, or which have arguably 
become retrospectively added to a language, such as the many garden 
plants (like "Acer palmatum") lacking vernacular names, and so commonly 
referred to by their scientific names, in everyday English, by people 
who wouldn't recognise them as biological taxa.

-- 
Andy Mabbett



