Just a quick heads up:

In general, locale codes (the Lang= field of .conf files) can carry subtags that indicate region, script, etc. Ideally front ends should handle these in some fashion, since they mark distinctions that matter (in the eyes of the module maker or publisher, at least).

When unknown subtags are encountered, it's probably best to recursively fall back to the tag minus its right-most subtag. For example, if 'en-Latn-US' is unknown, fall back to 'en-Latn'. If that is unknown, fall back to 'en'. (Hopefully nearly all language subtags are known.)
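
A minimal sketch of that fallback in Python, for illustration only. The names fallback_chain and resolve are hypothetical and not part of the SWORD API, and the set of known tags is assumed to come from whatever language table the front end already has:

```python
def fallback_chain(tag):
    """Yield the tag, then the tag minus its right-most subtag, and so on."""
    subtags = tag.split("-")
    while subtags:
        yield "-".join(subtags)
        subtags.pop()  # drop the right-most subtag for the next attempt


def resolve(tag, known):
    """Return the most specific tag found in `known`, or None if nothing matches."""
    for candidate in fallback_chain(tag):
        if candidate in known:
            return candidate
    return None


# Hypothetical table of tags the front end knows about.
known_tags = {"en", "ur", "ur-Deva"}

print(list(fallback_chain("en-Latn-US")))  # ['en-Latn-US', 'en-Latn', 'en']
print(resolve("en-Latn-US", known_tags))   # en
```

This treats the tag as a plain hyphen-separated string and does no validation of the individual subtags.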

We should handle this in the library, but currently don't. :(


As a specific case in point:
We now have two Urdu modules. They contain the same translation and differ only in script (one is in Arabic script, the other in Devanagari). Their language codes (as of the 1.2.1 release just made, which corrected the code for the Devanagari version) are: ur (Urdu in Arabic script--the usual script for Urdu) and ur-Deva (Urdu in Devanagari script).

Possible behaviors are to categorize the ur-Deva module as belonging to an unknown language (bad), to fall back and categorize it as simply Urdu (better, but certainly confusing if the language name is written in Arabic and the module is itself written in Devanagari), or to categorize it separately as Urdu written in Devanagari (best).

For implementers who localize the language name: in Arabic script, Urdu is written "اردو"; in Devanagari, it is written "उर्दू".
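
As a rough sketch of the "best" behavior for a front end that keeps its own table of localized language names (Python, illustrative only; LANGUAGE_NAMES and language_name are hypothetical names, not SWORD API, and the table holds just the two entries from this message):

```python
# Hypothetical table of localized language names, keyed by locale code.
LANGUAGE_NAMES = {
    "ur": "اردو",        # Urdu in Arabic script (the usual script)
    "ur-Deva": "उर्दू",   # Urdu in Devanagari script
}


def language_name(tag, names=LANGUAGE_NAMES):
    """Look up the most specific name for `tag`, dropping right-most subtags
    until a match is found. Returns None if even the bare language is unknown."""
    subtags = tag.split("-")
    while subtags:
        name = names.get("-".join(subtags))
        if name is not None:
            return name
        subtags.pop()
    return None


print(language_name("ur"))       # اردو
print(language_name("ur-Deva"))  # उर्दू
```

With an entry for ur-Deva in the table, the Devanagari module gets its own category; without one, the same lookup would fall back to plain ur.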

--Chris

_______________________________________________
sword-devel mailing list: sword-devel@crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page
