[XeTeX] [OT] character encoding ideas [was: Re: Strange hyphenation with polyglossia in French]

Jonathan Kew Thu, 21 Oct 2010 04:04:13 -0700

On 21 Oct 2010, at 11:29, Philip Taylor (Webmaster, Ret'd) wrote:
> 
> As to why different planes for different languages (or dialects),
> there are many reasons, of which (for me) the two most important
> are : (1) all characters required for a single language would form
> a contiguous cluster within the character set; and (2) any text encoded
> using this system would automatically carry with it implicit <language>
> (or <language:dialect>) tags for every stretch of text, no matter
> how long or how short.


Sorry, Phil, but I don't think such a scheme would be even remotely workable. 
Aside from the sheer number of such "planes" that would have to be defined and 
supported (have you browsed http://www.ethnologue.com/ lately? And that's just 
for the living languages...), it would be utterly impossible to reach any 
consensus regarding where the dividing lines should be drawn, and we'd have a 
massive increase in acrimonious political debates regarding linguistic and 
cultural identity.

Unicode may be far from perfect, representing as it does a compromise between 
many often-conflicting requirements, but it's a more reasonable approach than 
that.

JK




