Arthur Reutenauer wrote:

   This is exactly the problem that TeX's hyphenation algorithm was
developed for.  It's exactly as you write: you give a list of rules
describing where you can and you can't break words ("hyphenation
patterns") and TeX does the job of finding the "nicest" authorized break
for you.
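
As a rough illustration of that mechanism (a simplified sketch, not
TeX's actual implementation), pattern-based break finding can be written
in a few lines of Python.  The function names are hypothetical, and the
handful of English-style patterns is chosen only to make the example
work.  Digits in a pattern score the gap they sit in; where patterns
overlap the highest score wins; an odd score permits a break and an even
score forbids it.

```python
def parse_pattern(pattern):
    """Split a pattern such as 'n2at' into its letters ('nat') and the
    score for each gap around those letters ([0, 2, 0, 0])."""
    letters = []
    values = [0]
    for ch in pattern:
        if ch.isdigit():
            values[-1] = int(ch)      # digit scores the current gap
        else:
            letters.append(ch)
            values.append(0)          # open a new gap after this letter
    return "".join(letters), values


def break_points(word, patterns, left_min=2, right_min=3):
    """Return the positions k such that a break is allowed after the
    first k letters of word.  left_min and right_min play the role of
    TeX's \\lefthyphenmin and \\righthyphenmin."""
    padded = "." + word.lower() + "."  # '.' marks the word boundaries
    scores = [0] * (len(padded) + 1)   # scores[i]: gap before padded[i]
    for pattern in patterns:
        letters, values = parse_pattern(pattern)
        start = padded.find(letters)
        while start != -1:
            for offset, value in enumerate(values):
                index = start + offset
                scores[index] = max(scores[index], value)
            start = padded.find(letters, start + 1)
    # The gap after k letters of word is the gap before padded[k + 1].
    return [k for k in range(left_min, len(word) - right_min + 1)
            if scores[k + 1] % 2 == 1]


def hyphenate(word, patterns):
    """Join the pieces of word with '-' at every allowed break."""
    pieces, previous = [], 0
    for k in break_points(word, patterns):
        pieces.append(word[previous:k])
        previous = k
    pieces.append(word[previous:])
    return "-".join(pieces)


# Illustrative patterns only; a real pattern file has thousands.
PATTERNS = ["hy3ph", "he2n", "hena4", "hen5at",
            "1na", "n2at", "1tio", "2io"]

print(hyphenate("hyphenation", PATTERNS))  # hy-phen-ation
```

The list returned by break_points is independent of what gets inserted
at each break, so the same lookup could in principle mark permitted
break positions without adding a visible hyphen.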

   Together with Mojca Miklavec, I'm responsible for maintaining the
hyphenation patterns in TeX Live; if you can describe the rules more
precisely we can add patterns for Lao, Thai and Khmer to the set we
already have (which is already quite big, coming from several dozen
contributors all over the world).  Mojca added patterns for all the
major languages of India last month but we have no languages from
South-East Asia yet.  I've always understood that the word-breaking
rules are very different from those of other languages but I suppose
the same mechanism could be adapted; you only need to bring the
linguistic knowledge!

I agree with your analysis (and thought much the same), but
there is a complication: by default, TeX breaks lines only at
spaces unless it hyphenates a word.  What I understand from
Brian's original message (Brian: please correct me if I am wrong)
is that Lao breaks between character pairs rather than at spaces,
and that no hyphenation occurs, which made it a fascinating
challenge and well worthy of attention :-)

** Phil.


