On 8/5/15 12:54, David Carlisle wrote:
> On 8 May 2015 at 11:45, Adam Twardoch (List) <list.a...@twardoch.com> wrote:
>> In modern text processing (Unicode+OpenType), a text run is a series of
>> characters with the same formatting (font, size, color etc.), directionality
>> (ltr, rtl) and script (writing system such as Latin, Greek, Arabic or Gujarati).
>
> er, yes, quite:-)
>
> so my question is why did this X trigger the 255 boundary for end of
> text run processing.


Well... it's been a long time since I touched any of this, but let's see if we can figure it out. I don't suppose it's clearly spelled out anywhere just what such a "run" is for \XeTeXinterchartoks purposes.

From looking at the code, it appears that xetex sets the "boundary" state at the beginning of a token list inserted by \XeTeXinterchartoks. If, as in your example, that token list doesn't contain any characters that cause the current state to change, then the following character will be treated as adjacent to that boundary. That is why it then sees the "255 \Xclass" transition on (re-)encountering the X after processing the "0 \Xclass" insertion.

So the lesson seems to be that if you're going to provide interchartoks for a <boundary><something> transition, whatever you insert had better cause a change in the current class -- i.e. insert an actual character of some kind -- otherwise you're headed for a loop. If what you want to do here doesn't involve inserting text, then you probably want to locally disable interchartoks processing. E.g. if you modify your example to say something like

  \def\zza{\begingroup\XeTeXinterchartokenstate=0 \futurelet\tmpa\zzza}
  \def\zzza#1{#1\show\tmpa\endgroup}

then you'll get \zza executed once, showing \tmpa as expected, but your \zzb never gets hit.
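
For anyone wanting to try this, here is a minimal sketch of a complete plain XeTeX test file built around those two definitions. It is a reconstruction under assumptions, not the original example: the class number 4 for \Xclass, the body of \zzb, and the test input "aX" are all made up for illustration. The boundary class is written as 255 to match the discussion above; later XeTeX releases are believed to use 4095 instead, so that number may need adjusting.

  % Assumed setup: class 4 is an arbitrary choice that is unused by default
  \chardef\Xclass=4
  \XeTeXcharclass`\X=\Xclass
  % The two definitions from above: switch interchartoks off locally,
  % peek at the next token, put it back, then show it and close the group
  \def\zza{\begingroup\XeTeXinterchartokenstate=0 \futurelet\tmpa\zzza}
  \def\zzza#1{#1\show\tmpa\endgroup}
  % Hypothetical stand-in for the original \zzb: just report that it ran
  \def\zzb{\message{[boundary -> X]}}
  \XeTeXinterchartoks 0 \Xclass = {\zza}
  \XeTeXinterchartoks 255 \Xclass = {\zzb}
  \XeTeXinterchartokenstate=1
  aX
  \bye

Note that \show stops for interaction unless the run is in a non-stopping mode. If the behaviour is as described above, the \zzb message should not appear in the log.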

JK



