On 8/5/15 12:54, David Carlisle wrote:
> On 8 May 2015 at 11:45, Adam Twardoch (List) <list.a...@twardoch.com> wrote:
>> In modern text processing (Unicode+OpenType), a text run is a series of
>> characters with the same formatting (font, size, color etc.), directionality
>> (ltr, rtl) and script (writing system such as Latin, Greek, Arabic or Gujarati).
>
> er, yes, quite:-)
> so my question is why did this X trigger the 255 boundary for end of
> text run processing.
Well... it's a long time since I touched any of this, but let's see if
we can figure it out. I don't suppose it's clearly spelled out anywhere
just what such a "run" is, for interchartoks purposes.
From looking at the code, it appears that xetex sets the "boundary"
state at the beginning of an interchartoks-inserted token list; and if
-- as in your example -- that token list doesn't contain any characters
that cause the current state to change, then the following character
will be treated as adjacent to that boundary. That is why it then sees
the "255 \Xclass" transition on (re-)encountering the X after processing
the "0 \Xclass" insertion.
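
To make the mechanism concrete, here is a minimal sketch of the kind of setup that would behave this way. It is a guess at the shape of the original example, not a copy of it: the class number 4, the name \boundaryclass, the bodies of \zza and \zzb, and the \hits guard counter are all assumptions made for illustration. The guard is only there so that the run terminates instead of looping. The value 255 is the boundary class as discussed in this thread; as far as I know, later XeTeX releases moved the boundary class to 4095, in which case \boundaryclass would need changing to see the effect.

```tex
% Plain XeTeX sketch (compile with: xetex). An assumed setup, not the original example.
\chardef\Xclass=4          % assumption: an otherwise unused class number
\chardef\boundaryclass=255 % boundary class as discussed in this thread
\XeTeXcharclass`\X=\Xclass % put the letter X into that class

\newcount\hits             % hypothetical guard counter, so the run terminates
\def\zza{\message{[0->X]}} % inserts no characters
\def\zzb{% inserts no characters either, so the boundary state is never left
  \global\advance\hits by 1
  \message{[boundary->X, hit \the\hits]}%
  \ifnum\hits>3 \XeTeXinterchartokenstate=0 \fi}% guard: switch the mechanism off

\XeTeXinterchartoks 0 \Xclass = {\zza}
\XeTeXinterchartoks \boundaryclass \Xclass = {\zzb}
\XeTeXinterchartokenstate=1

aXb
\bye
```

If the reading of the code described above is right, the terminal should show the 0-to-X message followed by the boundary-to-X message repeating until the guard disables interchartoks. Without the guard, the same input would be expected to loop.
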
So the lesson seems to be that if you're going to provide interchartoks
for a <boundary><something> transition, whatever you insert had better
cause a change in the current class -- i.e. insert an actual character
of some kind -- otherwise you're headed for a loop. If what you want to
do here doesn't involve inserting text, then you probably want to
locally disable interchartoks processing. E.g. if you modify your
example to say something like

    \def\zza{\begingroup\XeTeXinterchartokenstate=0 \futurelet\tmpa\zzza}
    \def\zzza#1{#1\show\tmpa\endgroup}

then you'll get \zza executed once, showing \tmpa as expected, but your
\zzb never gets hit.
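
For anyone who wants to try this, the two macros can be dropped into a complete file along the following lines. This is only a sketch under assumptions: the class number 4, the name \boundaryclass, and the body of \zzb are made up for illustration, and \message{\meaning\tmpa} is swapped in for \show so that the run does not stop for interaction. As far as I know, newer XeTeX releases use 4095 rather than 255 for the boundary class.

```tex
% Plain XeTeX sketch (compile with: xetex). Assumed setup around the two macros above.
\chardef\Xclass=4          % assumption: an otherwise unused class number
\chardef\boundaryclass=255 % boundary class as discussed in this thread
\XeTeXcharclass`\X=\Xclass

% Same idea as the suggested \zza/\zzza: switch the mechanism off inside a
% group while the X is looked at and typeset, then restore it with \endgroup.
\def\zza{\begingroup\XeTeXinterchartokenstate=0 \futurelet\tmpa\zzza}
\def\zzza#1{#1\message{[\meaning\tmpa]}\endgroup} % \message instead of \show
\def\zzb{\message{[boundary->X]}}

\XeTeXinterchartoks 0 \Xclass = {\zza}
\XeTeXinterchartoks \boundaryclass \Xclass = {\zzb}
\XeTeXinterchartokenstate=1

aXb
\bye
```

The grouping is the design choice that matters here: the setting of \XeTeXinterchartokenstate is local, so \endgroup restores interchartoks processing for the text that follows the X.
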
JK