On 7 May 2015 at 02:07, Ross Moore wrote:
> Hi David,
>
> ..
>
> No disagreement to this.
>
>
OK:-)
>
> In the current versions d835dc00 is two characters in luatex
> and one character in xetex
> as the implementation detail that xetex's underlying storage is mostly
> UTF-16 is exp
Hi David,
On 07/05/2015, at 9:26 AM, David Carlisle wrote:
>> The character itself, as bytes that is, is not wrong and users should be
>> able to create these.
>> But preferably through macros that ensure that they come correctly paired.
>
> placing two character tokens representing a surrogate
> The character itself, as bytes that is, is not wrong and users should be able
> to create these.
> But preferably through macros that ensure that they come correctly paired.
placing two character tokens representing a surrogate pair should not
though magically turn itself
into a single characte
Hi Arthur,
On 07/05/2015, at 8:04, Arthur Reutenauer wrote:
> While working on these bugs, we also discussed how surrogate
> characters were handled in XeTeX. Surrogate characters are the 2048
> code points that are used in UTF-16 to encode characters with code
> points above 65535: a pair of
On 6 May 2015 at 23:04, Arthur Reutenauer wrote:
> While working on these bugs, we also discussed how surrogate
> characters were handled in XeTeX. Surrogate characters are the 2048
> code points that are used in UTF-16 to encode characters with code
> points above 65535: a pair of them makes u
While working on these bugs, we also discussed how surrogate
characters were handled in XeTeX. Surrogate characters are the 2048
code points that are used in UTF-16 to encode characters with code
points above 65535: a pair of them makes up one Unicode character;
however they're not meant to be u
On 4 May 2015 at 16:27, Jonathan Kew wrote:
> ...
>
> A fix for this bug, so that \string generates single Unicode characters
> even for values above U+, is currently on the utf16-issues branch in
> the XeTeX repository on sourceforge.[1]
>
> A bug with characters above U+ within \scantok
On 23/4/15 20:59, David Carlisle wrote:
I can confirm that \string does convert character tokens
to two tokens giving the UTF-16 representation.
With the attached file luatex produces
90,33
34,33
233,33
233,33
65530,33
65537,33
65537,33
which is in each case the Unicode value of the character
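
For reference, the relationship between a code point above U+FFFF and the
pair of UTF-16 surrogates discussed in this thread (for example the
d835dc00 mentioned above) can be sketched in a few lines of Python. This is
only an illustration of the standard UTF-16 arithmetic, not code taken from
XeTeX or LuaTeX; the function names to_surrogates and from_surrogates are
made up for this example.

```python
def to_surrogates(cp):
    """Split a code point above U+FFFF into a (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points U+10000..U+10FFFF use surrogates")
    offset = cp - 0x10000              # 20-bit value
    high = 0xD800 + (offset >> 10)     # top 10 bits -> D800..DBFF
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits -> DC00..DFFF
    return high, low


def from_surrogates(high, low):
    """Recombine a correctly paired (high, low) surrogate pair."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


hi, lo = to_surrogates(0x1D400)
print(f"{hi:04x}{lo:04x}")  # prints d835dc00
```

An engine that stores text as UTF-16 sees the two 16-bit units, while one
that works on code points sees the single value returned by
from_surrogates; an unpaired surrogate is rejected by the range check.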