On 21/02/2021 22:55, Ross Moore wrote:
The file reading has failed before any TeX-accessible processing has happened (see the EBCDIC example in The TeXbook)

OK.
But that’s changing the meaning of bit-order, yes?
Surely we can be past that.

No, it's not about bit-order; it's about changing the mapping of code units in the external file to character codes in TeX's internal (ASCII-based) code.
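A minimal sketch (in Python rather than TeX, purely for illustration) of why that mapping stage can fail: the byte 0xE9 is "é" in Latin-1, but on its own it is not a valid UTF-8 sequence, so a reader that assumes UTF-8 input cannot map it to an internal character code.

```python
# Latin-1 bytes for "café"; 0xE9 is "é" in Latin-1.
raw = b"caf\xe9"

# Interpreted with the encoding the file actually uses, all is well:
print(raw.decode("latin-1"))   # café

# A reader that assumes UTF-8 (as XeTeX does by default) fails at this
# stage, before any character codes ever reach the tokenizer:
try:
    raw.decode("utf-8")
except UnicodeDecodeError as err:
    print("decode failed:", err)
```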




\danger \TeX\ always uses the internal character code of Appendix~C
for the standard ASCII characters,
regardless of what external coding scheme actually appears in the files
being read.  Thus, |b| is 98 inside of \TeX\ even when your computer
normally deals with ^{EBCDIC} or some other non-ASCII scheme; the \TeX\
software has been set up to convert text files to internal code, and to
convert back to the external code when writing text files.


The file decoding is failing at the "convert text files to internal code" stage, which happens before the line buffer of characters is consulted to produce the stream of tokens based on catcodes.

Yes, OK; so my model isn’t up to it, as Bruno said.
  … And Jonathan has commented.

Also pdfTeX has no trouble with an xstring example.
It just seems pretty crazy that the comments need to be altered
for that package to be used with XeTeX.


Well.... as long as the Latin-1 accented characters are only in comments, it arguably doesn't "really" matter; XeTeX logs a warning that it can't interpret them, but if you know that part of the line is going to be ignored anyway, you can ignore the warning.

(pdfTeX doesn't care because it simply reads the bytes from the file; any interpretation of bytes as one encoding or another is handled at the TeX macro level.)
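To illustrate the contrast (again a Python sketch, not pdfTeX itself): a byte-oriented reader never hits a decoding error, because each byte simply becomes a character code in the range 0–255, and any interpretation of those codes as an encoding is deferred to a later layer.

```python
# The same Latin-1 bytes as before, inside a comment line.
raw = b"% caf\xe9"

# A byte-oriented reader maps each byte straight to a code 0..255;
# nothing can fail here, whatever encoding the file was written in.
codes = list(raw)
print(codes)   # the 0xE9 byte passes through untouched as 233
```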

JK
