On 21/02/2021 22:55, Ross Moore wrote:
The file reading has failed before any TeX-accessible processing has happened (see the EBCDIC example in The TeXbook)

OK.
But that’s changing the meaning of bit-order, yes?
Surely we can be past that.

No, it's not about bit-order; it's about changing the mapping of code units in the external file to character codes in TeX's internal (ASCII-based) code.
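A minimal sketch (in Python rather than TeX, purely for illustration) of why that mapping stage can fail: the byte 0xE9 is "é" in Latin-1, but on its own it is not a valid UTF-8 sequence, so a reader that assumes UTF-8 input cannot map it to an internal character code.

```python
# Latin-1 bytes for "café"; 0xE9 is "é" in Latin-1.
raw = b"caf\xe9"

# Interpreted with the encoding the file actually uses, all is well:
print(raw.decode("latin-1"))   # café

# A reader that assumes UTF-8 (as XeTeX does by default) fails at this
# stage, before any character codes ever reach the tokenizer:
try:
    raw.decode("utf-8")
except UnicodeDecodeError as err:
    print("decode failed:", err)
```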




\danger \TeX\ always uses the internal character code of Appendix~C
for the standard ASCII characters,
regardless of what external coding scheme actually appears in the files
being read.  Thus, |b| is 98 inside of \TeX\ even when your computer
normally deals with ^{EBCDIC} or some other non-ASCII scheme; the \TeX\
software has been set up to convert text files to internal code, and to
convert back to the external code when writing text files.


The file decoding is failing at the "convert text files to internal code" stage, which happens before the line buffer of characters is consulted to produce the stream of tokens based on catcodes.

Yes, OK; so my model isn’t up to it, as Bruno said.
  … And Jonathan has commented.

Also pdfTeX has no trouble with an xstring example.
It just seems pretty crazy that the comments need to be altered
for that package to be used with XeTeX.


Well.... as long as the Latin-1 accented characters are only in comments, it arguably doesn't "really" matter; XeTeX logs a warning that it can't interpret them, but if you know that part of the line is going to be ignored anyway, you can ignore the warning.

(pdfTeX doesn't care because it simply reads the bytes from the file; any interpretation of bytes as one encoding or another is handled at the TeX macro level.)
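To illustrate the contrast (again a Python sketch, not pdfTeX itself): a byte-oriented reader never hits a decoding error, because each byte simply becomes a character code in the range 0–255, and any interpretation of those codes as an encoding is deferred to a later layer.

```python
# The same Latin-1 bytes as before, inside a comment line.
raw = b"% caf\xe9"

# A byte-oriented reader maps each byte straight to a code 0..255;
# nothing can fail here, whatever encoding the file was written in.
codes = list(raw)
print(codes)   # the 0xE9 byte passes through untouched as 233
```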

JK
