On 28/11/2019 00:16, Ross Moore wrote:
> If by ignoring you mean removing the character entirely, then that is surely
> not best at all.
> Most N Class (Normal) characters would be simply of the default \mathord
> class.
That is already the case: it's where IniTeX starts off, chars are
mathord. So 'nothing to do here'. Also note that some of this
information is already set from the main Unicode file: it tells us which
chars are letters.
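
As a rough way to inspect what the engine starts with, the stored math code of a character can be queried directly in IniTeX mode. This is only a sketch: it assumes XeTeX run as `xetex -ini`, and U+2200 is an arbitrary example code point, not one singled out in this thread.

```tex
% Sketch only: run with `xetex -ini`. In IniTeX no format is loaded,
% so braces have to be given their usual category codes first.
\catcode`\{=1 \catcode`\}=2
% \Umathcodenum returns the packed math code (class, family, slot)
% currently stored for the character; U+2200 is just an example.
\message{Math code of U+2200: \the\Umathcodenum"2200}
\end
```

The same query after the Unicode data loaders have run would show whatever values they set in the table.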
> I’d expect others to be mapped instead into a macro that corresponds to
> something that TeX does support.
> e.g.
> space characters for thinspace, 2-em space, etc. in U+2000 – U+200A
> can expand into things like: \, \; \> \quad \qquad etc. (even to
> constructions like \mskip1mu)
That's not a generic IniTeX thing, I'm afraid. The Unicode data loaders
are explicitly about setting up the basic data in Unicode TeX engines
that's held in (primitive) tables. Creating macros is the job of the
'rest' of the format. Here, presumably you are thinking of making chars
math-active: that's well out-of-scope for the loader.
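
For illustration, this is roughly what a math-active setup could look like at the format level, which is why it does not belong in the loader. It is a sketch under assumptions: a Unicode engine (XeTeX or LuaTeX) with a format such as plain TeX or LaTeX already loaded, so that `~` is active and `\,` is defined. The mapping of U+2009 THIN SPACE to `\,` is an example choice, not something the loaders or any existing format are claimed to do.

```tex
% Format-level sketch (NOT loader code). Assumes XeTeX or LuaTeX with a
% format in which ~ is an active character and \, is defined.
\mathcode"2009="8000 % mark U+2009 THIN SPACE as math-active
\begingroup
  \lccode`\~="2009 % \lowercase maps the active ~ to an active U+2009
  \lowercase{\endgroup\def~}{\,}% in math mode U+2009 now expands to \,
```

The first line only touches a primitive table, but the definition that follows is a macro, and that is the part that belongs to the rest of the format.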
> After all, this is essentially what happens when pdfTeX reads raw Unicode input.
pdfTeX reads bytes, so there's not really much comparison. In IniTeX mode,
there is not much happening with UTF-8 and pdfTeX: perhaps you are
thinking of pdfTeX with LaTeX?
Joseph