On Fri, 11 Nov 2011 16:33:20 +0100, Zdenek Wagner wrote:

> I still do not understand the internal mechanism. I know how
> punctuation is handled in French, the category of a few characters is
> set to 13 and defined as some macros. But how can XeTeX recognize
> whether the space token with category 10 has to be converted to a
> nonbreakable space?

There was once a discussion about spaces on the xetex list, starting here:

http://tug.org/mailman/htdig/xetex/2009-March/012480.html

I don't know whether the code discussed there led to a package or found
its way into the format. In that thread I asked how spaces are handled
and got this answer from Jonathan:

>>> %% U+00A0 NO-BREAK SPACE;
>>> %% Unicode char for ~.
>>> \catcode`^^^^00a0=\active
>>> \def^^^^00a0{\nobreakspace}
>
> Are the definitions necessary? That means how does XeTeX handle
> normally e.g. U+00A0 NO-BREAK SPACE? Can there be a line break
> before or after this input?

XeTeX has no special built-in knowledge about U+00A0 or the various
other Unicode space-like characters; it will simply "print" them in the
current font. Which would be fine, except that some fonts fail to
support them, in which case you'll get a .notdef glyph. :(

Defining these in a font-independent way using TeX seems like a good
idea in general; however, care may be needed to make them work
correctly in all contexts, particularly when they occur in text that
ends up going to the LaTeX .aux file, etc., or into PDF bookmarks. I
haven't really looked into this, not being a serious LaTeX user, just
wondering......

-- 
Ulrike Fischer