Last year I asked about the possibility of adding \Uchar, copied from luatex:
http://tug.org/pipermail/xetex/2014-May/025260.html

Bruno suggested a possible implementation, and I finally got round to trying it, adjusted for the sources in the texlive 2015 pretest tree (diff attached).

This seems to work fine for characters below "FFFF but fails for non-BMP characters above that. See the attached xetexuchar.tex file and the logs produced by luatex and (patched) xetex.

The patch just uses the same print_char routine as \string, so I thought I'd test that. See the file nonbmp.tex (which can be used with a non-patched xetex). As the attached logs show, this works with luatex, where \string on U+1D538 produces a single character, but with xetex it produces two (presumably the UTF-16 surrogate pair, although I didn't check that).

Is my reading of this file correct, that \string and \meaning are turning U+1D538 into two characters? If so, does anyone have a suggestion for the best place to attack this in the source?

David

*** xetex.web~ 2015-04-16 20:53:45.000000000 +0100
--- xetex.web 2015-04-23 10:54:21.677243400 +0100
***************
*** 10478,10484 ****
  @d left_margin_kern_code=11
  @d right_margin_kern_code=12

! @d etex_convert_codes=right_margin_kern_code+1 {end of \eTeX's command codes}
  @d job_name_code=etex_convert_codes {command code for \.{\\jobname}}

  @<Put each...@>=
--- 10478,10486 ----
  @d left_margin_kern_code=11
  @d right_margin_kern_code=12

! @d XeTeX_Uchar_code = 13
!
! @d etex_convert_codes=XeTeX_Uchar_code+1 {end of \eTeX's command codes}
  @d job_name_code=etex_convert_codes {command code for \.{\\jobname}}

  @<Put each...@>=
***************
*** 10499,10504 ****
--- 10501,10509 ----
  primitive("rightmarginkern",convert,right_margin_kern_code);@/
  @!@:right_margin_kern_}{\.{\\rightmarginkern} primitive@>

+ primitive("Uchar",convert,XeTeX_Uchar_code);@/
+ @!@:XeTeX_Uchar_}{\.{\\Uchar} primitive@>
+
  @ @<Cases of |print_cmd_chr|...@>=
  convert: case chr_code of
  number_code: print_esc("number");
***************
*** 10549,10554 ****
--- 10554,10560 ----
  scanner_status:=normal; get_token; scanner_status:=save_scanner_status;
  end;
  font_name_code: scan_font_ident;
+ XeTeX_Uchar_code: scan_int;
  @/@<Cases of `Scan the argument for command |c|'@>@/
  job_name_code: if job_name=0 then open_log_file;
  end {there are no other cases}
***************
*** 10576,10581 ****
--- 10582,10588 ----
  print("pt");
  end;
  end;
+ XeTeX_Uchar_code: print_char(cur_val);
  @/@<Cases of `Print the result of command |c|'@>@/
  job_name_code: print_file_name(job_name, 0, 0);
  end {there are no other cases}
***************
*** 30000,30005 ****
--- 30007,30014 ----
  XeTeX_selector_name_code: print_esc("XeTeXselectorname");
  XeTeX_glyph_name_code: print_esc("XeTeXglyphname");

+ XeTeX_Uchar_code: print_esc("Uchar");
+
  @ @<Cases of `Scan the argument for command |c|'@>=
  eTeX_revision_code: do_nothing;
  pdf_strcmp_code:
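
For reference, if the two characters that xetex produces for U+1D538 really are a UTF-16 surrogate pair (which, as said, I have not checked), the values to look for in the log can be worked out as below. This is only a sketch of the standard UTF-16 encoding rule, written in Python; the function name utf16_code_units is made up for illustration and nothing here is taken from the xetex sources.

```python
def utf16_code_units(code_point):
    """Return the UTF-16 code units that encode one Unicode scalar value.

    Hypothetical helper for illustration only; it is not part of xetex.
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("outside the Unicode code space")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points are not scalar values")
    if code_point < 0x10000:
        # BMP characters fit in a single 16-bit code unit.
        return [code_point]
    # Non-BMP: split the 20-bit offset into two 10-bit halves.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # leading (high) surrogate
    low = 0xDC00 + (offset & 0x3FF)   # trailing (low) surrogate
    return [high, low]


units = utf16_code_units(0x1D538)
print(" ".join(f"{u:04X}" for u in units))  # D835 DD38
```

So if the surrogate-pair guess is right, the two characters in the xetex log should be "D835 and "DD38, whereas any character below "10000 stays a single unit, which would fit \Uchar working for the BMP only.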

Attachments:
  xetexuchar.tex (TeX document)
  xetexuchar.xetex.log
  xetexuchar.luatex.log
  nonbmp.tex (TeX document)
  nonbmp.xetex.log
  nonbmp.luatex.log