Last year I asked about the possibility of adding a \Uchar primitive to
XeTeX, modelled on the one in LuaTeX.

http://tug.org/pipermail/xetex/2014-May/025260.html

Bruno suggested a possible implementation, and I finally got round to
trying it, adjusted for the sources in the TeX Live 2015 pretest tree
(diff attached).

This seems to work fine for characters up to "FFFF,
but fails for non-BMP characters above that.

See the attached xetexuchar.tex file and the logs produced by
luatex and (patched) xetex.

The patch just uses the same print_char routine as \string, so I thought
I'd test \string directly.
See the file nonbmp.tex (which can be used with an unpatched xetex).

As can be seen in the attached logs, \string on U+1D538 produces a single
character with luatex, but with xetex it produces two (presumably the
UTF-16 surrogate pair, although I didn't check that).
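
For reference, the surrogate pair that UTF-16 assigns to U+1D538 can be
worked out with the short Python sketch below. It is not part of the patch
and says nothing about what xetex does internally; the function name
utf16_surrogates is made up for this illustration. It only shows which two
code units to look for in the xetex log if the surrogate-pair guess is right.

```python
def utf16_surrogates(cp):
    """Split a non-BMP code point into its UTF-16 high and low surrogates.

    Illustrative helper only (hypothetical name, not from the XeTeX sources).
    """
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    offset = cp - 0x10000            # 20-bit value to split in two halves
    high = 0xD800 + (offset >> 10)   # top 10 bits go in the high surrogate
    low = 0xDC00 + (offset & 0x3FF)  # bottom 10 bits go in the low surrogate
    return high, low


high, low = utf16_surrogates(0x1D538)
print(f"U+1D538 -> {high:04X} {low:04X}")  # prints: U+1D538 -> D835 DD38
```

So if the guess is right, the two characters in the xetex log should be
U+D835 followed by U+DD38.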

Is my reading correct that \string and \meaning are turning
U+1D538 into two characters? If so, does anyone have a suggestion
for the best place to attack this in the source?
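
To make the question concrete: if the strings really are held as UTF-16
code units, then whatever turns them back into character tokens would need
to rejoin adjacent surrogates. The Python sketch below shows that
recombination in isolation. It is an assumption-laden illustration, not a
proposed change to xetex.web, and code_units_to_code_points is a
hypothetical name that does not exist in the XeTeX sources.

```python
def code_units_to_code_points(units):
    """Rejoin UTF-16 surrogate pairs in a list of 16-bit code units.

    Hypothetical helper for illustration; assumes the input is a list of
    integers in the range 0..0xFFFF.
    """
    out = []
    i = 0
    while i < len(units):
        u = units[i]
        is_high = 0xD800 <= u <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            # Combine the two 10-bit halves and restore the 0x10000 offset.
            out.append(0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00))
            i += 2
        else:
            # BMP character (or a lone surrogate, passed through unchanged).
            out.append(u)
            i += 1
    return out


units = [0x41, 0xD835, 0xDD38, 0x42]
print([f"U+{cp:04X}" for cp in code_units_to_code_points(units)])
```

Here the four code units come back as three characters, with U+1D538 in
the middle as a single code point.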


David
*** xetex.web~  2015-04-16 20:53:45.000000000 +0100
--- xetex.web   2015-04-23 10:54:21.677243400 +0100
***************
*** 10478,10484 ****
  @d left_margin_kern_code=11
  @d right_margin_kern_code=12
  
! @d etex_convert_codes=right_margin_kern_code+1 {end of \eTeX's command codes}
  @d job_name_code=etex_convert_codes {command code for \.{\\jobname}}
  
  @<Put each...@>=
--- 10478,10486 ----
  @d left_margin_kern_code=11
  @d right_margin_kern_code=12
  
! @d XeTeX_Uchar_code = 13
! 
! @d etex_convert_codes=XeTeX_Uchar_code+1 {end of \eTeX's command codes}
  @d job_name_code=etex_convert_codes {command code for \.{\\jobname}}
  
  @<Put each...@>=
***************
*** 10499,10504 ****
--- 10501,10509 ----
  primitive("rightmarginkern",convert,right_margin_kern_code);@/
  @!@:right_margin_kern_}{\.{\\rightmarginkern} primitive@>
  
+ primitive("Uchar",convert,XeTeX_Uchar_code);@/
+ @!@:XeTeX_Uchar_}{\.{\\Uchar} primitive@>
+ 
  @ @<Cases of |print_cmd_chr|...@>=
  convert: case chr_code of
    number_code: print_esc("number");
***************
*** 10549,10554 ****
--- 10554,10560 ----
    scanner_status:=normal; get_token; scanner_status:=save_scanner_status;
    end;
  font_name_code: scan_font_ident;
+ XeTeX_Uchar_code: scan_int;
  @/@<Cases of `Scan the argument for command |c|'@>@/
  job_name_code: if job_name=0 then open_log_file;
  end {there are no other cases}
***************
*** 10576,10581 ****
--- 10582,10588 ----
      print("pt");
      end;
    end;
+   XeTeX_Uchar_code: print_char(cur_val);
  @/@<Cases of `Print the result of command |c|'@>@/
  job_name_code: print_file_name(job_name, 0, 0);
  end {there are no other cases}
***************
*** 30000,30005 ****
--- 30007,30014 ----
  XeTeX_selector_name_code: print_esc("XeTeXselectorname");
  XeTeX_glyph_name_code: print_esc("XeTeXglyphname");
  
+ XeTeX_Uchar_code: print_esc("Uchar");
+ 
  @ @<Cases of `Scan the argument for command |c|'@>=
  eTeX_revision_code: do_nothing;
  pdf_strcmp_code:

Attachment: xetexuchar.tex
Description: TeX document

Attachment: xetexuchar.xetex.log
Description: Binary data

Attachment: xetexuchar.luatex.log
Description: Binary data

Attachment: nonbmp.tex
Description: TeX document

Attachment: nonbmp.xetex.log
Description: Binary data

Attachment: nonbmp.luatex.log
Description: Binary data
