Update of bug #66653 (group groff):

    Status: None => Need Info
_______________________________________________________

Follow-up Comment #21:

Hi Branden,

You are so close. :-)

The dummy node no longer truncates the asciified text. Unicode characters are converted to \[uXXXX] form. The only issue left is that composite input characters (as decided by NFD) are not handled correctly. If you send the asciified diversion to the document stream, the characters are completely missing and a strange error is reported; if it is sent via device control, only the base character is converted to \[uXXXX] (not \[uXXXX_XXXX]).

Using this:-

.ft Tinos-Regular
.sp 1i0
.ds khant "time to meet the ShǍka Khǎn
.box DIV
\*[khant]
.br
.box
.DIV
.asciify DIV
.DIV
.pdfbookmark 1 "\*[DIV]
.pdfbookmark 1 "\*[khant]
.\" in Tinos-Regular
.\" u0041_030C 722,859 2 814 uni01CD
.\" u0061_030C 443,696,10 2 815 uni01CE

Two errors are shown:-

[derij@pip build (master)]$ test-groff -Tpdf -F font -k G.trf -Z | egrep "^x|C"
troff:G.trf:10: warning: special character '\A' not defined
troff:G.trf:10: warning: special character '\a' not defined
x T pdf
x res 72000 1 1
x init
x X ps:exec [/Dest /pdf:bm1 /View [/FitH -67000 u] /DEST pdfmark
x X ps:exec [/Dest /pdf:bm1 /Title (time to meet the Sh\[u0041]ka Kh\[u0061]n) /Level 1 /OUT pdfmark
x font 40 Tinos-Regular
Cu0041_030C
Cu0061_030C
x X ps:exec [/Dest /pdf:bm2 /View [/FitH -91000 u] /DEST pdfmark
x X ps:exec [/Dest /pdf:bm2 /Title (time to meet the Sh\[u01CD]ka Kh\[u01CE]n) /Level 1 /OUT pdfmark
x trailer
x stop

I have also included a screenshot of the PDF produced. You can see the missing characters in the document, and the same diversion used in the bookmark has the base character only. The most interesting thing to note is that the original string register \*[khant] retains the original input value when used in a bookmark (I was super-pleased when you got that working).

As far as I know the special character '\A' does not exist as a groff name; it is not mentioned in groff_char.
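
For anyone who wants to check what the \[uXXXX_XXXX] form of a given input character ought to look like, here is a minimal Python sketch. It is not how groff does it internally; it only applies NFD via the standard unicodedata module and joins the resulting code points with underscores. The function names groff_escape and asciify_text are made up for this illustration, and it assumes plain ASCII passes through unchanged.

```python
import unicodedata

def groff_escape(ch):
    # Hypothetical helper: decompose one character with NFD and join the
    # resulting code points as uXXXX_YYYY inside a groff-style escape.
    decomposed = unicodedata.normalize("NFD", ch)
    return "\\[u" + "_".join(f"{ord(c):04X}" for c in decomposed) + "]"

def asciify_text(text):
    # Hypothetical helper: compose first (NFC) so that precomposed and
    # already-decomposed input give the same result, then leave ASCII
    # alone and escape every other character.
    composed = unicodedata.normalize("NFC", text)
    return "".join(c if ord(c) < 128 else groff_escape(c) for c in composed)

# U+01CD and U+01CE are the two caron letters from the test string.
print(asciify_text("Sh\u01CDka Kh\u01CEn"))
# prints: Sh\[u0041_030C]ka Kh\[u0061_030C]n
```

The printed names match the u0041_030C and u0061_030C entries in the Tinos-Regular font description quoted above, which is the form missing from the pdf:bm1 title.
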
In Tinos the PostScript name is uni01CD, but other fonts have the PostScript name as "Ahacek" (which is not in our glyph list). I'm dealing with groff's problems with more modern fonts in my forthcoming reply to bug #67244, which is taking an age to write as I keep getting sidetracked into writing little utility programs discovering the innards of TTF and OTF fonts.

If you want to experiment with a font you have installed, try:-

.ft U-TR
.sp 1i0
.ds khant "time to meet the Şhaka Khan

at the top of the file, and include "-Kutf8" on the command line.

This remaining problem concerning composite characters is not a show stopper; it only raises its ugly head when we add PDF features to ms (me?) in 1.25.

Cheers

Deri

_______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?66653>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/