Follow-up Comment #7, bug #66122 (group groff):

[comment #5 comment #5:]
> One of the changes that was never committed from my deri-gropdf-ng
> branch was to om.tmac, which removed the calls to pdfmomclean IF using
> -T pdf, but retained the calls for -T ps.

Ah, yes, it'd be good to land that.

> Text in pdfs, such as Author, Title and Bookmarks, can be presented
> either as PDFDocEncoding (a superset of ISO Latin-1) or UTF-16.
>
> Gropdf can handle being passed \[uXXXX] as text and converts this to
> UTF-16, but grops can't, it treats it as plain text.

I believe this point leads directly to bug #62830.

> This is why pdfmomclean calls asciify to convert \[u00E9] back to "'e"
> which grops handles.
>
> I have attached the relevant changes from my old branch for om.tmac,
> if Peter agrees to apply them.

These look good to me.

> I believe Branden is in favour of having all groff input clean 7 bit
> ascii and encourage the use of preconv for anything else,

To reduce the level of mojibake confusion when we cut GNU _troff_ over
to interpreting UTF-8 input directly, yes.

> so this patch will stop the warning appearing for pdf output but
> permit pdfroff to be used to create mom documents. As a bonus this
> patch (plus a change to pdfmom just committed) will now allow mom
> documents like this:-
>
> .HEADING 1 NAMED Гуляйпольщина "Гуляйпольщина или Махновщина"
> .PDF_LINK Гуляйпольщина PREFIX ( SUFFIX ) "see: +"
>
> Thus allowing mom to be used in many languages (assuming you have
> suitable fonts) and take over the world. :-)

Sounds like a big win to me!

> There is another of these "transparent device whatchamacallits" when
> generating groff-man-pages.pdf (from groff_mmse.7) which can be fixed
> by adding "-K utf8" to the pdfmom call in doc/doc.am.

I don't think that's quite right.  The `-K` option runs _preconv_ and
clues it in to the name of the _input_ encoding.

     -K enc Set input encoding used by preconv(1) to enc; implies -k.

So what we need here is `-K latin-1`.  And sure enough that works fine.

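To illustrate why the declared input encoding matters, here is a rough
Python sketch of the kind of transformation involved: decode the bytes
using the named encoding, then write every non-ASCII character as a
\[uXXXX] escape.  The helper name `to_groff_escapes` is hypothetical,
and this is not how preconv is actually implemented; it only shows the
idea under those assumptions.

```python
def to_groff_escapes(data: bytes, encoding: str) -> str:
    """Hypothetical helper: decode `data`, escape non-ASCII as \\[uXXXX]."""
    text = data.decode(encoding)
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            # Plain 7-bit ASCII passes through unchanged.
            out.append(ch)
        else:
            # Everything else becomes a Unicode escape with at least
            # four uppercase hexadecimal digits.
            out.append("\\[u%04X]" % cp)
    return "".join(out)


# "Se också" stored as Latin-1: the a-ring is the single byte 0xE5.
latin1_bytes = "Se ocks\u00e5".encode("latin-1")

print(to_groff_escapes(latin1_bytes, "latin-1"))  # Se ocks\[u00E5]
```

In Python, decoding those same bytes as UTF-8 raises UnicodeDecodeError,
because a lone 0xE5 is not a complete UTF-8 sequence; the sketch makes
no claim about how preconv itself reacts to a wrongly declared encoding.
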
$ groff -K latin-1 -m an -T pdf -Z ./build/contrib/mm/groff_mmse.7 -Z | grep Title
x X ps:exec [/Dest /groff_mmse(7) /Title (groff_mmse(7)) /Level 1 /OUT pdfmark
x X ps:exec [/Dest /pdf:bm2 /Title (Namn) /Level 2 /OUT pdfmark
x X ps:exec [/Dest /pdf:bm3 /Title (Syntax) /Level 2 /OUT pdfmark
x X ps:exec [/Dest /pdf:bm4 /Title (Beskrivning) /Level 2 /OUT pdfmark
x X ps:exec [/Dest /pdf:bm5 /Title (Brev) /Level 2 /OUT pdfmark
x X ps:exec [/Dest /pdf:bm6 /Title (Skrivet av) /Level 2 /OUT pdfmark
x X ps:exec [/Dest /pdf:bm7 /Title (Filer) /Level 2 /OUT pdfmark
x X ps:exec [/Dest /pdf:bm8 /Title (Se ocks\[u00E5]) /Level 2 /OUT pdfmark

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?66122>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/