URL: <https://savannah.gnu.org/bugs/?66653>
Summary: .asciify odd behaviour / adding pdf features to ms
Group: GNU roff
Submitter: deri
Submitted: Fri 10 Jan 2025 05:13:24 PM UTC
Category: Core
Severity: 3 - Normal
Item Group: Warning/Suspicious behaviour
Status: None
Privacy: Public
Assigned to: None
Open/Closed: Open
Discussion Lock: Any
Planned Release: None

_______________________________________________________

Follow-up Comments:

-------------------------------------------------------
Date: Fri 10 Jan 2025 05:13:24 PM UTC By: Deri James <deri>

First I will explain what I am trying to accomplish, before describing the issue.

With the demise of pdfmark.pdf and mspdf.tmac I realised that creating a replacement would be a lot easier if we did for ms a job similar to the one Branden did for man. Branden is correct that Keith's original pdf work is more unixy than roffish, using commands on a single line and various flags, and his mspdf.tmac provides new commands which are wrappers around existing ms commands with added pdf extensions. So instead of:-

.NH 1
Introduction

It supports:-

.NH 1
.XN Introduction

Apart from having to learn new commands, it is also less flexible. Ms allows:-

.NH 1
Introduction to
.I groff

Which would have to be converted to inline font changes to work with .XN. It also means that existing ms documents can't magically start using pdf features without considerable editing.

The gold standard "solution" would be that if the output is pdf then pdf features are automatically included, the same as Branden did with man.

As a proof of concept I attempted to make .NH produce a document outline as well as headings. I used ms.ms as a test document, using the command "test-groff -Tpdf -ms -M. -mpdfms -pet -ww ../doc/ms.ms > msdj.pdf"; the result is attached. Seems to like pic.ms too.

Now onto the problem with .asciify.
Because I have to use a diversion to capture the line(s) after the .NH, the diversion contains nodes. That is fine when the diversion is output as a heading, but it needs to be converted back to text to be used for the bookmark. Traditionally this was done by calling .asciify, which converted glyph nodes and word space nodes back to text:-

{type: glyph_node, character: "F", diversion level: 1},
{type: glyph_node, character: "i", diversion level: 1},
{type: glyph_node, character: "r", diversion level: 1},
{type: glyph_node, character: "s", diversion level: 1},
{type: glyph_node, character: "t", diversion level: 1},
{type: word_space_node, diversion level: 1},

But, oddly, it fails to convert:-

{type: glyph_node, character: "\u260E", diversion level: 1},

back to \[u260E]; possibly the code was written before the advent of preconv.

Now, the issues this code uncovered.

1. Near the top of the pdfms.tmac there are redefinitions of .B and .BI which replace the ones which are generated when s.tmac is loaded. If you comment out the definitions and produce the pdf there are no errors/warnings, but some overview entries are truncated. You can see this in the two subsidiary entries to "Legacy Features", which both get truncated to one word by asciify if the s.tmac version of .I is used, but not if using the redefined version. One difference is that the s.tmac versions introduce italic correction escapes, which may be what is upsetting asciify.

2. The shw macro just calls pdfbookmark, so it is largely redundant; why not call pdfbookmark directly instead of shw? If you replace shw with a direct call to pdfbookmark, groff coredumps with an assert failure. I believe this is the same as a bug number we already have, but the reproducer started working after removal of asciify from pdf.tmac. So I have also attached a minimum example (it.trf) which reproduces the coredump if you use pdfms.tmac with the direct call to pdfbookmark rather than the indirection through shw.
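To make the .asciify expectation described earlier in this comment concrete, here is a minimal Python model of the conversion being asked for. It is only a sketch and not groff's implementation: the function name asciify_model and the dictionary layout are hypothetical, chosen to mirror the node dump above. The assumptions are that ASCII glyph nodes become their character, word space nodes become a single space, non-ASCII glyph nodes become a \[uXXXX] escape, and every other node type is dropped.

```python
def asciify_model(nodes):
    """Hypothetical model of the expected conversion; not groff code."""
    out = []
    for node in nodes:
        kind = node.get("type")
        if kind == "word_space_node":
            # A word space node turns back into one plain space.
            out.append(" ")
        elif kind == "glyph_node":
            ch = node["character"]  # assumed to be a single character
            if ord(ch) < 128:
                # ASCII glyphs turn back into themselves.
                out.append(ch)
            else:
                # Assumption: non-ASCII glyphs become \[uXXXX],
                # which is the form the report expects for U+260E.
                out.append("\\[u%04X]" % ord(ch))
        # Any other node type is silently dropped in this sketch.
    return "".join(out)


# The node list from the report, followed by the problematic glyph.
nodes = [
    {"type": "glyph_node", "character": "F", "diversion level": 1},
    {"type": "glyph_node", "character": "i", "diversion level": 1},
    {"type": "glyph_node", "character": "r", "diversion level": 1},
    {"type": "glyph_node", "character": "s", "diversion level": 1},
    {"type": "glyph_node", "character": "t", "diversion level": 1},
    {"type": "word_space_node", "diversion level": 1},
    {"type": "glyph_node", "character": "\u260E", "diversion level": 1},
]

print(asciify_model(nodes))  # prints: First \[u260E]
```

Under these assumptions the bookmark text would keep the telephone glyph as an escape that can be passed on, instead of being left as an unconverted node.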
This can all be put on the back burner until after the current release; I'd just appreciate an affirmation that we agree this is the best way forward in replacing mspdf.tmac. I started on .NH because I think that if we have a workable solution for that, it is likely all pdf features can be slotted in without a separate API, which is what Keith used.

Any good ideas, or better approaches, would be appreciated.

_______________________________________________________

File Attachments:

-------------------------------------------------------
Name: pdfms.tmac Size: 341B
<https://file.savannah.gnu.org/file/pdfms.tmac?file_id=56763>

-------------------------------------------------------
Name: it.trf Size: 79B
<https://file.savannah.gnu.org/file/it.trf?file_id=56764>

-------------------------------------------------------
Name: msdj.pdf Size: 75KiB
<https://file.savannah.gnu.org/file/msdj.pdf?file_id=56765>

AGPL NOTICE

These attachments are served by Savane. You can download the corresponding source code of Savane at
https://savannah.gnu.org/source/savane-60bab5f4bb97079a4f50fd30c5633b2562628477.tar.gz

_______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?66653>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/