URL:
  <https://savannah.gnu.org/bugs/?66653>

                 Summary: .asciify odd behaviour / adding pdf features to ms
                   Group: GNU roff
               Submitter: deri
               Submitted: Fri 10 Jan 2025 05:13:24 PM UTC
                Category: Core
                Severity: 3 - Normal
              Item Group: Warning/Suspicious behaviour
                  Status: None
                 Privacy: Public
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any
         Planned Release: None


    _______________________________________________________

Follow-up Comments:


-------------------------------------------------------
Date: Fri 10 Jan 2025 05:13:24 PM UTC By: Deri James <deri>
First I will explain what I am trying to accomplish, before describing the
issue.

With the demise of pdfmark.pdf and mspdf.tmac I realised that creating a
replacement would be a lot easier if we did for ms a job similar to the one
Branden did for man. Branden is correct that Keith's original pdf work is
more unixy than roffish, using commands on a single line with various flags,
and his mspdf.tmac provides new commands which are wrappers around existing ms
commands with added pdf extensions. So instead of:-

.NH 1
Introduction

It supports:-

.NH 1
.XN Introduction

Apart from having to learn new commands it is also less flexible. Ms allows:-

.NH 1
Introduction to
.I groff

This would have to be converted to inline font changes to work with .XN. It
also means that existing ms documents can't magically start using pdf features
without considerable editing. The gold standard "solution" would be that,
whenever the output is pdf, pdf features are included automatically, the same
as Branden did with man.

As a proof of concept I attempted to make .NH produce a document outline as
well as headings. I used ms.ms as a test document with the command
"test-groff -Tpdf -ms -M. -mpdfms -pet -ww ../doc/ms.ms > msdj.pdf"; the
result is attached. It seems to like pic.ms too.

Now onto the problem with .asciify. Because I have to use a diversion to
capture the line(s) after the .NH, the diversion contains nodes. That is fine
when the diversion is output as a heading, but the nodes need to be converted
back to text to be used for the bookmark. Traditionally this was done by
calling .asciify, which converted glyph nodes and word space nodes back to
text:-

{type: glyph_node, character: "F", diversion level: 1},
{type: glyph_node, character: "i", diversion level: 1},
{type: glyph_node, character: "r", diversion level: 1},
{type: glyph_node, character: "s", diversion level: 1},
{type: glyph_node, character: "t", diversion level: 1},
{type: word_space_node, diversion level: 1},

But, oddly, it fails to convert:-

{type: glyph_node, character: "\u260E", diversion level: 1},

back to \[u260E]; possibly the code was written before the advent of preconv.
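
To make the expected behaviour concrete, here is a minimal Python sketch of the
conversion described above. It is only a model, not groff's implementation of
.asciify: the function name nodes_to_text is hypothetical, the dictionary keys
simply mirror the node dump quoted above, and the assumption is that a
non-ASCII glyph should come back as the \[uXXXX] form that preconv produces.

```python
# Illustrative model only: this is NOT how groff implements .asciify.
# The dictionary keys mirror the node dump quoted in the report.

def nodes_to_text(nodes):
    """Turn a list of node dictionaries back into roff input text."""
    out = []
    for node in nodes:
        kind = node["type"]
        if kind == "glyph_node":
            ch = node["character"]
            if ord(ch) < 128:
                # Plain ASCII glyph: emit the character itself.
                out.append(ch)
            else:
                # Assumed desired behaviour: emit the \[uXXXX] escape form.
                out.append("\\[u{:04X}]".format(ord(ch)))
        elif kind == "word_space_node":
            out.append(" ")
        # Any other node type is skipped in this sketch.
    return "".join(out)


nodes = [{"type": "glyph_node", "character": c, "diversion level": 1}
         for c in "First"]
nodes.append({"type": "word_space_node", "diversion level": 1})
nodes.append({"type": "glyph_node", "character": "\u260E",
              "diversion level": 1})
print(nodes_to_text(nodes))
```

Run as written, this prints "First \[u260E]", which is the text the bookmark
would need.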

Now, the issues this code uncovered.

1. Near the top of pdfms.tmac there are redefinitions of .B and .BI which
replace the ones generated when s.tmac is loaded. If you comment out these
definitions and produce the pdf, there are no errors or warnings, but some
overview entries are truncated. You can see this in the two subsidiary entries
to "Legacy Features", which both get truncated to one word by asciify if the
s.tmac version of .I is used, but not if the redefined version is used. One
difference is that the s.tmac versions introduce italic correction escapes,
which may be what is upsetting asciify.

2. The shw macro just calls pdfbookmark, so it is largely redundant; why not
call pdfbookmark directly instead of shw? If you replace shw with a direct
call to pdfbookmark, groff coredumps with an assert failure. I believe this is
the same as a bug we already have a number for, but that bug's reproducer
started working after the removal of asciify from pdf.tmac. So I have also
attached a minimal example (it.trf) which reproduces the coredump if you use
pdfms.tmac with the direct call to pdfbookmark rather than the indirection
through shw.

This can all be put on the back burner until after the current release; I'd
just appreciate an affirmation that we agree this is the best way forward in
replacing mspdf.tmac. I started on .NH because, if we have a workable solution
for that, I think it is likely all pdf features can be slotted in without a
separate api, which is what Keith used.

Any good ideas, or better approaches, would be appreciated.


    _______________________________________________________
File Attachments:


-------------------------------------------------------
Name: pdfms.tmac  Size: 341B
<https://file.savannah.gnu.org/file/pdfms.tmac?file_id=56763>
-------------------------------------------------------
Name: it.trf  Size: 79B
<https://file.savannah.gnu.org/file/it.trf?file_id=56764>
-------------------------------------------------------
Name: msdj.pdf  Size: 75KiB
<https://file.savannah.gnu.org/file/msdj.pdf?file_id=56765>

    AGPL NOTICE

These attachments are served by Savane. You can download the corresponding
source code of Savane at
https://savannah.gnu.org/source/savane-60bab5f4bb97079a4f50fd30c5633b2562628477.tar.gz

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?66653>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
