On Saturday, 16 November 2024 15:34:36 GMT G. Branden Robinson wrote:
> Hi Deri,
>
> At 2024-11-16T13:02:02+0000, Deri wrote:
> > On Thursday, 14 November 2024 17:44:55 GMT G. Branden Robinson wrote:
> > You did commit to adding relevant info from the attached pdf regarding
> > the significant changes to gropdf, is this still your intention?
> >
> > There is now also a gap in the documentation of adding pdf features to
> > groff documents (since you have removed pdfmark.ms from the groff
> > repository). This served a dual purpose of documenting the pdfmark API
> > used by both pdfroff and -Tpdf.
>
> Well, sort of. I found the API to be inadequately documented in
> contrib/pdfroff, too.
> [...]
> > ...or the last 2 pages of the generated pdfmark.pdf document.
>
> (Even if the in-band signaling was your invention, I have found that in
> no situation where I was looking at pdf.tmac and failed to find
> documentation I desired, did pdfmark.tmac or pdfmark.pdf leave me any
> the wiser.)

There are many examples dotted around where you have opined about your
difficulty understanding the code in pdf.tmac and pdfmark.tmac, further
evidenced by you actually breaking the code whilst I was on my sabbatical.

From bug #65585:-

==========================================================================
I have been looking at the current state of pdf.tmac and have found a few
issues. The changes which have been committed are:-

Introduction of a new flag, -S, which replaces my change. That change
introduced the concept that passing a single pipe character as the hotspot
meant that the actual text forming the hotspot would be "piped" to the
document stream, terminated by doing a markstop. This is what allowed
Branden to implement the man macros which place contents into a diversion
before emitting a hotspot. I have no objection to this change, although
there were several ways of setting up a hotspot containing a single pipe
character before this change, and most users are probably familiar with
piping data. (Perhaps I should have used "<" as the single character). I
would be very annoyed if someone sent me a pdf with a single "|" as a
hotspot - it would take me about 5 minutes to click it with the mouse!

Another change was an attempt to only allow a restart after a preceding
suspend. This is a good idea, but it is in the wrong place: it should be
implemented in gropdf rather than pdf.tmac, since a user may use
\X'pdf: markrestart', as documented in the gropdf man page, which bypasses
the check in pdf.tmac.

Branden has implemented a looping construct to hold tags, replacing an
associative array. It is not used if the user is using mom macros (it does
not work for mom documents). In fact it only works reliably for the man
macros producing a "book" of man pages, where it is used to differentiate
between an internal link to another page in the book and a "man:" URL to an
external man page.

The problem is that the new code does not implement:-

.pdfhref L -D name

which should emit the descriptive text associated with the named tag at the
time it was defined, but instead it just emits the "name". This is
documented in pdfmark.pdf, and used several times in pdfmark.ms, so it is
reasonable to assume users may be using this in their own documents if they
use "pdfmom --roff -mspdf". So it is wrong to assume the new code is
suitable for everything except mom.

There is also an issue with the speed of the new code. One test file I used
went from an elapsed time of 0m3.08s to 11m31.62s. (About 225 times
slower!). This was producing a large document (LinuxManBook) from a single
file, using the command:-

time ~/groff-git/groff/test-pdfmom --roff -Tpdf -mandoc -petk LinuxManBook.trf -z

(test-pdfmom is the same as test-groff but calling the pdfmom in the build
directory).

I don't know whether this slowdown in groff will be acceptable to most
users. I'm meant to be on a sabbatical so I am not going to argue either
way, but be aware that this is Branden's attempt to solve one issue which
was solved months ago in my branch waiting for merging.

My use of stringhex was described as "obfuscation", which is rather
insulting if the intention was to imply a deliberate attempt to obscure the
purpose of my code. The correct interpretation of its use is to protect the
data from interpretation by groff so that any byte sequence can be used as
a string name in groff. This is analogous to using base64 to protect binary
data in emails.

Although Branden's loopy code solves some issues with 1.23.0 it fails in a
number of ways which are dealt with successfully by the pdf.tmac in my
branch (i.e. t.trf in #64576).

There is a bug in Branden's code. The attached file, pdf-L.trf, illustrates
the issue. According to the documentation in pdfmark.pdf, if you use
.pdfhref L with a destination name but no descriptive text, the descriptive
text given when the destination is named is used.
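
As a minimal sketch of that documented behaviour, the two request forms
quoted in this report can be combined as below. This is a hypothetical
file, not the attached pdf-L.trf; the destination name "two" and its text
are illustrative only.

```roff
.\" Hypothetical example, not the attached pdf-L.trf.
.\" Name a destination "two" and emit its descriptive text (-E).
.pdfhref M -D two -E This is two
.sp
.\" Link to "two" without supplying any descriptive text.  Per
.\" pdfmark.pdf, the link text should be the text given above
.\" ("This is two"), not the bare destination name ("two").
.pdfhref L -D two
```

Formatted with -Tpdf, the link produced by the second request would be
expected to read "This is two" rather than "two".
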
With Branden's code, instead of using the descriptive text it uses the name
of the destination. The two pdfs called pdf-L illustrate the problem.

Another bug results in entries in the array Branden loops over getting
overwritten in certain circumstances. This code illustrates the bug.

.ig
groff -Tpdf -dPDF.EXPORT=1 -z pdf-M.trf

Results:-

.ds pdf:bm1.tag one
.ds pdf:bm1.tag two

(The bm1.tag has been overwritten!!!)

But,

groff -Tpdf -dPDF.EXPORT=1 -dPRINTSTYLE=1 -z pdf-M.trf

Results:-

.ds pdf:look(one) Once upon...
.ds pdf:look(two) This is two
..
.pdfbookmark -T one 1 Once upon...
.pdfhref M -D two -E This is two

Setting "PRINTSTYLE=1" bypasses Branden's changes, because he uses the
original code if mom macros are used. I don't understand why it is
desirable to have two separate methods.

I have fixed all the problems listed above, plus a speed up of Branden's
loopy code. I have also included Keith's clever solution to the problem of
data polluting the pdf flags, which can cause errors about illegal
characters in identifier names if the optional "--" marker is not used.

[...]

I am enjoying my sabbatical away from groff, but I thought I should correct
these recent changes, since they would cause issues if released, but I
would be grateful if other major changes to pdf production could wait until
I am back, fighting fit to do any QA. Of course, if you find any problems
with the code, please let me know through this bug report.
=============================================================================

> [rearranged]
>
> > are you expecting me to write something at extremely short notice?
>
> No.
>
> > What are your intentions regarding filling this gap,
>
> Unclear. I can live with either of:
>
> 1.  leaving the API as documented as it is in gropdf(1) (and
>     "internally" in pdf.tmac) for this release; or

Given your difficulties understanding the code in pdf.tmac, I am surprised
you are espousing this as a source of documentation!!

> 2.
>     writing documentation myself of whatever parts of the API you
>     identify to me as essential and don't want to overhaul for groff
>     1.25.

It looks something like:-

Were documented in pdfmark.ms

.pdfview
.pdfbookmark
.pdfhref
.pdfinfo
.pdfnote
PDFHREF.VIEW
PDFHREF.VIEW.LEADING
PDFOUTLINE.FOLDLEVEL

Documented in pdf.tmac

Several user configurable strings (e.g. PDFHREF.COLOUR)

Not documented!!

.pdfclean

Documented in gropdf.1

.pdfpagename
.pdfpause
.pdfswitchtopage
.pdftransition
.pdfbackground
.pdfmarkrestart
.pdfmarksuspend

I will try to annotate pdfmark.pdf to indicate specific differences with
-Tpdf.

> It looks to me like there is much less ground to cover in pdf.tmac:
>
> $ wc -l tmac/pdf.tmac ~/groff-stable/share/groff/1.23.0/tmac/pdfmark.tmac
>   876 tmac/pdf.tmac
>  1953 /home/branden/groff-stable/share/groff/1.23.0/tmac/pdfmark.tmac

You'd be surprised; a lot of the code in pdfmark.tmac is doing stuff which
pdf.tmac leaves to gropdf, but the API is the same.

> I also think that you should not feel yourself bound by Keith's design
> choices.

The design of the pdfmark PostScript extension was done by Adobe, so I
won't be changing that.

> Regards,
> Branden