Hi Deri, Thanks for your reply--I think we are making some headway toward a meeting of the minds here.
At 2024-11-18T13:52:01+0000, Deri wrote: > On Sunday, 17 November 2024 22:17:08 GMT G. Branden Robinson wrote: > > 1. What are your acceptance criteria for a 1.24.0 release > > candidate? > > 2. Who do you trust besides yourself to address any of the 3 bugs > > listed above? > > > > How do you answer these? > > It looks like comprehension may be an issue here. I understood you > were asking who I trusted to make a good job of documenting the > information contained in pdfmark.pdf as it pertained to creating pdfs > with groff, as I had already answered that as PART of my response > elsewhere, But not only that. I also asked you who you trust to make modifications of the formatter to implement, for example, transformations of the argument to `device` requests and `\X` escape sequences to facilitate encoding of, among other things, PDF bookmark titles that might need to express non-ASCII character code points. (To be more precise, in #64484 we're arguing over which non-special-character escape sequences in such an argument--like `\0` and `\v`--should also undergo some sort of transformation, and how.) I furthermore asked you who you trust to make modifications to gropdf. I'll take this opportunity to clarify: I mean not just "gropdf.pl", but "pdfmom.pl" for that matter. I had assumed, for example, that you were not in any way perturbed by the following commit. commit b62f0605427caa156f23c81280f345496dd8a5cc Author: G. Branden Robinson <g.branden.robin...@gmail.com> Date: Thu Sep 26 17:26:16 2024 -0500 [pdfmom]: Handle errors and abnormal exits. * src/devices/gropdf/pdfmom.pl: Handle signaled and error exits from groff pipeline. Improve diagnostics. Store "basename" of executing command into `prog` scalar. Gather the wait status returned by every use of `system()`. (abort): New subroutine writes a fatal diagnostic message and exits with status 1. (autopsy): New subroutine decodes a POSIX wait status to determine the fate of an unlucky process, and returns it as a human-readable string scalar. (The interesting part is cribbed from Perl documentation.) Also discard trailing whitespace from lines. Also update Vim modeline to reflect indentation practice in evidence. I managed to introduce a SEGV into tbl (that didn't show up with simple inputs) in my working copy. To my dismay, the build just kept right on trucking after failing to build "groff-man-pages.pdf", because pdfmom ignored the return value of `system()`, concealing errors. I don't think we can accept that; we need to fail fast and fail hard. If I hadn't noticed the piteous "Segmentation fault" error scroll by, I might have overlooked my mistake for a long time. ...but now I am less certain. This is why I keep asking you (for at least the third time) whether you want an arrangement similar to the one Peter Schaffter enjoys with mom(7), where the groff developers don't make commits to "om.tmac" or the files in "momdoc" without prior arrangement with him. Despite having put it to you multiple times, you never answer this question, be it in the affirmative or the negative. Instead we just roll on with the status quo and occasionally you express shock or amazement at my ignorance, brazenness, or other character flaw. That response when you could, for instance, file a Savannah ticket (or just fire off on an email to this list) much like the following: "Hey Branden, In your commit abcdef012345 I think you might have regressed something. Here's my input: echo 'whatever' | pdfmom > foo.pdf Before that commit, I got XXX, but now I get YYY. Was this intentional?" I think that's a more efficient mechanism of addressing software defects than psychologizing or character profiling. (There is a place for those, but I don't think it's a good first choice among colleagues.) Maybe that's just me. > I have no doubt you would make a great job replacing pdfmark.ms, > combining stuff from gropdf.1 and comments in pdf.tmac. In fact I know > you are so thorough it is likely you are likely to find plenty of bugs > around the seldom used edges of pdf.tmac. > > Which to my mind is an effusive endorsement of your ability, and a > clear answer to the question of who do I trust to do a good job. ...with documentation. Your view of my ability to avoid introducing bugs is considerably less rosy, and your messages seem to me to take on an angry tone when I do so in a part of the groff code base that is especially important to you. Possibly I have a different model of defect genesis than you do. I think developers inevitably and pretty much constantly introduce bugs. Often, these are due to underspecification of the desired behavior, whether formally or simply in the mind of the programmer. To a large extent I don't think we can control the introduction of bugs. (That statement may shock people. Keep reading.) I think it's a common fallacy among journeyman programmers that, through sufficient discipline, like following every bullet point identified by Steve McConnell in _Code Complete_ [there are over 100 in total], and/or achievement of sufficient mastery of a programming language, complete with an ability to unpromptedly spout citations by section number to an applicable ISO/IEC/IEEE standards document, they will reach a point where they never introduce bugs. They're just that stone cold good, dawg/brah. While I would not claim that's impossible--people are capable of amazing things, sometimes at the cost of development in other areas--I think it's a bad premise from which to proceed. People will make mistakes. That's inevitable, and ultimately we can't control that. What we can control, to a significant degree, is the _timing_ of mistake discovery. I don't know why I'm the only person since Bertrand to have written automated tests for groff, but the fact that I've written 224 of them to date is because I absolutely do not trust myself not to make mistakes in programming. And I don't trust anyone else not to, either. But that doesn't mean I think they're incompetent programmers. I think they're human beings modifying complex systems that are almost never as rigidly specified, at any level all the way down to the hardware (wave hello to Spectre), as we typically pretend. How do we know that a change is good, and lacks defects? Not because it passes peer review, or because the author is known to be a domain expert, but because it passes empirical tests such that actual results match the expected results. So if I ever upset someone with a bug I've introduced, I'd ask them to take at least some of the time and energy they would spend excoriating me for doing it, and channel that into constructing an automated test exercising the code path in question. I don't as a rule push a set of commits that fail our test suite. The exceptions, as I believe I've said multiple times before, are (1) when I write a new regression test, to (try to) illustrate that the test exercises the erroneous code path that I think it does (and which then passes when the fix is committed), and (2) when making some sort of infrastructural changes that break the build and it is impossible, or would obfuscate illuminating details, to coalesce the changes into one commit. When either of these happens, I annotate the commit message appropriately. Case 1 happens with some frequency, and as yet no one has flown screeching into this mailing list with a condemnation of groff developers as incompetent idiots as a result. Knock wood. Case 2 has happened once, I think, in the last several thousand commits. Again, both cases were warned about in the commit message, and neither has yet drawn reproach. > If you are unable to do the job an alternative expedient would be be > to reinstate just pdfmark.ms with a big red notice at the top > explaining that its retention is an interim measure until something > more permanent can be produced. I feel capable. Just need to find some time. > > Since copy and pasting a URL would have been even more labor-saving, > > I infer an additional goal: to air to a broader audience your low > > assessment of my ability and/or my judgment. > > I'm afraid here my comprehension is struggling. It is not clear to me > how cut and pasting a URL involves less labour than cut and pasting an > adjacent block of text, which has the added advantage of being a > specific answer to the question of who I trusted to do the > documentation, and an answer to your assertion that I consider you of > low intellect, merely pasting a URL would have required reading the > entire post to find the relevent answers. See above. I acknowledge that you're willing to give up some plaudits to my abilities as a documentarian. And I appreciate those. But there is certainly something you find questionable about my abilities or judgment with respect to code changes, or we wouldn't keep getting into heated arguments over them. So, please, accept my invitation to tell me exactly how I suck as a programmer. If nothing else it'll go well with popcorn for the spectators. But I am prepared to learn from you. > I do note that you have opened more bugs on this issue, was this to > reduce the visibility of our discussion? (I don't believe that but you > did try and put a particular slant on my decision to include the > relevent part of a bug entry to answer a question you posed here). As you probably guessed, that was to mark them as to-do items prior to release (or RC, though I would argue that, consistently with groff's past release management history, not _all_ documentation changes need be made before the first RC). > Perhaps it would help to have the knots defined. > > 1) Should the release happen with a chunk of documentation on how to > produce pdfs with groff missing? > > Sub-divided into:- > > A) Reinstate a version of pdfmark.pdf, or > > B) A new document. > > My preference is (A) since, as a minimum, we are back where we were > before Keith's contributions were removed, at least in documenting > groff's ability to produce pdfs. (B) could be a target for a later > release. Is adding material to gropdf(1) covering macros defined in "pdf.tmac" that aren't already documented there satisfactory to achieve (A)? > 2) How should "\0" be treated by \X and .device? > > A) Treat the same as "\ " and "\~" i.e. replace with a text space. > > B) Keep as current code, removal and issue a warning (for \X) or no warning > and replace with "0" (for .device). > > My preference again is (A), but Branden is against this solution. Right, and I've explained why. Tenting my fingers, I'll sneakily hide that explanation behind the URL https://savannah.gnu.org/bugs/?64484#comment32 where it shall lie undiscovered for centuries. I don't endorse sub-option B.2 ["no warning and replace with "0" (for .device)]. That would, in my opinion, be wrong. The bug came about because I misunderstood exactly what I could expect to get back from the `get_copy()` function when it reads an escape sequence. Here's the change I have pending. diff --git a/src/roff/troff/input.cpp b/src/roff/troff/input.cpp index 8fafa8a0a..25dbd2968 100644 --- a/src/roff/troff/input.cpp +++ b/src/roff/troff/input.cpp @@ -6016,14 +6016,22 @@ static void device_request() // Null characters can correspond to node types like vmotion_node that // are unrepresentable in a device extension command, and got scrubbed // by `asciify`. - for (int prevc = c; (c != '\0') && (c != '\n') && (c != EOF); + for (; + (c != '\0') && (c != '\n') && (c != EOF); c = get_copy(0 /* nullptr */)) { - if (!('\\' == c) && (prevc != '\\')) + if (c != '\\') mac.append(c); else { int c1 = get_copy(0 /* nullptr */); - if (c1 != '[') - mac.append(c1); + if (c1 != '[') { + if (csprint(c1)) + warning (WARN_SYNTAX, "ignoring escaped '%1' in device" + " extension command request argument", char(c1)); + else + warning(WARN_SYNTAX, "ignoring escaped control character" + " (code %1) in device extension command request" + " argument", c1); + } else { // Does the input resemble a valid (bracket-form) special // character escape sequence? diff --git a/src/roff/troff/troff.1.man b/src/roff/troff/troff.1.man index 5a633658e..06c3a230b 100644 --- a/src/roff/troff/troff.1.man +++ b/src/roff/troff/troff.1.man @@ -755,6 +755,8 @@ .SH Warnings A self-contradictory hyphenation mode was requested; an empty or incomplete numeric expression was encountered; an operand to a numeric operator was missing; +a recognized but inapposite escape sequence was used +in a device extension command request; an attempt was made to define a recursive, empty, or nonsensical character class; What remains is to write a regression test for it. I solicit opinions, from anyone, on the foregoing. Here's the behavior of the `device` request after this change. $ printf '.device ps:exec A B\\~C\\ D\\0E\\|F\\^G\\h".5m"I\\v".5m"J\n' | ./build/test-groff -Zww | grep '^x X' troff:<standard input>:1: warning: ignoring escaped '0' in device extension command request argument troff:<standard input>:1: warning: ignoring escaped 'h' in device extension command request argument troff:<standard input>:1: warning: ignoring escaped 'v' in device extension command request argument x X ps:exec A B\~C\ DE\|F\^G".5m"I".5m"J `get_copy()` behaves in an interesting way. `\0` is treated like a motion (like `\h` and `\v`), but `\~`, `\space`, `\|`, or `|^` aren't. None of them are documented as "being interpreted even in copy mode",[1] and for things like strings and macro definitions, they don't seem to have ever caused problems. It is attempting to put them out as device extensions in "grout" that forces us to confront these questions. One might wonder, what did earlier versions of groff do, before I came along and ruined everything with my unthinking, spasmodic recklessness? $ printf 'foobar\n.device ps:exec A B\\~C\\ D\\0E\\|F\\^G\\h".5m"I\\v".5m"J\n' | ~/groff-1.22.3/bin/groff -Zww | grep '^x X' x X ps:exec A B\~C\ D\0E\|F\^G\h".5m"I\v".5m"J $ printf 'foobar\n.device ps:exec A B\\~C\\ D\\0E\\|F\\^G\\h".5m"I\\v".5m"J\n' | ~/groff-1.22.4/bin/groff -Zww | grep '^x X' x X ps:exec A B\~C\ D\0E\|F\^G\h".5m"I\v".5m"J $ printf 'foobar\n.device ps:exec A B\\~C\\ D\\0E\\|F\\^G\\h".5m"I\\v".5m"J\n' | ~/groff-stable/bin/groff -Zww | grep '^x X' x X ps:exec A B\~C\ D\0E\|F\^G\h".5m"I\v".5m"J So there _is_ a difference...older groffs put out `\0`, `\h`, and `\v` for an input `\0`, `\h`, and `\v`, and with my pending change, nothing will be put out for them, but you'll get a warning (if enabled). (Why the leading "foobar"? <https://savannah.gnu.org/bugs/?65977>.) But I can change the output difference back, like this: $ printf '.device ps:exec A B\\~C\\ D\\0E\\|F\\^G\\h".5m"I\\v".5m"J\n' | ./build/test-groff -Zww | grep '^x X' troff:<standard input>:1: warning: not interpreting escaped '0' in device extension command request argument troff:<standard input>:1: warning: not interpreting escaped 'h' in device extension command request argument troff:<standard input>:1: warning: not interpreting escaped 'v' in device extension command request argument x X ps:exec A B\~C\ D\0E\|F\^G\h".5m"I\v".5m"J The alternative diff is simple: diff --git a/src/roff/troff/input.cpp b/src/roff/troff/input.cpp index 8fafa8a0a..4743b36a7 100644 --- a/src/roff/troff/input.cpp +++ b/src/roff/troff/input.cpp @@ -6016,14 +6016,24 @@ static void device_request() // Null characters can correspond to node types like vmotion_node that // are unrepresentable in a device extension command, and got scrubbed // by `asciify`. - for (int prevc = c; (c != '\0') && (c != '\n') && (c != EOF); + for (; + (c != '\0') && (c != '\n') && (c != EOF); c = get_copy(0 /* nullptr */)) { - if (!('\\' == c) && (prevc != '\\')) + if (c != '\\') mac.append(c); else { int c1 = get_copy(0 /* nullptr */); - if (c1 != '[') + if (c1 != '[') { + mac.append(c); mac.append(c1); + if (csprint(c1)) + warning (WARN_SYNTAX, "not interpreting escaped '%1' in device" + " extension command request argument", char(c1)); + else + warning(WARN_SYNTAX, "not interpreting escaped control character" + " (code %1) in device extension command request" + " argument", c1); + } else { // Does the input resemble a valid (bracket-form) special // character escape sequence? Which do you prefer? > (pretty simple decisions, not Gordian at all). I agree. So relax--next time I screw up, *no matter where in the tree*, try using template I presented above: "Hey Branden, In your commit abcdef0123 I think you might have regressed something. Here's my input: echo 'whatever' | pdfmom > foo.pdf Before that commit, I got XXX, but now I get YYY. Was this intentional?" Regards, Branden [1] groff(7): Escape sequence short reference The escape sequences \", \#, \$, \*, \?, \a, \e, \n, \t, \g, \V, and \newline are interpreted even in copy mode.
signature.asc
Description: PGP signature