Follow-up Comment #1, bug #66686 (group groff):
> -------------------------------------------------------
> Date: Mon 20 Jan 2025 12:09:37 PM CST By: Dave <barx>
> Modifying an example from the info manual to change the delimiters for
> \w's argument to a + sign:
>
> $ echo "The length of the string 'abc' is \w+abc+u." | groff -Tascii \
>   | cat -s
> The length of the string 'abc' is 72u.

I feel that this report is a close sibling of Paul Eggert's regarding
the `|` character. I walked that one back because `|` is known in the
wild as a delimiter.

> This has worked for decades in groff, and works in Heirloom troff.

Yes, probably.

> It has no syntactical ambiguity,

This point is arguable, but it has _human_ ambiguity...

> because \w does not take a numeric expression.

Agreed, it does not. But it might _appear in one_. (I see multiple
examples in each of "s.tmac", "m.tmac", and "om.tmac".)

> In a bleeding-edge groff, it no longer works.
>
> $ echo "The length of the string 'abc' is \w+abc+u." \
>   | groff-latest -Tascii | cat -s
> troff:<standard input>:1: error: character '+' is not allowed as a delimiter
> The length of the string 'abc' is abc+u.
>
> And the manner in which it fails--transforming something that used to
> be a numeric expression into something that (probably) no longer
> is--sends things further off the rails when calculations are attempted
> with this value.

Exactly!

.nr desired-len 2n+\w+abc+u
.nr another-desired-len \w+abc+u+2n
.nr yet-another-desired-len \w+abc++2n
.nr double-desired-len \n[desired-len]+\w+abc++2n

At some point this gets too crazy. When Clark implemented GNU troff, he
clearly _intended_ to take some delimiters off the table.

> This change dates from the last 5 months. In that time frame:
>
> * Bug #66481 fixed a similar back-compatibility-breaking problem with
>   the \w escape and a | delimiter. But this couldn't have introduced
>   the current problem as a side effect: all it does is revert the fix
>   for bug #66009 fix, and the + delimiter worked fine in the pre-66009
>   days.

I'd link to https://git.savannah.gnu.org/cgit/groff.git/tree/?h=1.23.0
but Savannah's been undergoing a DDoS for the past several days and the
service is unavailable.

> * Bug #63142 disallowed newlines as delimiters, which was an
>   un(der)used GNU aberration from the start. But its commit message
>   says "As a bonus, check starting delimters [sic] for these escape
>   sequences (`\[obAZwX]`) for validity in general." So I suspect this
>   bug may have been introduced as part of that bonus.

I haven't bisected it down, but here's the commit that produced the
forbidden delimiters list in its current form.

commit 61141803ffbb9cc7ed8c27744bc8689c43a953d2
Author: G. Branden Robinson <g.branden.robin...@gmail.com>
Date:   Thu Mar 7 07:08:38 2024 -0600

    [troff]: Refactor (is_char_usable_as_delimiter).

    * src/roff/troff/input.cpp: Refactor. Pull delimiter character
    validator into its own function operating on a character, rather
    than on an object of the token class.
    (is_char_usable_as_delimiter): New function compares `char`
    parameter to list of valid delimiters.
    (token::is_usable_as_delimiter): Refactor to call the foregoing.

My intent with that commit was not to forbid more delimiters (hence my
use of the word "refactor"), and I'm not sure that it did. I'll need to
bisect.

However, I may be tempted to just "NEWS" this because I don't think many
people use expression operators as delimiters.

I was wrong about `|` in the wild, and indeed it's a pretty obscure
symbol as a numerical expression operator. Enough that it doesn't even
have a proper name. People have called it the "absolute motion
operator", which is misleading. I term it the "boundary-relative motion
operator", which is a mouthful but is, at least as far as I can tell,
accurate. In many contexts, like register assignments, it's an identity
operator and does no useful work.

The plus sign is more broadly understood to be used for math. Though in
its unary form, it too is an identity operator.
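
For readers who want to see the shape of the check under discussion, here
is a rough sketch of a per-character delimiter validator and how it
rejects `\w+abc+u` before any width is computed. It is written in Python
for brevity and is not groff's C++ code. The forbidden set (digits plus
the single-character expression operators `+-/*%<>=&:().`) is an
assumption for illustration; the actual list lives in
src/roff/troff/input.cpp and may differ. `split_width_escape` is a
hypothetical helper, and `|` is left usable here only because the `|`
restriction was walked back.

```python
# Illustrative sketch only; NOT groff's implementation.
# Assumption: digits and single-character numeric-expression operators
# are the characters rejected as delimiters.
FORBIDDEN_DELIMITERS = set("0123456789+-/*%<>=&:().")


def is_char_usable_as_delimiter(c):
    """Return True if the single character `c` may delimit an argument."""
    return len(c) == 1 and c not in FORBIDDEN_DELIMITERS


def split_width_escape(text):
    r"""Split text beginning with \w into (argument, remainder).

    Returns None when the delimiter is rejected or never closed, which
    mirrors how the rest of the input is then read as ordinary text.
    Hypothetical helper, named here for illustration.
    """
    if not text.startswith("\\w") or len(text) < 3:
        raise ValueError("expected text starting with a \\w escape")
    delim = text[2]
    if not is_char_usable_as_delimiter(delim):
        return None
    end = text.find(delim, 3)  # position of the closing delimiter
    if end == -1:
        return None
    return text[3:end], text[end + 1:]


print(split_width_escape("\\w'abc'u"))   # ('abc', 'u')
print(split_width_escape("\\w|abc|u"))   # ('abc', 'u')
print(split_width_escape("\\w+abc+u"))   # None
print(split_width_escape("\\w9stuff9"))  # None
```

Once the delimiter is rejected, nothing numeric is produced for the
surrounding expression, which is why `.nr` requests built on such a value
go wrong.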

Notice that Git HEAD's behavior also forbids this perversity:

$ ~/groff-HEAD/bin/groff -ww
.nr a \w9stuff9
troff: backtrace: file '<standard input>':1
troff:<standard input>:1: error: character '9' is not allowed as a delimiter
troff: backtrace: file '<standard input>':1
troff:<standard input>:1: warning: expected numeric expression, got character 's'

...whereas groff 1.23.0 did not.

$ ~/groff-stable/bin/groff -ww
.nr a \w9stuff9
.tm \na
18080

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?66686>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/