Hi onf,

At 2024-11-03T03:25:01+0100, onf wrote:
> On Sat Nov 2, 2024 at 4:15 AM CET, Dave Kemper wrote:
> > Parsing strings that contain escapes is something that an open groff
> > bug report (http://savannah.gnu.org/bugs/?62264) seeks to make more
> > consistent.  Does the mechanism proposed in that report sound like
> > it would solve the problem you're facing?  (By default, the savannah
> > bug tracker displays comments from newest to oldest, but the thread
> > will be easier to follow if you click the "Reverse comment order"
> > button in the "Discussion" section.)
>
> I am glad to hear it's a known issue/limitation that's being worked
> on.  Adding a string iterator would fix this, although it would make
> my code significantly more complex as I would have to compare the
> strings character by character (my has-prefix macro can handle
> arbitrarily long prefixes, and I also have a has-suffix macro which
> would be even worse).

One reason to have the string(/macro/diversion) iterator request is that
as soon as we do, we can use it to construct a "string library" macro
package.  "string.tmac" seems like a likely name.

What I envision is removing several of the string-handling requests from
GNU troff and replacing them with macros in "string.tmac".

.length
.chop
.substring
.stringdown
.stringup

"string.tmac" would also be a useful place to experiment with things
like:

.strchr
.strrchr
.index
.rindex
.slice (return a substring using Python-esque indexing)

And maybe AWK-like replacement macros:

.sub
.gsub

The nice thing about having these in a macro file is that doing so pours
somewhat less cement over them than having them in the formatter does.
We could, if we wished, apply semantic versioning to our "string
library".

> A problem with this solution is that it's incomplete.  It addresses a
> particular issue arising from troff's usage of macro substitution,
> but doesn't solve the others.  For instance, I would still run into
> issues if I tried to compare a literal ' against anything and
> delimited the comparands by the same character, which can happen with
> the proposed iterator mechanism:
>   .ie '\\*[ch]'"' \" ...

We can't verify or refute that claim until the code is in place, but I
expect that you are wrong about this, unless you run the formatter in
AT&T compatibility mode (in which case the syntax `\*[ch]` won't work
anyway).

info '(groff) Compatibility Mode':
    Normally, GNU 'troff' preserves the interpolation depth in delimited
    arguments, but not in compatibility mode.

         .ds xx '
         \w'abc\*(xxdef'
             => 168 (normal mode on a terminal device)
             => 72def' (compatibility mode on a terminal device)

> $ groff -b -ww -z
> .ds str "'\"
> .ie '\*[str]'\'' .tm groff: single quote
> .el .tm groff: else
> groff: else

This fails because `\` does not escape the apostrophe the way you think
it does.  `\'` is a special character escape sequence.

groff(7):
    Escape sequence short reference
    ...
    \'  is a synonym for \[aa], the acute accent special character.

You'll need to do this a slightly different way when attempting to match
a character that happens to be the same as the delimiter in a formatted
output comparison.  One layer of indirection will do.

$ groff <<EOF
.ds needle '
.ds haystack '
.ie '\*[needle]'\*[haystack]' .tm it matches
.el .tm no match
EOF
it matches

Regards,
Branden