Hi onf,

At 2024-11-03T03:25:01+0100, onf wrote:
> On Sat Nov 2, 2024 at 4:15 AM CET, Dave Kemper wrote:
> > Parsing strings that contain escapes is something that an open groff
> > bug report (http://savannah.gnu.org/bugs/?62264) seeks to make more
> > consistent.  Does the mechanism proposed in that report sound like
> > it would solve the problem you're facing?  (By default, the savannah
> > bug tracker displays comments from newest to oldest, but the thread
> > will be easier to follow if you click the "Reverse comment order"
> > button in the "Discussion" section.)
> 
> I am glad to hear it's a known issue/limitation that's being worked
> on.  Adding a string iterator would fix this, although it would make
> my code significantly more complex as I would have to compare the
> strings character by character (my has-prefix macro can handle
> arbitrarily long prefixes, and I also have a has-suffix macro which
> would be even worse).

One reason to have the string(/macro/diversion) iterator request is that
as soon as we do, we can use it to construct a "string library" macro
package.  "string.tmac" seems like a likely name.

What I envision is removing several of the string-handling requests from
GNU troff and replacing them with macros in "string.tmac".

        .length
        .chop
        .substring
        .stringdown
        .stringup

"string.tmac" would also be a useful place to experiment with things
like:

        .strchr
        .strrchr
        .index
        .rindex
        .slice (return a substring using Python-esque indexing)

And maybe AWK-like replacement macros:

        .sub
        .gsub

The nice thing about having these in a macro file is that doing so pours
somewhat less cement over them than having them in the formatter does.
We could, if we wished, apply semantic versioning to our "string
library".
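
As a rough illustration of how such a macro could be prototyped even
before an iterator request exists, here is a minimal sketch of `.slice`
layered on the existing `substring` request.  The macro name `slice`,
the register `slice*end`, and the shell wrapper `slice_demo` are all
hypothetical.  The sketch assumes a non-empty slice and an end index
other than 0: `substring` takes an inclusive end index, so the macro
simply subtracts 1 from an exclusive one.

```shell
# Hypothetical prototype; none of these names exist in groff today.
slice_demo() {
    groff -z 2>&1 <<'EOF'
.\" slice STRING START END: keep characters START up to, but not
.\" including, END.  Negative indices count from the end of the string.
.de slice
.  nr slice*end (\\$3 - 1)
.  substring \\$1 \\$2 \\n[slice*end]
..
.ds s abcdefgh
.slice s 1 5
.tm \*[s]
EOF
}
```

Calling `slice_demo` should report "bcde", the same characters Python
selects with "abcdefgh"[1:5].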

> A problem with this solution is that it's incomplete. It addresses a
> particular issue arising from troff's usage of macro substitution,
> but doesn't solve the others. For instance, I would still run into
> issues if I tried to compare a literal ' against anything and
> delimited the comparands by the same character, which can happen with
> the proposed iterator mechanism:
>   .ie '\\*[ch]'"' \" ...

We can't verify or refute that claim until the code is in place, but I
expect that you are wrong about this, unless you run the formatter in
AT&T compatibility mode (in which case the syntax `\*[ch]` won't work
anyway).

info '(groff) Compatibility Mode':

     Normally, GNU 'troff' preserves the interpolation depth in
     delimited arguments, but not in compatibility mode.

       .ds xx '
       \w'abc\*(xxdef'
           => 168 (normal mode on a terminal device)
           => 72def' (compatibility mode on a terminal device)

>   $ groff -b -ww -z
>   .ds str "'\"
>   .ie '\*[str]'\'' .tm groff: single quote
>   .el .tm groff: else
>   groff: else

This fails because `\` does not escape the apostrophe the way you think
it does.  `\'` is a special character escape sequence.

groff(7):

Escape sequence short reference
...
       \'     is a synonym for \[aa], the acute accent special
              character.
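
One way to check this, offered as a sketch rather than anything taken
from the report, is to compare `\'` against `\[aa]` in a formatted
output comparison.  The shell function name `acute_demo` is made up for
illustration, and the expectation that the two comparands match rests
only on the synonym relationship quoted above.

```shell
# Hypothetical demonstration wrapper; the name is illustrative only.
acute_demo() {
    groff -z 2>&1 <<'EOF'
.\" The left comparand is the acute accent, not an escaped apostrophe.
.ie '\''\[aa]' .tm same glyph
.el .tm different
EOF
}
```

If the synonym holds as documented, `acute_demo` takes the `.ie` branch.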

You'll need to do this a slightly different way when attempting to match
a character that happens to be the same as the delimiter in a formatted
output comparison.  One layer of indirection will do.

$ groff <<EOF
.ds needle '
.ds haystack '
.ie '\*[needle]'\*[haystack]' .tm it matches
.el .tm no match
EOF
it matches

Regards,
Branden
