Update of bug #62264 (project groff):

Summary: string iteration handles escape sequences inconsistently => string iteration handles escape sequences inconsistently (want `for` request)
_______________________________________________________

Follow-up Comment #1:

Here are some notes from a private email I sent to Alex Colomar and Deri James last December.

That bigger fish arises from the observation that seemingly everywhere we prepare groff strings/macros/diversions for handoff to a device control escape sequence, we end up with some new variation on bespoke logic to tediously walk a string (more or less). These sequences of code are long and surely not easy for the novice to understand.

So one of the things I want to look into for groff 1.24 is giving the language an actual string iterator--a `for` request. And a couple of new conditional expression operators to perform tests on the items returned by that iterator.

Some background for this is in <https://savannah.gnu.org/bugs/?62264> ("string iteration handles escape sequences inconsistently").

Here's the idea.

.di div
Here's my \f[CI]crazy\f[] diversion!
.di
.
.ds div*scrubbed \" empty
.
.for ch \*[div] \{\
.  if !N \*[ch] .as div*scrubbed \*[ch]
.  if '\*[ch]'@' .break
.  if e .continue
.  \" some crazy stuff you do only on odd pages
.\}

The above is really contrived, but the idea is to communicate as much of the semantics as I think we could want.

1. No messing with `length` or `substring` operations.

2. Address #62264. Document that string iteration can hand you back any of (A) a Basic Latin character; (B) a special character; or (C) a "node" (like a type face or size changing operation, but the details aren't important as it's "formatty" stuff, not "plain text" stuff).

3. A new 'N' conditional expression operator tests string contents for node identity. I don't know whether this should test just the first element of the string or scan the whole thing. In the example above, it doesn't matter--`for` guarantees that the `ch` string is a singleton. Giving the *roff programmer a way to cope with this is the correct way to solve this old chestnut.

   can't transparently output node at top level

4. Need to decide whether the string `ch` is left defined after the `for` loop exits.

5. You should be able to `break` or `continue` a `for` loop just as you can a `while` loop.

6. A lingering issue is our other old friend.

   can't translate character code 233 to special character ''e' in transparent throughput

   This is <https://savannah.gnu.org/bugs/?63074>.

Here's another use case, way less hypothetical. It's some Deri magic.

.\" Remove '\%' from string used as bookmark destination
.de an*cln
.  ds \\$1
.  als an*cln:res \\$1
.  shift
.  ds an*cln:res \\$*\"
.  ds an*cln:char \\*[an*cln:res]
.  substring an*cln:char 0 0
.  if '\\*[an*cln:char]'\%' .substring an*cln:res 1
.  rm an*cln:char
..

Here's how you'd do it with `for`.

.ds output \" empty
.
.for ch \*[input] \{\
.  if !'\*[ch]'\%' .as output \*[ch]
.\}

As syntactic sugar goes, I'd say that enables considerable slimming.

This would probably also compel us to clear up our documentation (and our thinking) a lot with respect to what's really a "character" in groff. \- is. \% is. \f isn't (if the remainder is well-formed, it becomes a node). What about the "leader character" (Ctrl+A)? Or the uninterpreted leader character \a?

Many of these things have the word "character" in their names but, for example, you can't test them with ".if c". Consider:

.ds string \&\%
.ds char \*[string]
.substring char 0 0
.if c \*[char] .tm it is a character
troff:<standard input>:6: error: expected ordinary or special character, got an escaped '&'
.ds char \*[string]
.substring char 1 1
.if c \*[char] .tm it is a character
troff:<standard input>:9: error: expected ordinary or special character, got an escaped '%'

Keith Marshall wrote an entire macro file to deal with this sort of thing.[1]

Maybe consideration of these issues is affecting my priorities and making me want to get over the release hump so I can work on them.
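
Since the `for` request is only a proposal, its semantics can't be demonstrated in groff itself yet. As a rough way to experiment with the item classification in point 2, here is a small Python model. It is a sketch under loud assumptions, not groff's parser: `Item`, `iter_items`, and `_arg_end` are names made up for illustration; only `\[name]` and `\(xx` are treated as special characters; only `\f` is treated as producing a node; and every other escape (such as `\%`, `\-`, `\&`) is lumped into a separate "escape" kind, because how those should be classified is exactly what remains open.

```python
from typing import Iterator, NamedTuple


class Item(NamedTuple):
    """One element handed back by the (hypothetical) string iterator."""
    kind: str  # "char", "special", "node", or "escape" (illustrative labels)
    text: str  # the source text making up this element


def _arg_end(s: str, j: int) -> int:
    """Return the index just past an escape argument starting at s[j].

    Handles the three roff argument shapes: [long], (xx, and x.
    """
    if j >= len(s):
        return j
    if s[j] == "[":
        close = s.find("]", j + 1)
        return len(s) if close == -1 else close + 1
    if s[j] == "(":
        return min(j + 3, len(s))
    return j + 1


def iter_items(s: str) -> Iterator[Item]:
    """Yield one Item per logical element of a roff-like string."""
    i = 0
    while i < len(s):
        if s[i] != "\\" or i + 1 == len(s):
            # Ordinary character (or a lone trailing backslash).
            yield Item("char", s[i])
            i += 1
            continue
        name = s[i + 1]
        if name in "[(":
            # \[name] or \(xx: a special character.
            end, kind = _arg_end(s, i + 1), "special"
        elif name == "f":
            # \fX, \f(XX, \f[XXX]: modeled as a node (assumption).
            end, kind = _arg_end(s, i + 2), "node"
        else:
            # Any other two-character escape; classification left open.
            end, kind = i + 2, "escape"
        yield Item(kind, s[i:end])
        i = end
```

With this model, the first example's scrubbing step corresponds to joining the `text` of every item whose `kind` is not "node", and the bookmark-cleaning macro corresponds to joining every item whose `text` is not `\%`. Neither needs any index arithmetic, which is the point of item 1.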
_______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?62264>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/