My mistake. I found the word "entity" in groff.7, not groff_char.7. Nevertheless, thanks for the revised History section of groff_char.7. It's a far more definitive account than I could give from my own memory.

Doug

On Sat, Apr 1, 2023 at 9:22 PM G. Branden Robinson <g.branden.robin...@gmail.com> wrote:
>
> Hi Doug,
>
> At 2023-04-01T19:45:19-0400, Douglas McIlroy wrote:
> > I went to see what this proposal meant and ran into undefined jargon
> > in groff_char.7.
>
> This, and phrases like "in the actual version", are regrettable defects
> in the groff 1.22.4 version of this man page.
>
> The one in the groff 1.23.0.rc2 and .rc3 release candidates does not
> have them. This page is one that I've heavily revised. I'm attaching a
> copy for your consideration. I'd particularly welcome your comments on
> the new "History" section.
>
> > Yes, info groff probably tells me more than I want to know. Still, I
> > expect the man page to be terse, but intelligible.
>
> Fair. I hope the intelligibility of the present form is improved.
>
> > What's an "entity"?
>
> Suggestive of conceptual fuzziness on the part of the writer, I would
> propose. But I can't blame them; the difficulty of comprehending
> groff's flexible and complex character to glyph transformation process
> is the main reason I have not yet revised that part of our Texinfo
> manual.
>
> > Fortunately, Dave Kemper's post shed light on this question.
> >
> > The first use of .char that came to mind was
> >     .char \[ntilde] \o'n~'
> > which would collide badly with the following ancient trick for
> > unbreakable, unpaddable space. (Ignore the question of whether the
> > tilde at hand is usable as a diacritical.)
> >     .tr ~
> >     a~b~c
>
> You may be one of a dwindling number of people for whom that ancient
> trick comes to mind. :) But we do continue to support it, and I see no
> reason to withdraw it.
>
> > This, I guess, is typical of the motivation for the change.
>
> I was spurred into this by noticing a problem last July with what I
> think was a historical troff document. I can't lay my hands on it now,
> but the following short example suggests the issue.
>
>     $ cat EXPERIMENTS/tr-in-env.roff
>     .nf
>     .tr ab
>     bab
>     .ev 1
>     bab
>     .br
>     .ev
>     bab
>     .pl \n(nlu
>
> This produces 3 lines of "bbb".
>
> The problem I observed, as best I can recall, was that a document
> temporarily used `tr` to make input more convenient.
>
> The trouble was, the same character they were translating turned up in
> one of their page headers or footers.
>
> So, depending on how the document got modified and the resulting
> placement of the `tr`-ed material, the headers/footers might get
> corrupted or might not.
>
> A lengthier, but contrived, example of this is at
> <https://savannah.gnu.org/bugs/?62691>.
>
> I suppose there are workarounds one could coach the user to undertake in
> such a situation, but once I got to thinking about it, it struck me that
> there should be a cleaner division of responsibility between `tr` and
> `char`.
>
> My suggestion is twofold: (1) that `tr` should be used for permuting
> what we can term groff's internal character set; meaning the 94
> printable characters of ASCII/Basic Latin, and whatever special
> characters happen to be defined; and (2) `char` and `rchar` are for
> adding and removing members of the set of special characters. (You can
> try to `rchar` an ordinary Basic Latin character; it will silently fail.
> I mean to make that no longer silent.[1])
>
> It is necessary to consider the impact of these processes on diversions.
> I don't presently think my proposal is disruptive to the status quo in
> that respect. When a diversion is populated, special character
> definitions are already resolved, and just as with string
> interpolations, using the `unformat` request does not recover their
> original forms.
>
> Illustration (with groff 1.22.4):
>
>     $ cat EXPERIMENTS/char-in-a-diversion.groff
>     .nf
>     .char \[zz] FNORD
>     .di XX
>     You didn't \[zz] this.
>     .di
>     Hello, world.
>     diverted XX: \c
>     .XX
>     .unformat XX
>     unformatted XX: \*[XX]
>     .pl \n[nl]u
>     $ nroff -Tascii EXPERIMENTS/char-in-a-diversion.groff
>     Hello, world.
>     diverted XX: You didn't FNORD this.
>     unformatted XX: You didn't FNORD this.
>
>     $
>
> > Suppose the change isn't made? What does .char do for you that .ds
> > doesn't? Certainly nothing essential in the example above. However, it
> > can avoid the ugliness of string invocations.
>
> I don't remember where I saw this trick, but you can use a
> `char`-defined object as a margin character, and I suppose just about
> anywhere else the language syntax is accepting of an atomic character.
> The utility of this comes in when realizing that someone might
> reasonably want to set a margin character in a particular typeface
> (maybe it's a dingbat--most of these don't have special character names)
> and/or in a certain color.
>
> Recasting the language of the 1.22.4 Texinfo manual, `char` is described
> as doing this to the RHS of its definition: "[the RHS] is processed in a
> temporary environment and the result is wrapped up into a single object.
> Compatibility mode is turned off and the escape character is set to '\'
> while [it] is being processed. Any emboldening, constant spacing or
> track kerning is applied to this object rather than to individual
> characters in [it]."
>
> > I regard the potential benefit mentioned in the last sentence as
> > unpersuasive, but the potential catastrophe of the initial example as
> > tilting the scales toward the proposal.
>
> I think it would help distinguish and orthogonalize the language if
> `char` character definitions remained global to formatter state, and
> translations/transliterations with `tr` became properties of the
> environment.
>
> I suppose roff veterans are used to it, but my mind twists even when
> looking at my own example in Savannah #62691 (linked above).
>
> Namely,
>
>     .tr @--@
>
> is not a no-op! In fact, it works a lot like file descriptor
> redirections in the shell.
>
>     foo 2>&1 >/dev/null | grep error
>
> Each left-hand member of a `tr` translation pair identifies a place in
> the translation "from" space, and each right-hand member a place in the
> "to" space. The transform is then done atomically. On occasions when I
> want to throw standard output away but grep the standard error
> stream, I haltingly think through this same issue.
>
> Regards,
> Branden
>
> [1] https://savannah.gnu.org/bugs/?63985
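
For anyone who wants to try the redirection analogy from the quoted message, here is a minimal shell sketch. The function `foo` is a hypothetical stand-in that writes one line to each stream. The `sed` line is only an analogy for the simultaneous mapping that `.tr @--@` performs; it says nothing about how groff implements `tr`.

```shell
#!/bin/sh
# Hypothetical stand-in for "foo": one line on stdout, one on stderr.
foo() {
    echo "normal output"
    echo "error: something failed" >&2
}

# 2>&1 first points stderr at the pipe; stdout is then sent to
# /dev/null, so only the stderr line reaches grep.
stderr_only=$(foo 2>&1 >/dev/null | grep error)

# Reversed order: stdout goes to /dev/null first, and stderr then
# follows it there, so grep receives nothing.
nothing=$(foo >/dev/null 2>&1 | grep error || true)

# sed's y command maps every character in one pass, so '@' and '-'
# are swapped rather than the two mappings cancelling out.
swapped=$(printf 'a@b-c\n' | sed 'y/@-/-@/')

printf '%s\n' "$stderr_only" "[$nothing]" "$swapped"
```

The order of the two redirections decides which stream reaches `grep`, much as each half of a `tr` pair is resolved against the untranslated character set rather than against the result of the previous pair.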