Hi,

James K. Lowden wrote on Sat, Jun 04, 2022 at 03:23:36PM -0400:
> On Thu, 5 May 2022 03:40:27 -0500 Dave Kemper wrote:
>> To cite the example that originally launched this thread, the old
>> docs termed the \& a "zero width space," which Branden has changed to
>> the "non-printing input break."  It may not roll off the tongue as
>> easily, but it's more precise and descriptive about what the escape
>> does: it affects how input is parsed, not how output is rendered.
>> It's not kin to other space escapes like \~ or \|, as the original
>> term implied.

> I disagree.  That's not what it does.
>
> The zero width space does not "affect how input is parsed".  It's
> parsed like all other input -- indeed, exactly like \| and \~.  Its only
> distinction from them is on output.
>
> To insert \& at the start of a line does not affect how the input is
> parsed.  *Any* character before a leading dot prevents the dot from
> being interpreted as a request.  The salient difference is that \&
> introduces nothing into the output stream.  Hence, "zero width".
>
> To me, the term "non-printing input break" verges on nonsense because
> it suggests there might be such a thing as "printing input".  There is
> not: input is processed and rendered as output.  Input is no
> more printed than it is written to the keyboard.
>
> I humbly suggest on this point we return to status quo ante.  A "zero
> width space" is perfectly clear terminology.  The fact that \& is used
> occasionally to prevent non-requests from being interpreted as requests
> is incidental, easily explained and understood.  Does anyone remember
> being confused by it?  I don't.

James' argument makes sense to me.

On top of that, the groff documentation uses the term "break" very
consistently, defining it as starting a new output line even though
the current output line is not yet full.  In a few places, the term
"line break" is used as a synonym of "break", which, in my humble
opinion, is accurate and does not cause confusion.

The escape sequence "\:" is called the "non-printing break point" and
the "'" control character the "no-break control character"; both
names agree well with the way "break" is used.  Occasionally, a break
is qualified as a "page break" or "column break", which is fine in so
far as every page break and every column break implies an (output
line) break in the usual sense, too.

The groff documentation also uses the term "non-breaking" =
"unbreakable" consistently, meaning that a break will not be inserted
at the place in question.

In very few cases, the groff documentation uses the term "break the
input line" to mean "start a new input line".  There is a small risk
that this might cause confusion with "breaks" in the normal sense, but
i see no general way to avoid that risk.  In any case, all such places
i saw clearly use the qualifier "input", so careful readers should not
get confused.

So, to summarize, groff documentation consistently uses the word
"break" for "line break", almost always in the sense of output line
break and in a few clearly qualified cases for "input line break".

From this perspective, it is indeed unfortunate terminology to call
\& a "non-printing input break" because it has no relation whatsoever
to breaking the input line, nor to a "break" in the general sense,
i.e. breaking the output line.

I do realize the change was committed on Sat Aug 15 22:08:01 2020,
nearly two years ago, but when issues aren't noticed soon, finding
them later is still better than never.

More constructively, how *should* it be called?

In all ways i'm aware of, it behaves exactly like a horizontal spacing
escape sequence (except that its width is zero) and exactly like a
character (except that it prints an empty glyph of width zero).
So both "zero-width space" and "non-printing zero-width character"
would seem accurate to me.  The former has the advantage of being
shorter and agreeing with traditional terminology.

It's slightly unfortunate that Unicode uses the character name
"ZERO WIDTH SPACE" for what groff (more appropriately) calls the
"non-printing break point" (\:), but i would consider consistency
within the roff domain more important than using the same terms as
Unicode.  Consequently, i'm *not* advocating calling \& a "zero-width
non-joiner" or a "zero width no-break space" even though both would
be more precise if we were aiming for Unicode-compatible terminology.
Then again, if people worry a lot about U+200B, then calling it a
"zero width no-break space" is still much better than calling it
some kind of a "break".

The argument "it is not a space because it doesn't move and it is
not a character because it doesn't print anything" reminds me a bit
of the argument "0 is not a number because there is nothing there",
yet mathematicians certainly call it a number all the same because
zero can be used in the same ways as a number.  The reason why \&
works for the escaping purposes it is used for is quite similar:
it is treated as if it were a space or character except that it
neither prints nor moves.  In all these cases, you can do the same
escaping with some other spacing escape sequence or with some other
character if you don't object to moving or printing a bit.

So, i'd say, let's call it a day (err, a space).  It certainly is
*not* a break in any of the senses familiar from the groff
documentation.

Yours,
  Ingo

P.S.
Note that i'm not saying Branden is making our documentation worse,
quite to the contrary.  This looks like an unusual slip to me.