At 2024-03-22T17:06:40-0700, Russ Allbery wrote:
> "G. Branden Robinson" <g.branden.robin...@gmail.com> writes:
>
> > That's a good argument against grotty(1) emitting overstriking
> > sequences, at least by default, and yet that the people swiftest to
> > anger on this subject argue _for_ it.
>
> I'm not fully following this argument, but (assuming I've not
> completely lost the train of conversation), it may be relevant here
> that some years ago (it was in 2000, which surely was only five or six
> years ago) a contributor went to the trouble of writing
> Pod::Text::Overstrike to format POD output with backspacing with
> overstrike or underscores. At the time, a version that used termcap
> already existed (and still does).

You prompted me to take a look around at podlators Git. I didn't have
any idea this existed. Neat!

> The stated reason was that the output was device-independent, unlike
> output that embeds formatting codes derived from device-specific
> termcap entries,

Okay...by this time groff had for about 10 years been producing
device-independent _terminal_ output from troff(1). On the other hand,
that is its own peculiar little language. Maybe the author just didn't
want to deal with *roff, or didn't want to count on GNU troff being
available.

(Kernighan didn't completely unify terminals under the
device-independent troff scheme presented in CSTR #54--nevertheless its
"driving tables" for terminal devices bore a startling resemblance to
"DESC" files for typesetting devices.)

> and they really liked the bold and underlining rather than the plain
> text or *ad hoc* markup produced by Pod::Text.

Part of me wants to yell "then why not just use nroff, for crying out
loud", but part of me understands the fun of finding one's own way.

> I know that to a first approximation all the world is now some
> variation of an imaginary VT100 terminal emulator, and thus one can
> usually blindly use SGR escape sequences and expect them to work in
> much the way that one can assume all programs only run on VMS.

I think that's a little unfair. We can trace the history of these
escape sequences back to ANSI X3.64, which was later succeeded by
ECMA-48 and (equivalently, as far as I know) ISO 6429 and JIS X 0211.
These standards have been around approximately as long as Unix has been
something you were likely to run into at your university or workplace.

I would never advocate _blind_ usage of SGR or other ECMA-48 escape
sequences. For SGR in particular, even termcap has a capability code:
"sa". Programs, including GNU Bash and those in GNU Coreutils,
_should_, in my opinion, be using termcap or (preferably) terminfo to
look before they leap.

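As a minimal sketch of what "looking before leaping" can mean, the
following Python fragment asks the terminfo database for the terminal's
own bold and reset strings instead of hard-coding ECMA-48 sequences. It
assumes Python's standard curses module as the terminfo binding; the
helper names terminfo_lookup and embolden are hypothetical, and the
simpler "bold" and "sgr0" capabilities stand in for the parameterized
"sa"/"sgr" capability mentioned above.

```python
import curses


def terminfo_lookup():
    """Return a capability lookup function for $TERM, or None.

    None means no usable terminfo entry could be loaded (TERM unset or
    unknown, or standard output has no file descriptor).
    """
    try:
        curses.setupterm()
    except (curses.error, OSError, ValueError):
        return None
    return curses.tigetstr


def embolden(text, lookup):
    """Wrap text in the terminal's own bold and reset strings.

    `lookup` maps a terminfo capability name to bytes, or to None when
    the terminal lacks it.  If either capability is missing, the text
    is returned unstyled rather than decorated with guessed escapes.
    """
    if lookup is None:
        return text
    bold, reset = lookup("bold"), lookup("sgr0")
    if not bold or not reset:
        return text
    return bold.decode("latin-1") + text + reset.decode("latin-1")
```

Calling embolden("NAME", terminfo_lookup()) then degrades to plain text
on a terminal whose description lacks either capability. A real program
would more likely emit the strings with putp() or tputs(), which also
handle any padding a terminfo entry specifies.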
But the cryptic form of ECMA-48 escape sequences has proven seductive
to junior hackers (in mentality, if not always chronological age) far
and wide. As soon as they can make the terminal jump with
"printf '\e[xx;yy\a'" they get completely carried away. Often, the next
day, the same person will, in a code review, confidently and with no
sense of irony, accuse your work of a "layering violation". Really,
there isn't a hand large enough to slap these people with.

But that's not the fault of ECMA-48, which has even had the virtue of
being freely available on the Web for many years. We cannot say as
much for many ANSI or ISO/IEC standards.

> But I have occasionally had reports that Pod::Text::Overstrike is a
> better option for (some) Windows users because apparently their pager
> handled the overstriking but termcap (via the Perl Term::Cap module)
> wasn't available.

I'm no MS-DOS/Windows expert, but my understanding is that you couldn't
count on support for ECMA-48 escape sequences at the DOS prompt (or
equivalently in CMD.EXE on NT-descended Windows) because the console
driver didn't recognize them. However, if the user told CONFIG.SYS to
load ANSI.SYS, it would, because that module interposed itself before
the BIOS call that talked to the display, and interpreted them, driving
the CGA/EGA/VGA hardware appropriately.

God, I feel dirty talking about this crap. I'm sorry I remember even
that much.

I have gathered, by reading bug fora and similar while trawling the
Internet for accounts of trouble with groff that people are too lazy to
actually report to us, that Windows 10 or 11 has a console
driver/terminal emulator that does "better" with ECMA-48 support. I
haven't heard even a rumor of anything usefully quantitative, like a
table of its support for standardized escape sequences in comparison
with, say, xterm, or even the Linux kernel's somewhat wobbly virtual
console device. But, supposedly, things are "better".

> I have no idea how dated this information is, having not used Windows
> myself in several decades, but I always found it interesting. I've
> kept the module working all these years since it's not much additional
> effort.

No crime in that. I keep a lot of ancient groff stuff in service too.

At 2024-03-22T21:08:39-0500, Dan Plassche wrote:
> Overstrikes are more easily filtered and transformed for other output
> formats than levels of nested escape codes that are terminal specific.

...yes, except when they're inherently ambiguous.

grotty(1):
    ...
    grotty overstrikes, representing a bold character c with the
    sequence “c BACKSPACE c”, an italic character c with the sequence
    “_ BACKSPACE c”, and bold italics with “_ BACKSPACE c BACKSPACE c”.
    This rendering is inherently ambiguous when the character c is
    itself the underscore.

A bold-italic font was a pretty exotic thing to Bell Labs troff--so
much so that it didn't exist. CSTR #54 [1976] documents four fonts,
roman, italic, bold, and "special": R, I, B, S. (Hmm, now I'm hungry.)
I figure this explains why the ambiguity never troubled them. "BI"
fonts can, it seems, largely be traced to the impact of PostScript and
its base 14 fonts, which had 3 families in 4 styles and two symbol
fonts.

> Enscript from Adobe, and the more featureful GNU replacement, are good
> examples of tools designed to work with nroff or other daisywheel/line
> printer output using overstrikes. The preformatted line and page
> layout are fully retained with all overstrikes rendered properly and
> the ability to use any font (converted) in the postscript output,
> which is awesome for printing historical documents designed for nroff.
> You can also easily pass custom roff overstrikes to simulate combined
> typewriter characters beyond bold and underline.

Yes. And accounts I've heard indicate that this was done even on video
terminals. Not the character-cell sort, but the storage-tube displays
of the Tektronix 4014 and similar.
Seventh Edition Unix shipped a tc(1) command to help you preview your
troff output with that device before you spent precious departmental
money sending it to the actual typesetter.

> I have no major objection to using escape sequences and agree they
> open some additional possibilities for functionality in modern
> terminal emulators. However, I think that most people using
> overstrikes have less as the pager in raw mode where underlines and
> bold display correctly for manual pages.

Yes, and this feature of less(1) has so badly misled people that they
think it's what nroff should have been producing all along.

But if you used nroff at the Labs in 1978, on your Teletype terminal,
you didn't have to pipe nroff output through anything. It produced
correct markup appropriate to your terminal directly. That's what it
should have _kept_ doing, but for the damnable corporate and human
factors I explored earlier in the thread.

You shouldn't _have_ to page a program's output to _get_ correct output.

If someone's not convinced yet, or is simply entertained by seeing me
fulminate about this, here you go:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=312935

> It's a shame that early pc vga consoles did not display underlines or
> italics properly!

Very much for the former. A lack of proper italics, at the resolutions
we were using for text mode fonts in those days, I find more excusable.
Even now, in an Xft-leveraging, client-side-font-rendering xterm, a
rectangle is a pretty tight squeeze for an italic capital M. I live
with it. I wonder if anyone has attacked this problem by writing a
terminal widget that renders glyphs into parallelograms.

> Most other *nix platforms did, and that's really not a problem in X or
> modern graphical consoles like wscons on NetBSD that display
> overstrikes correctly.

With correct terminfo(3) descriptions, these should Just Work when I've
finished merging/doing violence to Lennart's grotty-terminfo patch.

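To make the overstrike convention quoted from grotty(1) earlier in this
message concrete, here is a minimal Python sketch of how a filter might
classify each character cell of such output, including the one case it
cannot resolve. The function names cells, classify, and decode are
hypothetical, and this is not a description of how less(1) or enscript
actually does it.

```python
BS = "\b"


def cells(text):
    """Group the glyphs that were printed on the same character cell.

    A cell is one character followed by zero or more
    (BACKSPACE, character) pairs.
    """
    out = []
    i = 0
    while i < len(text):
        stack = [text[i]]
        i += 1
        while i + 1 < len(text) and text[i] == BS:
            stack.append(text[i + 1])
            i += 2
        out.append(stack)
    return out


def classify(stack):
    """Map one cell's glyph stack to a (glyph, style) pair."""
    glyph = stack[-1]
    if len(stack) == 1:
        return glyph, "plain"
    if stack == ["_", "_"]:
        # A bold underscore and an italic underscore are both
        # "_ BACKSPACE _", so the intended style is unrecoverable.
        return glyph, "ambiguous"
    if stack == [glyph, glyph]:
        return glyph, "bold"
    if stack == ["_", glyph]:
        return glyph, "italic"
    if stack == ["_", glyph, glyph]:
        return glyph, "bold italic"
    return glyph, "unknown"  # some other overstrike combination


def decode(text):
    """Classify every cell of overstruck text."""
    return [classify(stack) for stack in cells(text)]
```

With this, decode("_\bb") reports an italic "b", while decode("_\b_")
can only report the underscore as ambiguous.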
I expect to keep grotty's `-c` option and `GROFF_NO_SGR` environment
variable support around forever. Not just because the sort of people I
complain about above will pool their Bitcoins to hire a ninja assassin
to kill me if I don't, but because it will remain important, for
regression testing, to simulate "old school" output without having to
tediously manage chroots, VMs, or ancient packages.

Regards,
Branden