Hi Colin,

At 2025-03-26T12:46:18+0000, Colin Watson wrote:
> I'd welcome something more robust based on groff, as long as people
> remember to consider both sides of the problem (extraction of msgids,
> and reassembly of pages using msgstrs).

Yes, I expect that implementing member functions in
"src/roff/troff/node.cpp" to produce output upon running "groff -A pod"
would be just the first of two implementation phases. The second would
be ensuring that the output is arranged well for interpretation by po4a.

I'm attaching (knock wood) "groff -a" output of the ncurses beep(3) man
page (because it's short but has enough content to illustrate practical
properties of interest) and a hand-made mock-up of envisioned
"groff -A pod" output.

Notes on the mock-up:

1. I think we can know enough in "node.cpp" to break the output line at
   sentence boundaries if the additional inter-sentence spacing amount
   is not zero--and the default is not zero. (Evidence: see recent
   demonstrations of the `pline` request, which illustrate that this
   information is carried into the `word_space_size` node type. No man
   page I know of attempts to override that amount; the ability to do
   so is a GNU troff extension.)

2. I expect that explicit line breaks will be honored (reflected in the
   output). I don't expect that to be a problem for msgid boundary
   inference.

3. Representation of the page offset may be erratic and/or inaccurate.
   I expect msgid extractors to discard leading and trailing whitespace
   anyway.

4. I don't know how POD quotes/escapes < and > characters; I'll need to
   learn.

5. At this point in the formatting process, the formatter's notion of a
   font is an integer referring to a mounting position. We don't know
   what the font "is". The current font is also a property of the
   environment, not of nodes per se. But: (a) we know when the font
   selection _changes_, and (b) for man page formatting I'll bet we can
   assume that fonts are mounted in traditional order:
   1, 2, 3, 4 -> R, I, B, BI.[1]

6. Text in a man page that uses special characters (trout/grout: the
   "C" command) probably doesn't need to be translated. One exception:
   as usual we'd likely special-case what "groff -a" renders as `<->`
   and `<hy>` as good old `-`, and punt (warn on and ignore) any other
   special character.

This approach would probably _not_ be satisfactory for a man page whose
"base" language was not English, but as far as I know, no project both
does that and supports gettextization of the page. But Helge Kreutzmann
would know better than I would.

Thoughts?

Regards,
Branden

[1] That isn't _quite_ traditional: "R, I, B, S" is, because in Ossanna
    troff there was no such thing as a bold-italic typeface--at least
    not that the formatter knew of as such. But I'm betting we can get
    away with the slight modification, and bold italics are seldom used
    in man pages anyway because there's no interface for selecting that
    style in the macro package. The "BI" font remains a part of the
    long tail we can capture with this approach nonetheless.

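P.S. To make notes 4 and 5 concrete, here is a minimal sketch in Python (not groff code) of turning font-change runs into POD markup. Everything named here is hypothetical: `runs_to_pod`, `pod_escape`, `POD_CODES`, and the `(mounting_position, text)` input shape are illustrations only, and the position-to-font table is just the assumption from note 5. The one thing taken from outside this message is that perlpod documents `E<lt>` and `E<gt>` as escapes for angle brackets; whether po4a is happiest with those or with doubled delimiters is still an open question.

```python
# Assumption from note 5: fonts are mounted as 1, 2, 3, 4 -> R, I, B, BI.
# Each value lists POD formatting codes from innermost to outermost.
POD_CODES = {1: "", 2: "I", 3: "B", 4: "IB"}

# perlpod escapes for literal angle brackets.
POD_ESCAPES = {"<": "E<lt>", ">": "E<gt>"}


def pod_escape(text):
    """Escape angle brackets in one pass so escapes are not re-escaped."""
    return "".join(POD_ESCAPES.get(ch, ch) for ch in text)


def runs_to_pod(runs):
    """Convert (mounting_position, text) runs into a POD-marked-up string.

    `runs` is a hypothetical stand-in for what the formatter knows: a font
    position that holds until the font selection changes.
    """
    pieces = []
    for position, text in runs:
        piece = pod_escape(text)
        # Unknown positions and whitespace-only runs pass through unmarked.
        if text.strip():
            for code in POD_CODES.get(position, ""):
                piece = f"{code}<{piece}>"
        pieces.append(piece)
    return "".join(pieces)


print(runs_to_pod([(3, "beep"), (1, " and "), (3, "flash")]))
```

Positions outside the table are left unmarked rather than guessed at, in keeping with the "we don't know what the font is" caveat.
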
<beginning of page>
beep(3NCURSES)                    Library calls                   beep(3NCURSES)

NAME
       beep, flash <-> ring the (visual) bell of the terminal with curses

SYNOPSIS
       #include <ncursesw/curses.h>

       int beep(void);
       int flash(void);

DESCRIPTION
       beep and flash alert the terminal user: the former by sounding the termi<hy>
       nal's audible alarm, and the latter by visibly attracting attention. Com<hy>
       monly, a terminal implements a visual bell by momentarily reversing the
       character foreground and background colors on the entire display; even a
       monochrome device can do this. These functions each attempt the other
       alert type if the one requested is unavailable. If neither is available,
       curses performs no action. Nearly all terminals have an audible alert
       mechanism such as a bell or piezoelectric buzzer, but only some can flash
       the screen.

RETURN VALUE
       These functions return OK on success and ERR on failure.

       In ncurses, beep and flash return OK if the terminal type supports the cor<hy>
       responding capability: bell (bel) for beep and flash_screen (flash) for
       flash. Otherwise they return ERR.

EXTENSIONS
       In ncurses, these functions can return ERR.

PORTABILITY
       X/Open Curses Issue 4 describes these functions. It specifies no error
       conditions for them. On SVr4 curses, they always return OK, and X/Open
       Curses specifies them as doing so.

HISTORY
       SVr2 (1984) introduced beep and flash.

SEE ALSO
       ncurses(3NCURSES), terminfo(5)

ncurses 6.5                        2025-02-01                     beep(3NCURSES)

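The `<->` and `<hy>` tokens in the output above are how "groff -a" renders special characters. As a rough illustration of the special-casing proposed in note 6, here is a minimal Python sketch. It works on a hypothetical stream of `(kind, value)` items, loosely modelled on the trout "t" (text) and "C" (special character) commands, rather than on the rendered text, so that a literal `<ncursesw/curses.h>` cannot be mistaken for a special character. The names `flatten_glyphs` and `SPECIAL_AS_TEXT` are made up for this example.

```python
import sys

# Note 6: the minus sign and hyphen special characters both become "-".
SPECIAL_AS_TEXT = {"-": "-", "hy": "-"}


def flatten_glyphs(items, warn=None):
    """Turn (kind, value) items into plain text.

    kind "t" carries ordinary text; kind "C" carries a special character
    name. Unrecognized special characters are warned about and dropped.
    """
    if warn is None:
        warn = lambda message: print(message, file=sys.stderr)
    out = []
    for kind, value in items:
        if kind == "t":
            out.append(value)
        elif kind == "C":
            if value in SPECIAL_AS_TEXT:
                out.append(SPECIAL_AS_TEXT[value])
            else:
                warn(f"ignoring special character '{value}'")
        else:
            raise ValueError(f"unknown item kind {kind!r}")
    return "".join(out)


print(flatten_glyphs([("t", "beep, flash "), ("C", "-"), ("t", " ring")]))
```

Passing a list's `append` method as `warn` collects the diagnostics instead of printing them, which is handy for checking which characters a page would lose.
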
I<beep>(3NCURSES)                 Library calls                I<beep>(3NCURSES)

B<NAME>
       B<beep>, B<flash> - ring the (visual) bell of the terminal with I<curses>

B<SYNOPSIS>
       B<#include <ncursesw/curses.h>>

       B<int beep(void);>
       B<int flash(void);>

B<DESCRIPTION>
       B<beep> and B<flash> alert the terminal user: the former by sounding the terminal's audible alarm, and the latter by visibly attracting attention.
       Commonly, a terminal implements a visual bell by momentarily reversing the character foreground and background colors on the entire display; even a monochrome device can do this.
       These functions each attempt the other alert type if the one requested is unavailable.
       If neither is available, I<curses> performs no action.
       Nearly all terminals have an audible alert mechanism such as a bell or piezoelectric buzzer, but only some can flash the screen.

RETURN VALUE
       These functions return B<OK> on success and B<ERR> on failure.

       In I<ncurses>, B<beep> and B<flash> return B<OK> if the terminal type supports the corresponding capability: B<bell> (B<bel>) for B<beep> and B<flash_screen> (B<flash>) for B<flash>.
       Otherwise they return B<ERR>.

EXTENSIONS
       In I<ncurses>, these functions can return B<ERR>.

PORTABILITY
       X/Open Curses Issue 4 describes these functions.
       It specifies no error conditions for them.
       On SVr4 I<curses>, they always return I<OK>, and X/Open Curses specifies them as doing so.

HISTORY
       SVr2 (1984) introduced B<beep> and B<flash>.

SEE ALSO
       B<ncurses>(3NCURSES), B<terminfo>(5)

ncurses 6.5                        2025-02-01                  I<beep>(3NCURSES)

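Note 1 proposes breaking output lines at sentence boundaries. A minimal Python sketch of that arrangement follows, assuming the formatter can flag each word whose following space carries the additional inter-sentence amount. The `(word, ends_sentence)` input and the name `break_at_sentences` are hypothetical; in the formatter the flag would come from the space node after the word, not from inspecting punctuation.

```python
def break_at_sentences(words):
    """Group (word, ends_sentence) pairs into one output line per sentence.

    ends_sentence is True when the space following the word includes the
    additional inter-sentence spacing.
    """
    lines = []
    current = []
    for word, ends_sentence in words:
        current.append(word)
        if ends_sentence:
            lines.append(" ".join(current))
            current = []
    # Keep any trailing words that were not closed by a sentence boundary.
    if current:
        lines.append(" ".join(current))
    return lines


sample = [
    ("It", False), ("specifies", False), ("no", False), ("error", False),
    ("conditions", False), ("for", False), ("them.", True),
    ("On", False), ("SVr4", False),
]
for line in break_at_sentences(sample):
    print(line)
```

An explicit line break (note 2) could be modelled the same way, by treating the break as another reason to close the current line.
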