On Thu, Jun 1, 2023 at 8:23 AM Bruno Haible <br...@clisp.org> wrote:
> Hi Mike,
>
> > It looks right but I do see 3 warnings:
> >
> > troff: man7/groff_char.7:1051: warning: can't find special character 'bs'
> > troff: man7/groff_char.7:1192: warning: can't find special character
> > 'radicalex'
> > troff: man7/groff_char.7:1195: warning: can't find special character
> > 'sqrtex'
>
> I get these warnings too, on a GNU system; so, you can ignore them.
>
> > For example, in the Arrows section I see:
> >
> > Arrows
> >
> > l l l l lx. Output Input PostScript Unicode Notes _
> > ← \[<-] arrowleft u2190 horizontal arrow left +
> > → \[->] arrowright u2192 horizontal arrow right +
> > ↔ \[<>] arrowboth u2194 T{ horizontal arrow in both direc‐
> > tions T} ↓ \[da] arrowdown u2193 vertical arrow down +
> > ↑ \[ua] arrowup u2191 vertical arrow up +
> > ↕ \[va] arrowupdn u2195 T{ vertical arrow in both
> > directions
> > T} ⇐ \[lA] arrowdblleft u21D0 horizontal double arrow
> > left
> > ⇒ \[rA] arrowdblright u21D2 horizontal double arrow right
> > ⇔ \[hA] arrowdblboth u21D4 T{ horizontal double arrow in
>
> Nice! That's how it's supposed to be (in an environment that can display
> Unicode).
>
> > and this change being discussed is what gets me to proper UTF-8 rendering
> > (although perhaps there is a better way to fix this)
>
> That's what I claim.
>
> When I run the groff command in different locales:
>   $ LC_ALL=de_DE.UTF-8 groff -Tutf8 -mandoc man7/groff_char.7 > ~/out1
>   $ LC_ALL=de_DE.ISO-8859-1 groff -Tutf8 -mandoc man7/groff_char.7 > ~/out2
> the output files ~/out1 and ~/out2 are identical.
>
> Therefore, once the option -Tutf8 has been passed to groff, the locale's
> encoding is irrelevant.
>
> When you run "man groff_char", these pieces of software are involved:
>   A) man
>   B) the gnulib parts included in 'man'
>   C) groff
>   D) the gnulib parts included in 'groff'
>
> The experiment above shows that C) and D) don't need changes.
>
> I believe the fix needs to be in A), not B).
>
> It is likely that A) does a call to nl_langinfo(CODESET) or locale_charset(),
> to decide which options to pass to groff and potentially whether to call
> iconv. This is perfectly normal, because when the console / xterm / terminal
> can only display ISO-8859-1 characters, it would be wrong if 'man' sent
> arbitrary Unicode characters to the console.
>
> So, the questions are:
>   1) How is it possible that on z/OS most of the ASCII-based software forms
>      an ISO-8859-1 environment, yet the UTF-8 encoded groff output displays
>      just fine?
>   2) How to teach 'man' about this particular environment?
>

Thanks for the detailed response. I will dig in.

>
> Bruno
>
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
>

Is there a way to get Gmail to change its default? 😁
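
P.S. To make the guess about A) concrete for myself, here is a minimal sketch of the kind of check 'man' might be doing. It is written in Python rather than man's own C only to keep it short, and it is not taken from the man sources: choose_groff_device and current_codeset are hypothetical names, and the codeset-to-device mapping is my assumption. The only real pieces are nl_langinfo(CODESET) and the groff device names utf8, latin1 and ascii.

```python
import locale


def choose_groff_device(codeset: str) -> str:
    """Map a locale codeset name to a groff -T device.

    Hypothetical mapping for illustration only; the real 'man' may
    decide differently (and may additionally run iconv).
    """
    name = codeset.upper().replace("_", "-")
    if name in ("UTF-8", "UTF8"):
        return "utf8"
    if name in ("ISO-8859-1", "ISO8859-1", "LATIN1"):
        return "latin1"
    # Anything unrecognized falls back to plain ASCII output.
    return "ascii"


def current_codeset() -> str:
    """Return the codeset of the locale selected by the environment."""
    try:
        # Adopt LC_ALL / LC_CTYPE / LANG from the environment.
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        pass  # unknown locale name: stay in the default "C" locale
    # Wraps the C function nl_langinfo(CODESET); available on Unix only.
    return locale.nl_langinfo(locale.CODESET)


if __name__ == "__main__":
    codeset = current_codeset()
    print(f"codeset={codeset} -> groff -T{choose_groff_device(codeset)}")
```

If 'man' does something along these lines, then an environment that reports ISO-8859-1 would never be given -Tutf8, which would match what I am seeing on z/OS. Running the script under different LC_ALL values shows which device such a check would pick.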