Hi Mike,

> It looks right but I do see 3 warnings:
>
> troff: man7/groff_char.7:1051: warning: can't find special character 'bs'
> troff: man7/groff_char.7:1192: warning: can't find special character
> 'radicalex'
> troff: man7/groff_char.7:1195: warning: can't find special character
> 'sqrtex'

I get these warnings too, on a GNU system; so, you can ignore them.

> For example, in the Arrows section I see:
>
> Arrows
>
> l l l l lx. Output Input PostScript Unicode Notes _
> ← \[<-] arrowleft u2190 horizontal arrow left +
> → \[->] arrowright u2192 horizontal arrow right +
> ↔ \[<>] arrowboth u2194 T{ horizontal arrow in both direc‐
> tions T} ↓ \[da] arrowdown u2193 vertical arrow down +
> ↑ \[ua] arrowup u2191 vertical arrow up +
> ↕ \[va] arrowupdn u2195 T{ vertical arrow in both
> directions
> T} ⇐ \[lA] arrowdblleft u21D0 horizontal double arrow
> left
> ⇒ \[rA] arrowdblright u21D2 horizontal double arrow right
> ⇔ \[hA] arrowdblboth u21D4 T{ horizontal double arrow in

Nice! That's how it's supposed to be (in an environment that can display
Unicode).

> and this change being discussed is what gets me to proper UTF-8 rendering
> (although perhaps there is a better way to fix this)

That's what I claim. When I run the groff command in different locales:

  $ LC_ALL=de_DE.UTF-8 groff -Tutf8 -mandoc man7/groff_char.7 > ~/out1
  $ LC_ALL=de_DE.ISO-8859-1 groff -Tutf8 -mandoc man7/groff_char.7 > ~/out2

the output files ~/out1 and ~/out2 are identical. Therefore, once the option
-Tutf8 has been passed to groff, the locale's encoding is irrelevant.

When you run "man groff_char", these pieces of software are involved:
  A) man
  B) the gnulib parts included in 'man'
  C) groff
  D) the gnulib parts included in 'groff'

The experiment above shows that C) and D) don't need changes.

I believe the fix needs to be in A), not B). It is likely that A) does a call
to nl_langinfo(CODESET) or locale_charset(), to decide which options to pass
to groff and potentially whether to call iconv. This is perfectly normal,
because when the console / xterm / terminal can only display ISO-8859-1
characters, it would be wrong if 'man' sent arbitrary Unicode characters to
the console.

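
As a rough illustration of the kind of decision described above, here is a minimal Python sketch. It is not taken from the 'man' sources: the function names choose_groff_device and groff_command and the codeset-to-device table are hypothetical. Only nl_langinfo(CODESET) and the groff devices utf8, latin1 and ascii are real.

```python
import locale

# Hypothetical mapping from a locale codeset name to a groff output device.
# The device names are real groff -T values; the table itself is an
# assumption made for illustration, not what 'man' actually does.
_DEVICE_BY_CODESET = {
    "UTF-8": "utf8",
    "UTF8": "utf8",
    "ISO-8859-1": "latin1",
    "ISO8859-1": "latin1",
    "ANSI_X3.4-1968": "ascii",
    "US-ASCII": "ascii",
    "ASCII": "ascii",
}


def choose_groff_device(codeset):
    """Return a groff -T device for the codeset, falling back to ascii."""
    return _DEVICE_BY_CODESET.get(codeset.strip().upper(), "ascii")


def groff_command(page, codeset=None):
    """Build the groff argument list for a page (hypothetical helper)."""
    if codeset is None:
        # Ask the locale for its encoding, as a C program would with
        # setlocale(LC_CTYPE, "") followed by nl_langinfo(CODESET).
        try:
            locale.setlocale(locale.LC_CTYPE, "")
            codeset = locale.nl_langinfo(locale.CODESET)
        except locale.Error:
            codeset = "ASCII"  # unsupported locale: stay conservative
    return ["groff", "-T" + choose_groff_device(codeset), "-mandoc", page]
```

With a table like this, a locale that reports ISO-8859-1 would never get -Tutf8, which would match the behaviour seen on z/OS if 'man' works along these lines.
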
So, the questions are:
  1) How is it possible that on z/OS most of the ASCII-based software forms
     an ISO-8859-1 environment, yet the UTF-8 encoded groff output displays
     just fine?
  2) How to teach 'man' about this particular environment?

Bruno

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?