Follow-up Comment #2, bug #65654 (group groff):

Bjarni is talking about input, not output. The example is still slightly confusing, though: 0xA0 appears to refer to the Latin-1 character NO-BREAK SPACE (Unicode U+00A0), but there is no reason to run preconv if the file is in Latin-1 encoding, as groff can read that directly.
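
To make the byte-level distinction concrete, here is a small Python sketch. It is not part of groff or preconv, and the helper name is_valid_utf8 is purely illustrative. It shows that U+00A0 is the single byte 0xA0 in Latin-1 but the two-byte sequence 0xC2 0xA0 in UTF-8, and that a lone 0xA0 byte is not well-formed UTF-8.

```python
# Illustration only: how U+00A0 NO-BREAK SPACE is encoded in Latin-1 vs. UTF-8.
NBSP = "\u00a0"

latin1_bytes = NBSP.encode("latin-1")  # one byte: 0xA0
utf8_bytes = NBSP.encode("utf-8")      # two bytes: 0xC2 0xA0


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    print(latin1_bytes.hex(), utf8_bytes.hex())
    # A bare 0xA0 is a continuation byte with no lead byte, so this is False.
    print(is_valid_utf8(b"a\xa0b"))
```

So a raw 0xA0 byte in the input points to a Latin-1 file (or a similar 8-bit encoding) rather than a UTF-8 one.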

Nonetheless, his point remains: many common Unix tools display the characters U+0020 and U+00A0 indistinguishably.

But there is no reason for preconv to warn about this. The same issue exists no matter what Unix tool processes input containing both characters. Users may choose to avoid U+00A0 in their input files for this reason, or they may use other strategies to deal with it. It is not preconv's job to police this usage. Users who desire such warnings can write a simple preprocessor (using grep or sed, perhaps) to emit them.

Once you start down the rabbit hole of "warn the user about characters that are hard to visually tell apart," where do you stop? In the monospace fonts used in most terminals, you'd be hard-pressed to distinguish U+2012 FIGURE DASH from U+2013 EN DASH. Unicode has a plethora of space-like and dash-like characters. Should preconv warn about all of these? That seems absurd.

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?65654>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/