Follow-up Comment #4, bug #65710 (group groff):

[comment #3 comment #3:]
> Bjarni's prescription was much too strong.

Bjarni's "prescription" was an editorial recommendation to users that had no bearing on his proposed code change. Users would be free to ignore it.

The current proposal--to upgrade his proposed warning to a fatal error--only intensifies the problems. As I said over there, preconv should not be in the business of policing which parts of valid Unicode users use. The new proposal kicks that policing from a written warning up to jail time. It's wildly out of proportion with the offense.

> _Maybe_ they mean `\ ` (an unadjustable space). It's
> impossible to know, which is why they should disambiguate it.

It's technically ambiguous, in the same way that "We're going to spend the week in a cabin" is technically ambiguous: that person could be talking about either a house in the woods or the interior of an airplane. But anyone hearing it will know exactly what they mean--crucially, because if they did mean the latter, they'd specifically note it.

When you say something mildly ambiguous, but with one meaning far more likely than another, it's only the exceptional meaning that tends to need to be disambiguated.

Even the roff language--hardly a paragon of DWIM design--understands this. You need only say ".sp 4" to space down four lines; you don't have to specify "4v" because roff gives the request a sensible default unit.

Likewise, \~ is the sensible default meaning for U+00A0: in almost all normal situations, the user will want \~. The documentation clearly spells out how to get different {units for .sp / types of nonbreaking spaces}, so users who want the rarer \space in certain places can explicitly say so.

Making the user edit a bunch of valid Unicode characters (or valid ISO 8859-1, or 8859-2, or any other encoding in the ISO 8859 family) only impedes preconv's ability to import text from another source and use it directly. We should be making this easier for users, not putting up needless roadblocks in the name of semantic purity, certainly not without wider discussion.
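
To make the comparison concrete, here is a minimal roff sketch of the two "sensible default" cases discussed above. It uses only the request and escape sequences already named in this thread; the sample text ("Figure 3") is invented purely for illustration and does not come from any real document.

```roff
.\" Default scaling unit: each of these requests spaces down four lines,
.\" because "v" (vertical line spacing) is the default unit for .sp.
.sp 4
.sp 4v
.\" The common case: \~ is an adjustable, nonbreaking space.
Figure\~3 shows the result.
.\" The rarer case, requested explicitly: backslash-space is an
.\" unadjustable, nonbreaking space.
Figure\ 3 shows the result.
```

In both pairs the short form covers the usual intent, and the longer or rarer form is available to anyone who wants to spell out something different.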
> And if there are multiple U+00A0 characters in sequence, the
> author might be better off supplying a `\h` sequence to
> express what it is they want, precisely.

Sure. But the formatter allows \~\~\~\~\~ without complaint, and adding a complaint here is beyond what this ticket is proposing, so this is tangential.

_______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?65710>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/