Date: Sun, 27 Jan 2019 16:38:34 +0000
From: David Holland <dholland-sourcechan...@netbsd.org>
Message-ID: <20190127163833.gb20...@netbsd.org>
| The Unix shell environment is about processing text, and, largely,
| processing text in arbitrary ad hoc ways. It fundamentally relies on
| being able to treat the user-facing output of arbitrary programs as
| machine-readable input.

Yes.

| This puts the goal of customizing user-facing
| output to accommodate the user in direct conflict with the goal of
| making the shell environment work as intended.

I am not sure I agree with that. There is an issue with taking one
user's output and having it processed by another user, but that in
general is a much bigger problem, and a hard one. E.g., lots of tools
that we have, which in one way or another deal with "words" (like "wc"
and \< in vi, etc.) don't work when the language is Thai (and probably
other related languages), where words are distinguished by context, not
syntax, and spaces between "words" separate sentences (or just break
lines) rather than words. Dealing with those kinds of issues is just
plain hard.

But ...

| One 'solution' is to discourage ordinary users from learning the shell

no...

| Another 'solution' is to create a separate but equal set

no...

For the problem you're describing the answer is simply consistency.
That is, whatever something means on output, it needs to mean the same
for input. It doesn't matter which locale's syntax is chosen; none of
them is unconditionally better than any of the others. All that matters
is that it is used universally for the user running the commands
(unless they decide to alter it, of course).

We are not at that point yet, and you're right that POSIX is not
helping. But nor is that really its job: it is not a legislature, and
does not, or should not, specify what we have to do. Rather, it writes
down what actually works, so everyone can agree that a particular
command will have some specified effect, and we can rely upon that
working everywhere.
Then when everyone (more or less) agrees what foo does, we make sure
our foo does the same thing -- unless for some reason that's just
"wrong". (This is how the standards get changed: when what once was
common is now perceived as incorrect, implementations start ignoring
the standard and changing. Then eventually the standard catches up.
During the period of turmoil, it first switches to making whatever it
is unspecified, which allows those systems which refuse to do anything
non-standard to change as well, and then eventually the new definition
can be standardised.)

That is, before POSIX can specify something as being the right way, it
needs to actually work in the world first (in at least some reasonable
fraction of the systems, ignoring any where something does not work
because of an obvious bug which simply should be fixed).

This is how we got the current locale mess: it was someone's attempt
at dealing with Unix in non-ASCII environments, and nothing better has
yet come along to replace it, despite its flaws.

kre