On Tuesday, May 13, 2014 6:48:35 AM UTC+5:30, Steven D'Aprano wrote:
> On Mon, 12 May 2014 17:47:48 +0000, alister wrote:
>
> > Surely those example programs are not the pythonoic way to do things or
> > am i missing something?
>
> Feel free to show us your version of "cat" for Python then. Feel free to
> target any version you like. Don't forget to test it against files with
> names and content that:
>
> - aren't valid UTF-8;
>
> - are valid UTF-8, but not valid in the local encoding.

Thanks for a non-defensive appraisal!

> > if those code samples are anything to go by this guy makes JMF look
> > sensible.
>
> Armin Ronacher is an extremely experienced and knowledgeable Python
> developer, and a Python core developer. He might be wrong, but he's not
> *obviously* wrong.
>
> Unicode is hard, not because Unicode is hard, but because of legacy
> problems. I can create a file on a machine that uses ISO-8859-7 for the
> file name, put Shift-JIS encoded text inside it, transfer it to a
> machine that uses Windows-1251 as the file system encoding, then SSH into
> that machine from a system using Big5, and try to make sense of it. If
> everybody used UTF-8 any time data touched a disk or network, we'd be
> laughing. It would all be so simple.

I think the most helpful way forward is to accept two things:
a. Unicode is a headache
b. No-Unicode is a non-option

> Reading Armin's post, I think that all that is needed to simplify his
> Python 3 version is:
>
> - have a bytes version of sys.argv (bargv? argvb?) and read
>   the file names from that;
>
> - have a simple way to write bytes to stdout and stderr.
>
> Most programs won't need either of those, but file system utilities will.

About the technical merits of Armin's post and your suggestions, I've
nothing to say, since I am an ignoramus on (the mechanics of) Unicode.
[Consider me an eager, early, ignorant adopter :-) ]

It's good to note, however, that Unicode is rather unique in the history
not just of IT/CS but of humanity, in that no one (to the best of my
knowledge) has ever before tried to come up with an all-encompassing
umbrella for all of humanity's scripts and writing systems. So hiccups
and mistakes are only to be expected. Their absence would be much more
surprising!

-- 
https://mail.python.org/mailman/listinfo/python-list
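
For anyone following the two suggestions quoted above (a bytes view of sys.argv, and a simple way to write bytes to stdout and stderr), here is a minimal sketch of how they might look with what Python 3 already ships. It is not Armin's code or Steven's proposal. The names bytes_argv and cat are hypothetical helpers made up for illustration, and the sketch assumes a POSIX system, where os.fsencode() undoes the surrogateescape decoding applied to the command line and sys.stdout.buffer accepts raw bytes.

```python
import os
import shutil
import sys


def bytes_argv(argv=None):
    """Return the command-line arguments as bytes (hypothetical helper).

    Assumption: on POSIX, Python 3 decodes argv with the file system
    encoding and the surrogateescape error handler, so os.fsencode()
    recovers the original bytes even when they are not valid UTF-8.
    """
    if argv is None:
        argv = sys.argv
    return [os.fsencode(arg) for arg in argv]


def cat(filenames, out=None, err=None):
    """Copy each named file to `out` as raw bytes; return an exit status.

    `filenames` are bytes paths. `out` and `err` default to the binary
    layers underneath sys.stdout and sys.stderr.
    """
    out = sys.stdout.buffer if out is None else out
    err = sys.stderr.buffer if err is None else err
    status = 0
    for name in filenames:
        try:
            # Binary mode: the content is never decoded, so its encoding
            # does not matter.
            with open(name, "rb") as f:
                shutil.copyfileobj(f, out)
        except OSError as exc:
            # Build the message as bytes so an undecodable file name
            # cannot raise while the error is being reported.
            reason = os.fsencode(str(exc.strerror))
            err.write(b"cat: " + name + b": " + reason + b"\n")
            status = 1
    out.flush()
    return status
```

A script built on this would finish with something like sys.exit(cat(bytes_argv()[1:])). Whether this counts as "simple" is exactly what the thread is arguing about; the sketch only shows that file names and content can stay as bytes from start to finish.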