On Feb 11, 8:23 pm, Steven D'Aprano <steve +comp.lang.pyt...@pearwood.info> wrote:
> On Sun, 12 Feb 2012 12:28:30 +1100, Chris Angelico wrote:
> > On Sun, Feb 12, 2012 at 12:21 PM, Eric Snow
> > <ericsnowcurren...@gmail.com> wrote:
> >> However, in at
> >> least one current thread (on python-ideas) and at a variety of times in
> >> the past, _some_ people have found Unicode in Python 3 to make more
> >> work.
>
> > If Unicode in Python is causing you more work, isn't it most likely that
> > the issue would have come up anyway?
>
> The argument being made is that in Python 2, if you try to read a file
> that contains Unicode characters encoded with some unknown codec, you
> don't have to think about it. Sure, you get moji-bake rubbish in your
> database, but that's the fault of people who insist on not being
> American. Or who spell Zoe with an umlaut.

That's not the worst of it... I have many times had a block of text
that was valid ASCII except for some intermixed Unicode white-space.
Who the hell would even consider inserting Unicode white-space!!!

> "I have a file containing text. I can open it in an editor and see it's
> nearly all ASCII text, except for a few weird and bizarre characters like
> £ © ± or ö. In Python 2, I can read that file fine. In Python 3 I get an
> error. What should I do that requires no thought?"
>
> Obvious answers:

The most obvious answer would be to read the file WITHOUT worrying
about asinine encoding.
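
For reference, here is a rough sketch of what "reading the file without worrying about the encoding" can look like in Python 3. It is only an illustration, not something posted in the thread: the sample text, the temporary file, and the variable names are all made up, and the sample is assumed to be Latin-1 encoded. Each option trades correctness for convenience in a different way.

```python
import os
import tempfile

# Hypothetical sample: mostly ASCII, plus a few non-ASCII characters,
# written out as Latin-1 bytes (an assumption made for this sketch).
raw = "price: £5 © Zoë\n".encode("latin-1")

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name

# Option 1: binary mode. No decoding happens, so nothing can fail,
# but you get bytes rather than str.
with open(path, "rb") as f:
    as_bytes = f.read()

# Option 2: Latin-1 maps every byte to a code point, so decoding never
# raises. The result is wrong if the file was really in another encoding.
with open(path, encoding="latin-1") as f:
    as_latin1 = f.read()

# Option 3: treat the file as ASCII and replace each undecodable byte
# with U+FFFD. This is lossy.
with open(path, encoding="ascii", errors="replace") as f:
    as_replaced = f.read()

# Option 4: treat the file as ASCII and smuggle undecodable bytes through
# as lone surrogates, so the original bytes can be recovered later.
with open(path, encoding="ascii", errors="surrogateescape") as f:
    as_escaped = f.read()

round_tripped = as_escaped.encode("ascii", errors="surrogateescape")

os.remove(path)
```

Option 2 is probably the closest to the Python 2 behaviour being described: it never raises, and it silently produces mojibake whenever the guess is wrong.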