In article <mailman.5715.1329021524.27778.python-l...@python.org>, Chris Angelico <ros...@gmail.com> wrote:
> On Sun, Feb 12, 2012 at 1:36 PM, Rick Johnson
> <rantingrickjohn...@gmail.com> wrote:
> > On Feb 11, 8:23 pm, Steven D'Aprano
> > <steve+comp.lang.pyt...@pearwood.info> wrote:
> >> "I have a file containing text. I can open it in an editor and see it's
> >> nearly all ASCII text, except for a few weird and bizarre characters like
> >> £ © ± or ö. In Python 2, I can read that file fine. In Python 3 I get an
> >> error. What should I do that requires no thought?"
> >>
> >> Obvious answers:
> >
> > the most obvious answer would be to read the file WITHOUT worrying
> > about asinine encoding.
>
> What this statement misunderstands, though, is that ASCII is itself an
> encoding. Files contain bytes, and it's only what's external to those
> bytes that gives them meaning.

Exactly.

<soapbox class="wise-old-geezer">
ASCII was so successful at becoming a universal standard, and lasted for so
many decades, that people who grew up with it don't realize there was once
any other way. Not just EBCDIC, but also SIXBIT, RAD-50, tilt/rotate, packed
card records, and so on. Transcoding was a way of life, and if you didn't
know what you were starting with and what you were aiming for, it was
hopeless. Kind of like where we are again now with Unicode.
</soapbox>
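
To make the "files contain bytes" point concrete, here is a minimal Python 3 sketch. It assumes the file in question was saved as Latin-1, which is only a guess about the hypothetical file in the quoted question; the sample text, the `try_decode` helper, and the variable names are all made up for illustration.

```python
import os
import tempfile

# Hypothetical sample: mostly ASCII, plus a few of the "weird" characters.
text = "Price: £5 © 2012, ±1, ö"

# Assumption: an editor saved this as Latin-1, one byte per character.
raw = text.encode("latin-1")


def try_decode(data, encoding):
    """Return the decoded string, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


as_ascii = try_decode(raw, "ascii")      # fails: 0xA3 is outside ASCII's 7-bit range
as_utf8 = try_decode(raw, "utf-8")       # fails: 0xA3 is not a valid UTF-8 start byte
as_latin1 = try_decode(raw, "latin-1")   # succeeds and recovers the original text

# Even plain "HELLO" is only bytes plus a convention: EBCDIC (code page 037)
# stores it as different bytes than ASCII does.
ebcdic_bytes = "HELLO".encode("cp037")
ascii_bytes = "HELLO".encode("ascii")

# Reading a file works the same way: tell open() which encoding to assume.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(raw)
    path = f.name
try:
    with open(path, encoding="latin-1") as f:
        from_file = f.read()
finally:
    os.remove(path)
```

If you really don't know the encoding, opening the file with `"rb"` gives you the bytes untouched, but then you have bytes, not text, and the question of what they mean is still waiting for you.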