On Thu, Jan 29, 2009 at 11:24 AM, Anjanesh Lekshminarayanan <m...@anjanesh.net> wrote:
> I'm reading a file. But there seems to be some encoding error.
>
> >>> f = open(filename)
> >>> data = f.read()
> Traceback (most recent call last):
>   File "<pyshell#2>", line 1, in <module>
>     data = f.read()
>   File "C:\Python30\lib\io.py", line 1724, in read
>     decoder.decode(self.buffer.read(), final=True))
>   File "C:\Python30\lib\io.py", line 1295, in decode
>     output = self.decoder.decode(input, final=final)
>   File "C:\Python30\lib\encodings\cp1252.py", line 23, in decode
>     return codecs.charmap_decode(input,self.errors,decoding_table)[0]
> UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 10442: character maps to <undefined>
>
> The string at position 10442 is something like this:
> "query":"0 1Ȉ \u2021 0\u201a0 \u2021Ȉ ","
>
> So what encoding value am I supposed to give? I tried
> f = open(filename, encoding="cp1252") but still same error. I guess
> Python3 auto-detects it as cp1252.

It is defaulting to cp1252: look at the files in the traceback and you'll see lib\encodings\cp1252.py. That comes from the platform's default encoding on your Windows setup rather than from anything detected in the file itself, which is also why passing encoding="cp1252" explicitly gives the same error. Byte 0x9d has no character assigned in cp1252, so cp1252 seems to be the wrong encoding. Try opening the file as utf-8 or latin1 and see if that fixes it.

> --
> Anjanesh Lekshmnarayanan
> --
> http://mail.python.org/mailman/listinfo/python-list
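
If it helps, here is a rough sketch of how you could try several encodings in turn instead of guessing one at a time. The helper name try_encodings and the candidate list are my own choices for illustration, not anything from the standard library, and the sample bytes are made up to resemble the string you quoted. It reads the file as raw bytes and reports where each candidate fails.

```python
import os
import tempfile


def try_encodings(path, candidates=("utf-8", "cp1252", "latin-1")):
    """Return (encoding, text) for the first candidate that decodes the file.

    Hypothetical helper: the name and the default candidates are assumptions.
    """
    with open(path, "rb") as f:  # binary mode, so nothing is decoded yet
        raw = f.read()
    for name in candidates:
        try:
            return name, raw.decode(name)
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first byte that could not be decoded
            print(f"{name}: failed at byte {exc.start} ({exc.reason})")
    raise ValueError("none of the candidate encodings could decode the file")


# Made-up sample file containing the problematic byte 0x9d.
sample_path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(sample_path, "wb") as f:
    f.write(b'"query":"0 1\x9d 0"')

encoding, text = try_encodings(sample_path)
print("decoded with:", encoding)
```

One caveat: latin-1 maps every possible byte to a character, so it never raises. A successful latin-1 decode only means the file was read, not that the text is correct, which is why the stricter utf-8 is tried first. Once you know the right name, pass it directly: open(filename, encoding="utf-8").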