On Sat, Jun 5, 2010 at 4:03 PM, Paulo da Silva <psdasilva.nos...@netcabonospam.pt> wrote:
> I need to read text files and process each line using string
> comparisons and regexp.
>
> I have a python2 program that uses <file object>.readline to read each
> line as a string. Then, processing it was a trivial job.
>
> With python3 I got error messages like:
>   File "./pp1.py", line 93, in RL
>     line=inf.readline()
>   File "/usr/lib64/python3.1/codecs.py", line 300, in decode
>     (result, consumed) = self._buffer_decode(data, self.errors, final)
> UnicodeDecodeError: 'utf8' codec can't decode bytes in position
> 4963-4965: invalid data
>
> How do I handle this?

Judging by the traceback, Python 3 is decoding the file as UTF-8, and the file evidently isn't UTF-8. Specify the actual encoding of the text when opening the file, using the `encoding` parameter of open(). For Windows-1252, for example:

    your_file = open("path/to/file.ext", 'r', encoding='cp1252')

Cheers,
Chris
--
http://blog.rebertia.com
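
A minimal sketch of that suggestion, assuming the file really is Windows-1252 (the thread never confirms the actual encoding). The helper name read_lines and the temporary demo file are hypothetical, made up for illustration. It also shows the `errors` parameter of open(), which can substitute a replacement character for undecodable bytes when the true encoding is unknown.

```python
import os
import tempfile


def read_lines(path, encoding="cp1252", errors="strict"):
    """Yield each line of a text file as str, without the trailing newline.

    `encoding` is an assumption: use whatever the file was written in.
    """
    with open(path, "r", encoding=encoding, errors=errors) as inf:
        for line in inf:
            yield line.rstrip("\n")


# Demo data: bytes that are valid cp1252 but not valid UTF-8.
raw = "caf\u00e9\nna\u00efve\n".encode("cp1252")
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(raw)

# Decoding with the right codec gives ordinary strings.
lines = list(read_lines(demo_path))

# Decoding the same bytes as UTF-8 fails, as in the traceback above.
try:
    list(read_lines(demo_path, encoding="utf-8"))
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# errors="replace" keeps going, at the cost of losing the original characters.
lossy = list(read_lines(demo_path, encoding="utf-8", errors="replace"))

os.remove(demo_path)
```

Once the lines are decoded to str, string comparisons and the re module work on them much as they did in the python2 program. Note that errors="replace" is lossy, so it only makes sense when the damaged characters don't matter for the processing.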