Stan Iverson wrote:
> On Thu, Dec 22, 2011 at 11:30 AM, Rami Chowdhury
> <rami.chowdh...@gmail.com> wrote:
>
>> Could you try using the 'open' function from the 'codecs' module?
>>
>
> I believe this is what you meant:
>
> file = codecs.open(p + "2.txt", "r", "utf-8")
> for line in file:
>     print line
>
> but got this error:
>
> *UnicodeDecodeError*: 'utf8' codec can't decode bytes in position 0-2:
> invalid data
>       args = ('utf8', '\xe1 intentado para ellos bastante sabios para
> discernir lo obvio. Tales perso', 0, 3, 'invalid data')
>

The byte '\xe1' at position 0 is the letter á (a with an acute accent) in
several single-byte encodings, so the file is not UTF-8. It is probably
encoded in ISO-8859-1, ISO-8859-15, or cp1252:

>>> print "\xe1".decode("iso-8859-1")
á
>>> print "\xe1".decode("iso-8859-15")
á
>>> print "\xe1".decode("cp1252")
á

Try codecs.open() with one of these encodings.
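If it is not clear which of these encodings applies, one option is to try them in turn on the raw bytes. The following is only a rough sketch, not something from the thread: the helper guess_encoding, the candidate list, and the throwaway demo file are all made up for illustration, and it uses io.open (which accepts an encoding argument much like codecs.open) so that it runs on both Python 2.6+ and Python 3. Note that iso-8859-1 accepts every byte value, so a successful decode does not prove that the guess is right; check the decoded text by eye.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# Hypothetical candidate list; utf-8 goes first because it is the strictest,
# iso-8859-1 goes last because it never fails.
CANDIDATES = ["utf-8", "cp1252", "iso-8859-15", "iso-8859-1"]


def guess_encoding(path, candidates=CANDIDATES):
    """Return the first candidate that decodes the whole file, or None."""
    with open(path, "rb") as f:
        raw = f.read()
    for enc in candidates:
        try:
            raw.decode(enc)
        except UnicodeDecodeError:
            continue
        return enc
    return None


# Demo: write the leading bytes from the traceback to a temporary file.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(b"\xe1 intentado para ellos bastante sabios\n")

encoding = guess_encoding(demo_path)  # utf-8 fails here, cp1252 succeeds

# Reopen the file as text with the guessed encoding.
with io.open(demo_path, "r", encoding=encoding) as f:
    lines = [line.rstrip("\n") for line in f]

os.remove(demo_path)
print(encoding)
```

For the real file, pass p + "2.txt" to guess_encoding instead of the demo path, then open it with the returned encoding.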