On Thu, Dec 22, 2011 at 12:42 PM, Peter Otten <__pete...@web.de> wrote:
> The file is probably encoded in ISO-8859-1, ISO-8859-15, or cp1252 then:
>
> >>> print "\xe1".decode("iso-8859-1")
> á
> >>> print "\xe1".decode("iso-8859-15")
> á
> >>> print "\xe1".decode("cp1252")
> á
>
> Try codecs.open() with one of these encodings.

I'm baffled. I duplicated your print statements but when I run this code (or any of the 3 encodings):

    file = codecs.open(p + "2.txt", "r", "cp1252")
    #file = codecs.open(p + "2.txt", "r", "utf-8")
    for line in file:
        print line

I get this error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xe1' in position 48: ordinal not in range(128)
    args = ('ascii', u'<i>Noticia: Este sitio web entre este portal est...r\xe1pidamente va a salir de aqu\xed.</i><br /><br />\r\n', 48, 49, 'ordinal not in range(128)')
    encoding = 'ascii'
    end = 49
    object = u'<i>Noticia: Este sitio web entre este portal est...r\xe1pidamente va a salir de aqu\xed.</i><br /><br />\r\n'
    reason = 'ordinal not in range(128)'
    start = 48

Please advise.

TIA,
Stan
-- http://mail.python.org/mailman/listinfo/python-list
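
A note on reading the traceback above: the exception is a UnicodeEncodeError raised by the 'ascii' codec, and the failing object is already a unicode string (u'...'). That suggests the cp1252 decoding done by codecs.open() worked, and the failure happens afterwards, when the text is encoded again for output. The sketch below only illustrates that decode/encode distinction. The sample bytes are a fragment copied from the traceback, and the helper name try_encode is made up for this example.

```python
def try_encode(text, encoding):
    """Return text encoded with the given codec, or None if the codec
    cannot represent every character in it."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


# Raw bytes as they would sit in a cp1252 file (fragment from the traceback).
sample = b"r\xe1pidamente va a salir de aqu\xed"

# Step 1: bytes -> text. This is what codecs.open(..., "cp1252") does per line.
decoded = sample.decode("cp1252")

# Step 2: text -> bytes for output. ASCII has no code for U+00E1 or U+00ED,
# so this is the step that fails; UTF-8 can represent both characters.
as_ascii = try_encode(decoded, "ascii")
as_utf8 = try_encode(decoded, "utf-8")
```

If that reading is right, one possible workaround in the loop above would be to encode each line explicitly before printing, for example print line.encode("utf-8"). That assumes the destination of the output expects UTF-8, which the thread does not state.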