Re: Python3: Sane way to deal with broken encodings

2009-12-08 Thread Martin v. Loewis
> Thus my Python script dies a horrible death: > > File "./update_db", line 67, in <module> > for line in open(tempfile, "r"): > File "/usr/local/lib/python3.1/codecs.py", line 300, in decode > (result, consumed) = self._buffer_decode(data, self.errors, final) > UnicodeDecodeError: 'utf8' code

Re: Python3: Sane way to deal with broken encodings

2009-12-07 Thread Benjamin Kaplan
On Mon, Dec 7, 2009 at 2:16 PM, Johannes Bauer wrote: > Bruno Desthuilliers wrote: > >>> Is that possible? If so, how? >> >> This might get you started: >> >> """ > help(str.decode) >> decode(...) >>     S.decode([encoding[,errors]]) -> object > > Hmm, this would work nicely if I called "dec
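
For readers following the quoted `decode` suggestion: in Python 3 the method lives on `bytes` rather than `str`, and it accepts the same optional `errors` argument. The snippet below is only a sketch of what the `errors` argument does; the sample byte string and the variable names are made up for illustration and do not come from the thread.

```python
# Hypothetical sample: "café ok" encoded as Latin-1, which is not valid UTF-8.
raw = b"caf\xe9 ok"

# The default error handler is "strict", which raises on the bad byte.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# "replace" substitutes U+FFFD for undecodable input instead of raising.
replaced = raw.decode("utf-8", errors="replace")

# "ignore" silently drops the undecodable bytes.
ignored = raw.decode("utf-8", errors="ignore")
```

Whether replacing or dropping bytes is acceptable depends on what the script does with the text afterwards; both handlers lose information.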

Re: Python3: Sane way to deal with broken encodings

2009-12-07 Thread Johannes Bauer
Bruno Desthuilliers wrote: >> Is that possible? If so, how? > > This might get you started: > > """ help(str.decode) > decode(...) > S.decode([encoding[,errors]]) -> object Hmm, this would work nicely if I called "decode" explicitly - but what I'm doing is: #!/usr/bin/python3 for li
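
One possible way to keep the implicit decoding of `for line in open(...)` without dying on bad bytes is to pass `encoding` and `errors` to `open()` itself. The following is a minimal sketch under assumptions: the helper name `read_lines_lenient`, the choice of UTF-8, and the demo file contents are invented for illustration, and the preview above does not show which approach the thread settled on.

```python
import os
import tempfile


def read_lines_lenient(path, encoding="utf-8"):
    """Return the file's lines, replacing undecodable bytes with U+FFFD.

    Hypothetical helper, not taken from the thread.
    """
    # errors="replace" tells the text layer to substitute U+FFFD for bad
    # input instead of raising UnicodeDecodeError during iteration.
    with open(path, "r", encoding=encoding, errors="replace") as handle:
        return [line.rstrip("\n") for line in handle]


# Demo: write a file containing one byte (0xFF) that is never valid UTF-8.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as handle:
    handle.write(b"good line\nbroken \xff byte\n")

demo_lines = read_lines_lenient(demo_path)
os.remove(demo_path)
```

Note that the original script uses `tempfile` as a variable name, whereas this sketch imports the standard `tempfile` module only to build the demo file.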

Re: Python3: Sane way to deal with broken encodings

2009-12-06 Thread Bruno Desthuilliers
Johannes Bauer wrote: > Dear all, > > I've some applications which fetch HTML documents off the web, parse > their content and do stuff with it. Every once in a while it happens > that the web site administrators put up files which are encoded in a > wrong manner. > > Thus my Python script die

Python3: Sane way to deal with broken encodings

2009-12-06 Thread Johannes Bauer
Dear all, I've some applications which fetch HTML documents off the web, parse their content and do stuff with it. Every once in a while it happens that the web site administrators put up files which are encoded in a wrong manner. Thus my Python script dies a horrible death: File "./update_db"