Re: Get document as normal text and not as binary data

Kent Johnson Mon, 28 Mar 2005 11:40:05 -0800

Markus Franz wrote:

Hi.

I used urllib2 to load an HTML document over HTTP. My problem is that
the loaded contents are returned as binary data, which means that
characters are displayed like lÃ€Ãt, for example. How can I get the
contents as normal text?


My guess is that the HTML is UTF-8 encoded; your sample looks like
UTF-8 bytes interpreted as Latin-1. Try decoding the bytes explicitly:

    contents = f.read().decode('utf-8')
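
The effect can be reproduced without any network access. The sketch below is only an illustration under stated assumptions: the sample word is hypothetical (the page from the original question is not shown), and the helper name to_text is made up for this example. It encodes a string as UTF-8, shows what happens when those bytes are read as Latin-1, and then decodes them correctly.

```python
# -*- coding: utf-8 -*-
# Hypothetical sample text; the real page content is not known.
text = u"l\u00e4\u00dft"

# What a server sending UTF-8 would put on the wire: 6 bytes for 4 characters.
raw = text.encode("utf-8")

# Reading those bytes as Latin-1 turns every byte into one character,
# so each non-ASCII letter becomes two garbage characters.
garbled = raw.decode("latin-1")

# Decoding with the encoding that was actually used restores the text.
fixed = raw.decode("utf-8")


def to_text(data, charset="utf-8"):
    """Decode raw bytes (e.g. the result of f.read()) to text.

    The default charset is an assumption; a real page should be decoded
    with whatever encoding the server or the document declares.
    """
    return data.decode(charset)


print(fixed == text)
```

If the page turns out to use a different encoding, passing that name as charset (for example to_text(raw, "latin-1")) is the only change needed.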

Kent


My script was:

import urllib2
req = urllib2.Request(url)
f = urllib2.urlopen(req)
contents = f.read()
print contents
f.close()

Thanks!

Markus

--
http://mail.python.org/mailman/listinfo/python-list
