Markus Franz wrote:
Hi.

I used urllib2 to load an HTML document over HTTP, but I have a
problem:
The loaded contents are returned as binary data, which means that
characters are displayed like lÀÃt, for example. How can I get the
contents as normal text?

My guess is that the HTML is UTF-8 encoded: your sample looks like UTF-8 bytes interpreted as Latin-1. Try decoding the bytes explicitly:

contents = f.read().decode('utf-8')
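
To see why UTF-8 read as Latin-1 produces that kind of garbage, here is a minimal sketch that needs no network access. The sample string is hypothetical and only stands in for the bytes that f.read() would return from a UTF-8 page; it is not the text from the original document.

```python
# Hypothetical sample text containing two non-ASCII characters.
text = u"Gr\u00fc\u00dfe"

# The bytes a server would send for this text if the page is UTF-8 encoded.
raw = text.encode("utf-8")

# Misreading: Latin-1 maps every single byte to one character, so each
# two-byte UTF-8 sequence turns into two unrelated characters.
wrong = raw.decode("latin-1")

# Correct reading: UTF-8 recombines the multi-byte sequences.
right = raw.decode("utf-8")

print(repr(wrong))
print(repr(right))
```

If decoding as UTF-8 raises UnicodeDecodeError on the real page, the document is probably in some other encoding, and the guess above does not apply.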

Kent


My script was:

import urllib2
req = urllib2.Request(url)
f = urllib2.urlopen(req)
contents = f.read()
print contents
f.close()

Thanks!

Markus
--
http://mail.python.org/mailman/listinfo/python-list
