"JTree" <[EMAIL PROTECTED]> wrote: > >Hi,all > I encountered a problem when using unicode() function to fetch a >webpage, I don't know why this happenned. > My codes and error messages are: > > >Code: >#!/usr/bin/python >#Filename: test.py >#Modified: 2006-12-31 > >import cPickle as p >import urllib >import htmllib >import re >import sys > >def funUrlFetch(url): > lambda url:urllib.urlopen(url).read() > >objUrl = raw_input('Enter the Url:') >content = funUrlFetch(objUrl) >content = unicode(content,"gbk") >print content >content.close()
Once you fix the lambda, as Felipe described, there's another issue here.
You are telling the unicode function that the string you're passing it is
an 8-bit string encoded as gbk.  How do you know that?

In your specific example, www.msn.com, I can guarantee it will produce the
wrong results: www.msn.com is encoded in UTF-8.
-- 
Tim Roberts, [EMAIL PROTECTED]
Providenza & Boekelheide, Inc.
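
As a rough illustration of the encoding point, here is one way to avoid hard-coding gbk: take the charset from the Content-Type header the server sent, if there is one. This is a sketch under assumptions, not a complete solution. The helper names charset_from_content_type and decode_page are invented here, the header value is passed in as a plain string, and the UTF-8 fallback is an arbitrary choice. Pages may also declare their encoding in a meta tag, which this does not look at.

```python
import re

def charset_from_content_type(content_type, default="utf-8"):
    # Look for "charset=..." in a Content-Type header value.
    # The default is an assumption for when the server declares nothing.
    match = re.search(r'charset="?([\w.:-]+)', content_type or "", re.IGNORECASE)
    return match.group(1).lower() if match else default

def decode_page(raw_bytes, content_type):
    # Decode using the declared charset rather than a fixed guess.
    return raw_bytes.decode(charset_from_content_type(content_type))

# Sample data standing in for a fetched page body and its header.
raw = u"caf\u00e9".encode("utf-8")
text = decode_page(raw, "text/html; charset=UTF-8")

# Decoding the same bytes as gbk does not give the original text back.
wrong = raw.decode("gbk", "replace")
```

Here text matches the original string, while wrong does not, which is the kind of wrong result a hard-coded "gbk" gives on a UTF-8 page.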