Yeah, you're right, the page is in Chinese. (I'm Chinese too. ^_^) Using urllib2.urlopen('............').read(), I can't get the content between '<body>' and '</body>'. I think the reason isn't the Chinese encoding but the 'no-cache' setting.
I want to get the content between those tags. Can you see why I can't read it? Thanks.
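
For reference, here is a minimal sketch of how the two explanations (encoding vs. 'no-cache') could be told apart. The helper names fetch() and extract_body() are made up for illustration, and the gb18030 default is only an assumption about the page's charset; the real page may declare something else.

```python
import re

try:
    import urllib2  # Python 2, as used above
except ImportError:
    import urllib.request as urllib2  # Python 3 equivalent of urlopen()


def extract_body(raw, charset="gb18030"):
    """Return the text between <body ...> and </body>, or None if absent.

    The charset default is an assumption; undecodable bytes are replaced
    rather than raising, so a wrong guess still shows whether a body exists.
    """
    text = raw.decode(charset, "replace")
    match = re.search(r"<body[^>]*>(.*?)</body>", text,
                      re.IGNORECASE | re.DOTALL)
    return match.group(1) if match else None


def fetch(url):
    """Hypothetical helper: return (response headers, raw bytes) for url."""
    response = urllib2.urlopen(url)
    try:
        return response.info(), response.read()
    finally:
        response.close()
```

Calling fetch() on the real URL and printing the headers together with len(raw) should show whether the body really arrives empty, or whether it is there and only the decoding goes wrong. extract_body(raw) then pulls out the part between the tags.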