On 10/23/2011 9:59 AM, 水静流深 wrote:
I changed my code to:
Calling your file xml.py (as indicated below) is a potentially bad idea since the Python stdlib has a package named 'xml'. If you write 'import xml.xxx' in another file in the same directory, Python will try to find 'xxx' in your xml.py file.
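A quick way to check which 'xml' Python actually picked up is sketched below. The name `is_package` is only illustrative; the check relies on the stdlib 'xml' being a package (it has a `__path__`), which a single-file xml.py would not have.

```python
import xml

# The stdlib 'xml' is a package, so it has a __path__ attribute.
# A plain xml.py module that shadows it would not.
is_package = hasattr(xml, "__path__")

# The file path shows where the imported module really lives.
print(xml.__file__)
print("stdlib-style package:", is_package)
```

If the printed path points at your own script directory, renaming the file avoids the clash.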
import urllib.request, urllib.parse, urllib.error
import lxml.html
Are you sure you have a version of lxml that works with Python 3?
down='http://frux.wikispaces.com/'
root=urllib.request.urlopen(down).read()
What type of object is returned and bound to root? (print(type(root)) if doc not clear.)
root=lxml.html.fromstring(root)
What type of object is root required to be (from lxml docs)? [snip]
the new problem is :
C:\Python32>python c:\xml.py
Traceback (most recent call last):
  File "c:\xml.py", line 5, in <module>
    root=lxml.html.fromstring(root)
  File "C:\Python32\lib\site-packages\lxml\html\__init__.py", line 630, in fromstring
    if start.startswith('<html') or start.startswith('<!doctype'):
TypeError: startswith first arg must be bytes or a tuple of bytes, not str
This implies that the name 'start' is bound to bytes when it should be (for 3.2) bound to unicode, which would most likely mean that 'root' is the wrong type. Or that the above is the 2.x version of lxml where '<html' is bytes rather than unicode.
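The mismatch can be reproduced without lxml or a network connection. The following is only a minimal sketch: a bytes literal stands in for what urlopen(...).read() returns in Python 3, the names `page_bytes`, `page_text` and `mixed_ok` are illustrative, and the UTF-8 encoding is an assumption about the page, not something taken from the traceback.

```python
# In Python 3, urlopen(...).read() returns bytes; a literal stands in for it here.
page_bytes = b"<html><body>hello</body></html>"

# Calling bytes.startswith with a str prefix raises TypeError,
# the same kind of failure shown in the traceback.
try:
    page_bytes.startswith('<html')
    mixed_ok = True
except TypeError as exc:
    mixed_ok = False
    print("TypeError:", exc)

# Decoding first yields a str, so the str prefix check works.
# Assumption: the page is UTF-8; the real encoding should be taken
# from the response headers or the document itself.
page_text = page_bytes.decode("utf-8")
print(page_text.startswith('<html'))
```

If that is what is happening, decoding the downloaded bytes before calling lxml.html.fromstring may be enough for this lxml build; installing an lxml release built for Python 3 is the other thing to check.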
--
Terry Jan Reedy