On Sun, Dec 30, 2012 at 10:32 PM, contro opinion <contropin...@gmail.com> wrote: > import urllib > import lxml.html > down='http://blog.sina.com.cn/s/blog_71f3890901017hof.html' > file=urllib.urlopen(down).read() > root=lxml.html.document_fromstring(file) > body=root.xpath('//div[@class="articalContent "]')[0] > print body.text_content() > > When i run the code, what i get is the text content ,how can i get the html > source code of it?
print lxml.html.tostring(body) Did you read through the lxml.html documentation? http://lxml.de/lxmlhtml.html It includes several examples that make use of lxml.html.tostring(). RTFineM-ly Yours, Chris -- http://mail.python.org/mailman/listinfo/python-list