sophie_newbie wrote:
> Hi, I'm wondering how i'd go about extracting a string array of all
> comments in a HTML file, HTML comments obviously taking the format
> "<!-- Comment text here -->".
>
> I'm fairly stumped on how to do this? Maybe using regular expressions?

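from lxml import etree

parser = etree.HTMLParser()
tree = etree.parse("somefile.html", parser)
# xpath() returns comment nodes; .text is the string inside each <!-- ... -->
comments = [c.text for c in tree.xpath("//comment()")]
print(comments)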
http://codespeak.net/lxml
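
If installing lxml is not an option, something similar can probably be done with the standard library alone. The following is only a rough sketch, assuming Python 3 (where the module is called html.parser); the names CommentCollector and extract_comments are made up for illustration and are not part of any library. It relies on HTMLParser calling handle_comment() once per comment, with the text between the delimiters as its argument.

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Hypothetical helper: stores the text of every HTML comment it sees."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # data is the text between "<!--" and "-->", delimiters excluded
        self.comments.append(data)


def extract_comments(html_text):
    """Return a list of comment strings found in html_text."""
    collector = CommentCollector()
    collector.feed(html_text)
    collector.close()
    return collector.comments


sample = "<html><!-- first --><body><p>Hi</p><!-- second --></body></html>"
print(extract_comments(sample))
```

Surrounding whitespace inside each comment is kept as-is; call .strip() on the results if that is not wanted. Either approach should be more robust than a regular expression, since a real parser deals with the markup around the comments for you.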
Stefan
