Ravi Teja wrote:
>
> 1.) XPath is not a good idea at all with "malformed" HTML or perhaps
> web pages in general.

import libxml2dom
import urllib

f = urllib.urlopen("http://wiki.python.org/moin/")
s = f.read()
f.close()

# s contains HTML not XML text
d = libxml2dom.parseString(s, html=1)

# get the community-related links
for label in d.xpath("//li[.//a/text() = 'Community']//li//a/text()"):
    print label.nodeValue

Of course, lxml should be able to do this kind of thing as well. I'd be
interested to know why this "is not a good idea", though.

Paul