Steven Bethard <[EMAIL PROTECTED]> writes:
> I'd hate to steer a potential new Python developer to a clumsier

"clumsier"??? Try to parse this with your program:

page2 = '''
<html><head><title>URLs</title></head>
<body>
<ul>
<li><a href="http://domain1/page1">some page1</a></li>
<li><a href="http://domain2/page2">some page2</a></li>
</body></html>
'''

> library when Python 2.5 includes ElementTree::
>
>     import xml.etree.ElementTree as etree
>
>     page = '''
>     <html><head><title>URLs</title></head>
>     <body>
>     <ul>
>     <li><a href="http://domain1/page1">some page1</a></li>
>     <li><a href="http://domain2/page2">some page2</a></li>
>     </ul>
>     </body></html>
>     '''
>
>     tree = etree.fromstring(page)
>     for a_node in tree.getiterator('a'):
>         url = a_node.get('href')
>         if url is not None:
>             print url

It could even be a one-liner:

print "\n".join((url.get('href', '') for url in tree.findall(".//a")))

But as far as HTML (not XML) is concerned, this is not a very realistic
solution.

> I know that the wiki page is supposed to be Python 2.4 only, but I'd
> rather have no example than an outdated one.

This example is by no means "outdated".

--
Regards,
Rob