"Benji99" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > > Basically, I'm getting a htmlsource from a URL and need to > a.) find specific URLs > b.) find specific data > c.) with specific URLs, load new html pages and repeat. > <snip> > > Basically, I want to search through the whole string( > htmlSource), for a specific keyword, when it's found, I want to > know which line it's on so that I can retrieve that line and > then I should be able to parse/extract what I need using Regular > Expressions (which I'm getting quite confortable with). So how > can this be accomplished? > If you download pyparsing (at http://pyparsing.sourceforge.net), you'll find in the examples something very close to this called urlextractor.py (lists out all href's and their associated links on the page at www.yahoo.com).
-- Paul