Given the example re that you've been trying to get working, here is a pyparsing approach that might be more, um, approachable. Unfortunately, since I don't have the URL of the page you are working with, I'm unable to test this before posting.
Good luck,
-- Paul

# getMP3s.py
# get pyparsing at http://pyparsing.sourceforge.net
#
from pyparsing import *
import urllib

#~ r=re.compile(ur'valign=top>(?P<number>\d{1,2})</td><td[^>]*>\s{0,2}'
#~               ur'<a href="(?P<url>[^<>]+\.mp3)"( )target=_blank>'
#~               ur'(?P<name>.+)</td>',re.UNICODE|re.IGNORECASE)

tdStart,tdEnd = makeHTMLTags("td")
aStart,aEnd = makeHTMLTags("a")
number = Word(nums)
valign = CaselessLiteral("valign=top>")

mp3Entry = valign + number.setResultsName("number") + tdEnd + \
           tdStart + SkipTo(aStart) + aStart + \
           SkipTo(tdEnd) + tdEnd

# get list of mp3's
targetURL = "http://whatever"
targetPage = urllib.urlopen( targetURL )
targetHTML = targetPage.read()
targetPage.close()

for toks,s,e in mp3Entry.scanString(targetHTML):
    print toks.number, toks.starta.href
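
Since the real page isn't available, one way to try either approach offline is to run it against a hand-written table row. Here is a minimal sketch that does this for the original regex from the commented-out lines of the script. It is written for Python 3 rather than the Python 2 of the script, so the ur'' prefixes become r''. SAMPLE_HTML, its example.invalid URL, and the find_mp3s helper are made up for illustration; the real page's markup may well differ. The same sample string could presumably be handed to mp3Entry.scanString in place of targetHTML.

```python
import re

# Hypothetical table row, invented for testing; the real page may differ.
SAMPLE_HTML = (
    '<tr><td valign=top>7</td><td> '
    '<a href="http://example.invalid/track07.mp3" target=_blank>Track Seven</a>'
    '</td></tr>'
)

# The regex from the post, with Python 2's ur'' prefixes changed to r''.
MP3_RE = re.compile(
    r'valign=top>(?P<number>\d{1,2})</td><td[^>]*>\s{0,2}'
    r'<a href="(?P<url>[^<>]+\.mp3)"( )target=_blank>'
    r'(?P<name>.+)</td>',
    re.UNICODE | re.IGNORECASE,
)


def find_mp3s(html):
    """Return a list of (number, url, name) tuples matched in html."""
    return [(m.group("number"), m.group("url"), m.group("name"))
            for m in MP3_RE.finditer(html)]


if __name__ == "__main__":
    for number, url, name in find_mp3s(SAMPLE_HTML):
        print(number, url, name)
```

On this sample the name group comes back with the closing </a> still attached, because .+ runs on until the last </td> in the row.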