Thank you, your code using pyparsing works very well. Now I get the "number" and the "url", but I still want to get the "name". I'll look into pyparsing and see how to get the "name" from the HTML. But I hope you can enlighten me one more time, since I'm not familiar with the pyparsing module.
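My untested guess, going by the quoted code, is that the text between the <a ...> tag and its closing </a> could be given a results name, the same way "number" is. The sketch below assumes the name is exactly that link text; the sample HTML row, the example.com URL, and the "entries" list are made up for illustration, since the real page is not shown here.

```python
# Sketch only: sampleHTML is a made-up row, not the real page.
from pyparsing import CaselessLiteral, SkipTo, Word, makeHTMLTags, nums

tdStart, tdEnd = makeHTMLTags("td")
aStart, aEnd = makeHTMLTags("a")

number = Word(nums)
valign = CaselessLiteral("valign=top>")

# Same shape as the quoted grammar, but the link text (everything up to
# the closing </a>) is captured under the results name "name".
mp3Entry = (valign + number.setResultsName("number") + tdEnd
            + tdStart + SkipTo(aStart) + aStart
            + SkipTo(aEnd).setResultsName("name") + aEnd)

# Hypothetical input, shaped after the commented-out regular expression.
sampleHTML = ('<tr><td valign=top>1</td><td width=300> '
              '<a href="http://example.com/song.mp3" target=_blank>'
              'Some Song</a></td></tr>')

entries = []
for toks, s, e in mp3Entry.scanString(sampleHTML):
    # Attributes of the <a> tag, such as href, are available as named results.
    entries.append((toks["number"], toks["href"], toks["name"].strip()))

for entry in entries:
    print("%s %s %s" % entry)
```

If the name on the real page also includes text after the </a>, skipping to tdEnd instead of aEnd would presumably be needed, with the leftover tag stripped afterwards.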
On 15 Aug 2005 21:15:02 -0700, Paul McGuire <[EMAIL PROTECTED]> wrote:
> Given the example re that you've been trying to get working, here is a
> pyparsing approach that might be more, um, approachable.
> Unfortunately, since I don't have the URL of the page you are working
> with, I'm unable to test this before posting.
>
> Good luck,
> -- Paul
>
> # getMP3s.py
> # get pyparsing at http://pyparsing.sourceforge.net
> #
>
> from pyparsing import *
> import urllib
>
> #~ r=re.compile(ur'valign=top>(?P<number>\d{1,2})</td><td[^>]*>\s{0,2}'
> #~ ur'<a href="(?P<url>[^<>]+\.mp3)"( )target=_blank>'
> #~ ur'(?P<name>.+)</td>',re.UNICODE|re.IGNORECASE)
>
> tdStart,tdEnd = makeHTMLTags("td")
> aStart,aEnd = makeHTMLTags("a")
>
> number = Word(nums)
> valign = CaselessLiteral("valign=top>")
>
> mp3Entry = valign + number.setResultsName("number") + tdEnd + \
>            tdStart + SkipTo(aStart) + aStart + \
>            SkipTo(tdEnd) + tdEnd
>
> # get list of mp3's
> targetURL = "http://whatever"
> targetPage = urllib.urlopen( targetURL )
> targetHTML = targetPage.read()
> targetPage.close()
>
> for toks,s,e in mp3Entry.scanString(targetHTML):
>     print toks.number, toks.starta.href
>
> --
> http://mail.python.org/mailman/listinfo/python-list