On Tue, Mar 6, 2012 at 2:43 PM, John Salerno <johnj...@gmail.com> wrote: > I sort of have to work with what the website gives me (as you'll see below), > but today I encountered an exception to my RE. Let me just give all the > specific information first. The point of my script is to go to the specified > URL and extract song information from it. > > This is my RE: > > song_pattern = re.compile(r'([0-9]{1,2}:[0-9]{2} > [a|p].m.).*?<a.*?>(.*?)</a>.*?<a.*?>(.*?)</a>', re.DOTALL)
I would advise against using regular expressions to "parse" HTML: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags lxml is a popular choice for parsing HTML in Python: http://lxml.de Cheers, Chris -- http://mail.python.org/mailman/listinfo/python-list