On Wed, Jul 8, 2009 at 3:06 PM, David<david.bra...@googlemail.com> wrote:
> Hi
>
> I have a few regexes I need to do, but I'm struggling to come up with a
> nice way of doing them, and more than anything am here to learn some
> tricks and some neat code rather than getting an answer - although
> that's obviously what I would like to get to.
>
> Problem 1 -
>
> <span class="chg"
> id="ref_678774_cp">(25.47%)</span><br>
>
> I want to extract 25.47 from here - so far I've tried -
>
> xPer = re.search('<span class="chg" id="ref_"'+str(xID.group(1))+'"_cp
> \">(.*?)%', content)
>
> and
>
> xPer = re.search('<span class=\"chg\" id=\"ref_"+str(xID.group(1))+"_cp
> \">\((\d*)%\)</span><br>', content)
>
> neither of these seems to do what I want - am I not doing this
> correctly? (obviously!)
>
> Problem 2 -
>
> <td> </td>
>
> <td width="1%" class=key>Open:
> </td>
> <td width="1%" class=val>5.50
> </td>
> <td> </td>
> <td width="1%" class=key>Mkt Cap:
> </td>
> <td width="1%" class=val>6.92M
> </td>
> <td> </td>
> <td width="1%" class=key>P/E:
> </td>
> <td width="1%" class=val>21.99
> </td>
>
>
> I want to extract the open, mkt cap and P/E values - but apart from
> doing loads of individual REs which I think would look messy, I can't
> think of a better and neater looking way. Any ideas?
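
On Problem 1, both attempts fail for mechanical reasons. The first leaves stray double quotes around the interpolated ID. In the second, the "+str(xID.group(1))+" sits inside a single-quoted string, so it is never evaluated, and \d* cannot match the decimal point in 25.47. A minimal corrected sketch, assuming the numeric ID is already available as a string (ref_id below is a hypothetical stand-in for xID.group(1)):

```python
import re

# Sample line from the question. ref_id is a hypothetical stand-in
# for xID.group(1); it is not a name from the original code.
content = '<span class="chg" id="ref_678774_cp">(25.47%)</span><br>'
ref_id = "678774"

# Raw strings keep the backslashes intact for the regex engine.
# re.escape() protects the interpolated ID, and \s+ tolerates a line
# break or extra spaces between the class and id attributes.
pattern = (
    r'<span class="chg"\s+id="ref_'
    + re.escape(ref_id)
    + r'_cp">\(([-+]?\d+(?:\.\d+)?)%\)'
)

match = re.search(pattern, content)
percent = float(match.group(1)) if match else None
print(percent)
```

This only works while the markup keeps exactly this attribute order and quoting, which is the usual weakness of scraping HTML with a regex.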
Use an actual HTML parser? Like BeautifulSoup
(http://www.crummy.com/software/BeautifulSoup/), for instance.

I will never understand why so many people try to parse/scrape
HTML/XML with regexes...

Cheers,
Chris
--
http://blog.rebertia.com
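
P.S. A rough sketch of the parser approach for Problem 2. It uses the standard library's html.parser instead of BeautifulSoup so that it runs without installing anything; BeautifulSoup would likely be shorter. The class name KeyValCells and its attributes are made up for illustration, and the sample markup is the fragment from the question.

```python
from html.parser import HTMLParser


class KeyValCells(HTMLParser):
    """Pair each <td class=key> cell with the <td class=val> cell after it."""

    def __init__(self):
        super().__init__()
        self.pairs = {}    # e.g. {"Open": "5.50"}
        self._cls = None   # class of the currently open <td>, if key/val
        self._text = []    # text chunks collected inside that cell
        self._key = None   # last key seen, still waiting for its value

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            cls = dict(attrs).get("class")
            self._cls = cls if cls in ("key", "val") else None
            self._text = []

    def handle_data(self, data):
        if self._cls:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != "td" or not self._cls:
            return
        text = "".join(self._text).strip()
        if self._cls == "key":
            self._key = text.rstrip(":")
        elif self._key is not None:
            self.pairs[self._key] = text
            self._key = None
        self._cls = None


html = """
<td> </td>
<td width="1%" class=key>Open:
</td>
<td width="1%" class=val>5.50
</td>
<td> </td>
<td width="1%" class=key>Mkt Cap:
</td>
<td width="1%" class=val>6.92M
</td>
<td> </td>
<td width="1%" class=key>P/E:
</td>
<td width="1%" class=val>21.99
</td>
"""

parser = KeyValCells()
parser.feed(html)
parser.close()
print(parser.pairs)
```

One pass over the document collects every key/value pair, so there is no need for a separate regex per field.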