"mtuller" <[EMAIL PROTECTED]> on 10 Feb 2007 15:03:36 -0800 didst step forth and proclaim thus:
> Alright. I have tried everything I can find, but am not getting
> anywhere. I have a web page that has data like this:
[snip]
> What is shown is only a small section.
>
> I want to extract the 33,699 (which is dynamic) and set the value to a
> variable so that I can insert it into a database.
[snip]
> I have also tried Beautiful Soup, but had trouble understanding the
> documentation.

====================
from BeautifulSoup import BeautifulSoup as parser

# Parse the fragment of the page that holds the value.
soup = parser("""<tr >
<td headers="col1_1" style="width:21%" >
<span class="hpPageText" >LETTER</span></td>
<td headers="col2_1" style="width:13%; text-align:right" >
<span class="hpPageText" >33,699</span></td>
<td headers="col3_1" style="width:13%; text-align:right" >
<span class="hpPageText" >1.0</span></td>
<td headers="col4_1" style="width:13%; text-align:right" >
</tr>""")

# Find the first td with headers="col2_1", take the text inside its span,
# drop the thousands separator and convert to an int.
value = \
    int(soup.find('td', headers='col2_1').span.contents[0].replace(',', ''))
====================

> Thanks,
> Mike

Hope that helped. This code assumes there aren't any td tags with
headers=col2_1 that come before the value you are trying to extract.
find() returns only the first match, so an earlier td with the same
headers attribute would be picked up instead. There are several ways to
do things in BeautifulSoup. You should play around with BeautifulSoup in
the interactive prompt. It's simply awesome if you don't need speed on
your side.

-- 
Sam Peterson
skpeterson At nospam ucdavis.edu
"if programmers were paid to remove code instead of adding it,
software would be much better" -- unknown
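
P.S. If you would rather check the "no earlier td with headers=col2_1"
assumption than trust it, here is a rough sketch of one way to do that.
It swaps BeautifulSoup for the standard library's html.parser (Python 3),
purely so it runs without extra installs, and it raises an error unless
exactly one matching cell is found. The names CellGrabber, extract_count
and store_count, and the sqlite3 table "counts", are made up for
illustration; your real database and schema will differ.

```python
import sqlite3
from html.parser import HTMLParser

HTML = """<tr >
<td headers="col1_1" style="width:21%" >
<span class="hpPageText" >LETTER</span></td>
<td headers="col2_1" style="width:13%; text-align:right" >
<span class="hpPageText" >33,699</span></td>
<td headers="col3_1" style="width:13%; text-align:right" >
<span class="hpPageText" >1.0</span></td>
<td headers="col4_1" style="width:13%; text-align:right" >
</tr>"""


class CellGrabber(HTMLParser):
    """Collect the text of every <td> whose headers attribute matches."""

    def __init__(self, headers):
        super().__init__()
        self.headers = headers
        self.in_cell = False
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            # A new cell starts: remember whether it is one we want.
            self.in_cell = dict(attrs).get("headers") == self.headers
            if self.in_cell:
                self.values.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.values[-1] += data


def extract_count(html, headers="col2_1"):
    """Return the cell value as an int; fail loudly on 0 or 2+ matches."""
    grabber = CellGrabber(headers)
    grabber.feed(html)
    grabber.close()
    if len(grabber.values) != 1:
        raise ValueError("expected exactly one td with headers=%r, found %d"
                         % (headers, len(grabber.values)))
    return int(grabber.values[0].strip().replace(",", ""))


def store_count(conn, count):
    """Insert the value into a hypothetical single-column table."""
    conn.execute("CREATE TABLE IF NOT EXISTS counts (value INTEGER)")
    conn.execute("INSERT INTO counts (value) VALUES (?)", (count,))
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory stand-in for a real database
    store_count(conn, extract_count(HTML))
    print(conn.execute("SELECT value FROM counts").fetchall())
```

The "?" placeholder lets the database driver handle the value, which is
safer than building the SQL string by hand once the number comes from a
page you don't control.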