Re: Python Regex Question

crybaby Thu, 20 Sep 2007 16:56:15 -0700

On Sep 20, 4:12 pm, Tobiah <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
> > I need to extract the number in each <td> tag from an HTML file.
>
> > e.g. 49.950 from the following:
>
> > <td align=right width=80><font size=2 face="New Times
> > Roman,Times,Serif">&nbsp;49.950&nbsp;</font></td>
>
> > The actual number between the two &nbsp; entities can have any number
> > of digits before and after the decimal point.
>
> > <td align=right width=80><font size=2 face="New Times
> > Roman,Times,Serif">&nbsp;######.####&nbsp;</font></td>
>
> > How can I just extract the real/integer number using regex?
>
> '[0-9]*\.[0-9]*'

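For reference, a minimal sketch of how the suggested pattern behaves on the sample cell when it is already a plain string. The `anchored` pattern is my own assumption, not part of the reply above: it requires at least one digit, makes the fractional part optional so that integers match too, and relies on the value sitting between two &nbsp; entities as in the sample.

```python
import re

# Sample cell copied from the question, joined onto one line.
cell = ('<td align=right width=80><font size=2 face="New Times '
        'Roman,Times,Serif">&nbsp;49.950&nbsp;</font></td>')

# The suggested pattern: it needs a literal dot, so a bare integer never matches.
suggested = re.compile(r'[0-9]*\.[0-9]*')

# Assumed alternative: digits are required, the fraction is optional, and the
# match is anchored on the &nbsp; entities that surround the value.
anchored = re.compile(r'&nbsp;([0-9]+(?:\.[0-9]+)?)&nbsp;')

first = suggested.search(cell).group(0)   # text matched by the suggested pattern
value = anchored.search(cell).group(1)    # captured number only

# Same cell with an integer value, to compare the two patterns.
integer_cell = cell.replace('49.950', '1200')
integer_with_suggested = suggested.search(integer_cell)
integer_with_anchored = anchored.search(integer_cell).group(1)
```

Because every part of the suggested pattern except the dot is optional, it would also accept a lone "." if one appeared earlier in the markup, which is why the sketch anchors on the surrounding entities.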

I am trying to use BeautifulSoup:

    soup = BeautifulSoup(page)

    td_tags = soup.findAll('td')
    i=0
    for td in td_tags:
        i = i+1
        print "td: ", td
        # re.search('[0-9]*\.[0-9]*', td)
        price = re.compile('[0-9]*\.[0-9]*').search(td)

I am getting an error:

           price= re.compile('[0-9]*\.[0-9]*').search(td)
TypeError: expected string or buffer

Does BeautifulSoup return a list of objects? If so, how do I pass the
"td" instance as a string to re.search? What is the difference between
re.search and re.compile().search?
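My guess, as a rough sketch only: each "td" is a tag object rather than a string, and converting it with str() first should give the regex something to scan. The FakeTag class below is a hypothetical stand-in so the snippet runs without BeautifulSoup; I am assuming str() on a real tag returns its markup in the same way. As far as I can tell, re.search(pattern, text) and re.compile(pattern).search(text) give the same result, and the compiled form just lets the pattern object be reused.

```python
import re

class FakeTag(object):
    """Hypothetical stand-in for a parsed <td>; not BeautifulSoup's real class."""
    def __init__(self, markup):
        self.markup = markup

    def __str__(self):
        return self.markup

PATTERN = r'[0-9]*\.[0-9]*'
td = FakeTag('<td align=right width=80>&nbsp;49.950&nbsp;</td>')

# Passing the object itself reproduces the TypeError from the traceback.
try:
    re.compile(PATTERN).search(td)
    error = None
except TypeError as exc:
    error = exc

# Converting to text first gives the regex a string to scan.
text = str(td)
via_module = re.search(PATTERN, text)              # module-level helper
via_compiled = re.compile(PATTERN).search(text)    # explicit compiled pattern

# Both calls find the same text; turn it into a number if anything matched.
price = float(via_compiled.group(0)) if via_compiled else None
```

Compiling once outside the loop and reusing the pattern object for every "td" would avoid rebuilding it on each iteration.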

