Re: B-Soup: broken iterator, tag a keyword?

Stefan Behnel Thu, 10 Jul 2008 22:36:25 -0700

Hi,

Brendan wrote:
> I have the following using Beautiful Soup:
> 
> soup = BeautifulSoup(data)
> tags = soup.findAll(href=re.compile("/MER_FRS_L2_Canada/MER_FRS_\S+gz"))
> for tag in tags:
>     print tag['href']
>     print tag.parent.nextSibling.string
>     print tag.parent.nextSibling.nextSibling.string
>     print tag.parent.nextSibling.nextSibling.nextSibling.string
>     print tag.parent.nextSibling.nextSibling.nextSibling.nextSibling.contents[0].string


It's very unlikely that the name "tag" is the problem here; it isn't a Python
keyword. But since you didn't say what the actual problem is, let me suggest
that, in general, you don't parse markup with regular expressions (which BS
does). Use a real XML/HTML parser for that. lxml will work just fine (and it
also has a nicer API).

http://codespeak.net/lxml/
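
A rough sketch of how the loop above might look with lxml.html. The sample
markup and the helper name matching_rows are made up for illustration, since
the original post doesn't show the page being parsed; I'm assuming each link
sits in a table cell followed by sibling cells. It's written for Python 3.

```python
import re
from lxml import html

# Hypothetical sample markup; the real page is not shown in the post.
SAMPLE = """
<table>
  <tr>
    <td><a href="/MER_FRS_L2_Canada/MER_FRS_example1.gz">file 1</a></td>
    <td>2008-07-01</td>
    <td>12 MB</td>
  </tr>
  <tr>
    <td><a href="/other/readme.txt">readme</a></td>
    <td>2008-07-02</td>
    <td>1 KB</td>
  </tr>
</table>
"""

# Same pattern as in the quoted code, written as a raw string.
PATTERN = re.compile(r"/MER_FRS_L2_Canada/MER_FRS_\S+gz")


def matching_rows(markup):
    """Return (href, [text of the following sibling cells]) per matching link."""
    doc = html.fromstring(markup)
    rows = []
    for link in doc.iter("a"):
        href = link.get("href", "")
        if not PATTERN.search(href):
            continue
        cell = link.getparent()  # the element that contains the link
        # itersiblings() yields only the following sibling *elements*;
        # whitespace between tags is not a node of its own in lxml.
        following = [sib.text_content().strip() for sib in cell.itersiblings()]
        rows.append((href, following))
    return rows


rows = matching_rows(SAMPLE)
for href, cells in rows:
    print(href, cells)
```

Because whitespace between tags is stored as text on the neighbouring
elements rather than as separate nodes, there's no need to chain several
nextSibling steps just to skip over it.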

Stefan
--
http://mail.python.org/mailman/listinfo/python-list
