Re: Is Python good for web crawlers?

Xavier Morel Tue, 07 Feb 2006 13:35:51 -0800

Paul Rubin wrote:
> Generally I use urllib.read() to get
> the whole html page as a string, then process it from there.  I just
> look for the substrings I'm interested in, making no attempt to
> actually parse the html into a DOM or anything like that.
 >
BeautifulSoup works *really* well when you want to parse the source 
(e.g. when you don't want to rely on string matching, or when the 
structures you're looking for are too complicated for a simple 
substring search).


The API of the package is extremely simple, straightforward and... obvious.
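
To give an idea of what "parse instead of substring-match" looks like, here
is a minimal sketch that pulls the href out of every link on a page. Several
things here are my assumptions, not anything from the thread: the
extract_links name, the SAMPLE markup, and the modern bs4 package name (the
import in 2006-era releases was different). If bs4 isn't installed, the
sketch falls back to the standard library's html.parser, which is a stand-in
and not BeautifulSoup itself.

```python
from html.parser import HTMLParser

try:
    # Assumption: the modern package name; older releases used another import.
    from bs4 import BeautifulSoup
except ImportError:
    BeautifulSoup = None  # use the standard-library fallback below


class _LinkCollector(HTMLParser):
    """Stdlib fallback: record the href of every <a> start tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)


def extract_links(html):
    """Return the href of every <a> tag that has one, in document order."""
    if BeautifulSoup is not None:
        soup = BeautifulSoup(html, "html.parser")
        # href=True keeps only the <a> tags that carry an href attribute.
        return [a["href"] for a in soup.find_all("a", href=True)]
    collector = _LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Hypothetical page: mixed case, mixed quoting, an anchor without href,
# and a link inside a comment that a naive substring search would catch.
SAMPLE = """
<html><body>
  <p>See <a class="ext" href="http://example.com/a">first</a> and
  <A HREF='/b'>second</A>. <a name="anchor">no link here</a></p>
  <!-- <a href="/commented-out">ignored</a> -->
</body></html>
"""

if __name__ == "__main__":
    print(extract_links(SAMPLE))
```

The sample is deliberately awkward for substring search: a plain search for
'href="' would miss the upper-case, single-quoted link and would pick up the
one inside the comment, while a parser handles both cases without extra code.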
-- 
http://mail.python.org/mailman/listinfo/python-list
