[EMAIL PROTECTED] writes:

> I am trying to extract some information from a few web pages, and I was
> using the HTMLParser module. It worked fine until it got to the
> javascript, at which point it gave a parse error. Is there a good way to
> work around this or should I just preparse the file to remove the
> javascript manually? This is my first Python program.

sgmllib is very similar to HTMLParser, but doesn't break so easily
(but sgmllib has some problems with XHTML -- swings and roundabouts).

Or try BeautifulSoup, which is designed to cope with badly formed HTML.
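
If you would rather stay with the standard library, here is a rough
sketch of skipping the javascript inside the parser instead of
preparsing the file. It assumes Python 3, where the module is named
html.parser (it was HTMLParser in Python 2), and the names TextExtractor
and extract_text are made up for the example. Treat it as a starting
point, not a tested recipe for your particular pages.

```python
from html.parser import HTMLParser  # Python 3 name; "HTMLParser" module in Python 2


class TextExtractor(HTMLParser):
    """Hypothetical helper: collect text, ignoring <script> and <style> bodies."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside a skipped element
        self.chunks = []       # non-empty text fragments seen so far

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text that is outside script/style and not pure whitespace.
        text = data.strip()
        if self._skip_depth == 0 and text:
            self.chunks.append(text)


def extract_text(html):
    """Return the list of text fragments found outside script/style."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks
```

Calling extract_text(page_source) on a page gives you the text
fragments in document order; from there you can pick out whatever
information you are after.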


John
-- 
http://mail.python.org/mailman/listinfo/python-list
