On Fri, 29 Aug 2008, bruce wrote:

Hi John.

Thanks for your reply. I tried your suggestion of using RobustFactory, and
still get badly malformed HTML back! The HTML is listed below. I would

That's expected -- RobustFactory affects how mechanize parses the HTML. It does not modify the HTML you get back.


have thought that mechanize would have interpreted the
http-equiv="refresh". Unfortunately, mechanize apparently isn't able to
handle a <meta http-equiv="refresh" url="/foo/..."> when it's inside the
<body> of the HTML...

Yes, only the head element is read (albeit with a slightly fuzzy definition of "head element").

In a theoretical future unstable branch, that might change, but currently mechanize doesn't try all that hard to work well with bad HTML.

Currently, you have to work around this kind of issue. You can perform the refresh manually, or modify the HTML and call .set_response(), or replace the HTTPEquivProcessor with your own (you could use HTTPEquivProcessor itself -- you can pass a parser factory function to its constructor).
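
As a rough sketch of the first workaround (performing the refresh manually), something like the following could locate the refresh target wherever the meta element appears, including inside the body. It assumes Python 3 and uses only the standard library; MetaRefreshFinder and find_refresh_url are made-up names for illustration, not part of mechanize. It accepts both the standard content="0; url=..." form and the bare url="..." attribute quoted above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class MetaRefreshFinder(HTMLParser):
    """Record the target of the first <meta http-equiv="refresh"> seen,
    regardless of whether it sits in <head> or <body>."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attrs = dict(attrs)
        if (attrs.get("http-equiv") or "").lower() != "refresh":
            return
        # Standard form: content="5; url=/next"
        content = attrs.get("content") or ""
        _, _, rest = content.partition(";")
        rest = rest.strip()
        if rest.lower().startswith("url="):
            self.target = rest[4:].strip("'\" ")
        elif attrs.get("url"):
            # Nonstandard form from this thread: a bare url="..." attribute
            self.target = attrs["url"]


def find_refresh_url(html, base_url):
    """Return the absolute refresh URL found in html, or None."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    if finder.target is None:
        return None
    return urljoin(base_url, finder.target)
```

With a mechanize Browser instance br, the result of find_refresh_url(br.response().get_data(), br.geturl()) could then be passed to br.open() when it is not None. Depending on the mechanize and Python versions, get_data() may return bytes that need decoding first.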


John

--
http://mail.python.org/mailman/listinfo/python-list
