Diez B. Roggisch wrote:
> It makes no sense having urllib generate exceptions for such a case. From
> its point of view, things worked perfectly - it got a result. No network
> error or anything of the sort.
>
> It's your application that is not happy with the result - so it has to
> figure that out by itself.
>
> You could, for instance, check what kind of result you got using the Unix
> file command - it will tell you that you received an HTML file, not a .deb.
>
> Or check the mimetype returned - it's text/html in your error case, and
> most probably something like application/octet-stream otherwise.
>
> Regards,
>
> Diez
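As a minimal sketch of the mimetype check (Python 3 urllib.request here; the URL and output filename are placeholders, not from the original post):

    import urllib.request

    PACKAGE_URL = "http://example.com/some-package.deb"  # hypothetical URL

    with urllib.request.urlopen(PACKAGE_URL) as resp:
        # resp.headers is an email.message.Message, so we can ask it
        # for the Content-Type the server declared.
        content_type = resp.headers.get_content_type()
        if content_type == "text/html":
            # The server sent back an HTML page (probably an error page)
            # instead of the package we asked for.
            raise RuntimeError("Expected a .deb, got %s" % content_type)
        data = resp.read()

    with open("package.deb", "wb") as f:
        f.write(data)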
Also be aware that many webservers (especially IIS ones) are configured to return some kind of custom page instead of a stock 404, and you might be getting a 200 status code even though the page you requested is not there. So depending on what site you are scraping, you might have to read the page you got back to figure out if it's what you wanted.

--
Wade Leftwich
Ithaca, NY
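A rough sketch of that kind of content sniffing (Python 3; the URL and the markers checked are just examples - adjust them to whatever the target site actually returns):

    import urllib.request

    def looks_like_error_page(body):
        # Crude heuristic: a binary download shouldn't start with HTML
        # or mention "not found" in its first couple of kilobytes.
        text = body[:2048].lower()
        return b"<html" in text or b"not found" in text

    with urllib.request.urlopen("http://example.com/file.deb") as resp:
        body = resp.read()
        if resp.status == 200 and looks_like_error_page(body):
            print("Got a 200, but the content looks like a custom error page")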