Gabriel wrote:
> Hello,
>
> I'm new in Python and i would like to write script which need to login
> to a website. I'm experimenting with urllib2,
> especially with something like this:
>
> opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())
> urllib2.install_opener(opener)
>
> params = urllib.urlencode(dict(username='user', password='pass'))
> f = opener.open('https://web.com', params)
> data = f.read()
> f.close()
>
> And the problem is, that this code logs me in on some sites, but on
> others doesn't, especially on the one I really
> need to login. And i don't know why. So is there some way how to debug
> this code and find out why that script cannot
> login on that specific site?
>
> Sorry if this question is too lame, but i am really beginner both in
> python and web programming .)
>
That's actually pretty good code for a newcomer! There are a couple of
issues you may be running into.

First, not all sites use "application-based" authentication - they may
use HTTP authentication of some kind instead. In that case you have to
pass the username and password as a part of the HTTP headers. Michael
Foord has done a fair write-up of the issues at

  http://www.voidspace.org.uk/python/articles/authentication.shtml

and you will do well to read that if, indeed, you need to do basic
authentication.

Second, if it *is* the web application that's doing the authentication
on the sites that are failing (in other words, if the credentials are
passed in a web form) then your code may need adjusting to use other
field names, or to include other data as required by the login form
(hidden fields, for example). You can usually find out what's required
by reading the HTML source of the page that contains the login form.

Thirdly [nobody expects the Spanish Inquisition ...], it may be that
some sites are extraordinarily sensitive to programmed login attempts
(possibly because of spam), typically checking the "User-Agent:" HTTP
header to "make sure" that the login attempt is coming from a browser
and not a program. For sites like these you may need to emulate a
browser's request more fully.

To see what a real browser sends, you can use a program like Wireshark
to analyze the network traffic, though for an https site it is easier
to use one of the Firefox add-ons that show you the HTTP headers on
request and response. Comparing those headers with what your script
sends is usually the quickest way to find the difference.

regards
 Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC              http://www.holdenweb.com/
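
P.S. Since you asked how to debug this: here is a minimal sketch that touches all three points above. It is only an illustration, not a drop-in fix. The helper names `build_debug_opener` and `encode_form` are made up for this example, and the import fallback is my assumption so that the same code loads under urllib2 or under Python 3's urllib.request. Setting `debuglevel=1` on the HTTP and HTTPS handlers makes the library print the request it sends and the response headers it gets back, which you can then compare with what the browser sends.

```python
try:
    # Python 2, as used in the original question
    import urllib2 as request
    import cookielib as cookiejar
    from urllib import urlencode
except ImportError:
    # Python 3 moved the same classes into urllib.request and friends
    import urllib.request as request
    import http.cookiejar as cookiejar
    from urllib.parse import urlencode


def build_debug_opener(user_agent, basic_auth=None):
    """Return (opener, jar): keeps cookies, prints HTTP traffic, and sends
    the given User-Agent instead of the default "Python-urllib/x.y".

    basic_auth, if given, is a (url, user, password) tuple for sites that
    use HTTP basic authentication rather than a login form.
    """
    jar = cookiejar.CookieJar()
    handlers = [
        request.HTTPCookieProcessor(jar),
        request.HTTPHandler(debuglevel=1),   # print http:// traffic
        request.HTTPSHandler(debuglevel=1),  # print https:// traffic
    ]
    if basic_auth is not None:
        url, user, password = basic_auth
        manager = request.HTTPPasswordMgrWithDefaultRealm()
        manager.add_password(None, url, user, password)
        handlers.append(request.HTTPBasicAuthHandler(manager))
    opener = request.build_opener(*handlers)
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, jar


def encode_form(fields):
    """Encode (name, value) pairs as a POST body.

    The names must match the <input> names in the site's login form;
    the ones used in any example call are placeholders.
    """
    return urlencode(fields).encode("ascii")
```

You would then call something like `opener.open(login_url, encode_form([("username", "user"), ("password", "pass")]))`, where `login_url` is the form's `action` URL and the field names are whatever the form's HTML actually uses. After the call, iterating over `jar` shows which cookies the site set, which is a quick way to tell whether the login was accepted.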