I'm trying to automatically log into a site and store the resulting HTML using Python. The site uses a login form and, before submitting it, hashes the password with some kind of MD5 hash.
These are the important parts of the form:

    <script language="JavaScript" src="/admin/javascript/md5.js"></script>
    <script language="JavaScript"><!--
    var pskey = "770F11B12EBB7D15058170FA3AD12E685D3A46112B841B1E7BE375F600E62705";
    //-->
    </script>
    <form name="LoginForm" action="/guardian/home.html" method="POST" target="_top" onsubmit="doStudentLogin(this);">
    <input type="hidden" name="pstoken" value="3895">
    <input type="text" name="account" value="" size="35">
    <input type="password" name="pw" value="" size="35">
    <input type="image" src="/images/btn_enter.gif" width="89" height="27" border="0" alt="Enter">
    </form>

This is the function called in md5.js:

    function doStudentLogin(form)
    {
        var pw = form.pw.value;
        var pw2 = pw; // Save a copy of the password preserving case
        pw = pw.toLowerCase();
        form.pw.value = hex_hmac_md5(pskey, pw);
        if (form.ldappassword!=null) {
            // LDAP is enabled, so send the clear-text password
            // Customers should have SSL enabled if they are using LDAP
            form.ldappassword.value = pw2; // Send the unmangled password
        }
        return true;
    }

I am not sure what ldappassword is or does. Can someone explain that?

Here's my code:

    from urllib import urlopen, urlencode
    import re
    import hmac

    account = 'account'
    psw = 'my password'
    url = "http://ps.pvcsd.org/guardian/home.html"

    # fetch the login page
    homepagetxt = urlopen("http://ps.pvcsd.org").read()

    # get key and pstoken from login page
    m = re.search('<input type="hidden" name="pstoken" value="(?P<id>[0-9]+)"', homepagetxt)
    token = m.group('id')
    m = re.search('var pskey = "(?P<id>[a-zA-Z0-9]+)"', homepagetxt)
    key = m.group('id')

    # encrypt the password (hmac.new defaults to MD5 here)
    hobj = hmac.new(key, psw)
    psw = hobj.hexdigest()

    # post the form fields to the login URL
    data = { 'pstoken' : token, 'account' : account, 'pw' : psw }
    e = urlencode(data)
    page = urlopen(url, e)
    txt = page.read()

    # save the returned page
    f = open("text.html", 'w')
    f.write(txt)
    f.close()

This doesn't work, however. It just sends me back to the main login page and doesn't say "invalid password" or anything.
I've checked, and yes, the Python hmac function produces the same result (the hashed password) as the md5.js file. Does anyone know what I am doing wrong?
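
For anyone who wants to repeat that check offline, here is a minimal Python 3 sketch of what I understand doStudentLogin to do with the form fields. The function names (extract_login_fields, hash_password, build_login_data) are my own and purely illustrative. It assumes that hex_hmac_md5 uses the pskey string as-is for the HMAC key (its characters, not hex-decoded bytes) and that the password is plain ASCII; I have not confirmed either against md5.js.

```python
import hashlib
import hmac
import re


def extract_login_fields(html):
    """Return (pstoken, pskey) found in the login page HTML."""
    token = re.search(r'name="pstoken" value="([0-9]+)"', html).group(1)
    key = re.search(r'var pskey = "([a-zA-Z0-9]+)"', html).group(1)
    return token, key


def hash_password(pskey, password):
    """Mirror doStudentLogin: lowercase the password, then HMAC-MD5 it."""
    # Assumption: the key is the pskey text itself, not its hex-decoded bytes.
    key_bytes = pskey.encode("ascii")
    # doStudentLogin calls pw.toLowerCase() before hashing.
    # Assumption: ASCII password; other characters may hash differently in md5.js.
    msg_bytes = password.lower().encode("utf-8")
    # Python 3 requires the digest to be named explicitly.
    return hmac.new(key_bytes, msg_bytes, hashlib.md5).hexdigest()


def build_login_data(html, account, password):
    """Build the dict of form fields that would be POSTed to the login URL."""
    token, key = extract_login_fields(html)
    return {
        "pstoken": token,
        "account": account,
        "pw": hash_password(key, password),
    }
```

Note that this sketch lowercases the password first, as the JavaScript does, while my script above hashes the password exactly as typed. The two only agree when the password has no uppercase letters.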
-- http://mail.python.org/mailman/listinfo/python-list