Bugs item #1153027, was opened at 2005-02-27 20:16
Message generated for change (Comment added) made by jjlee
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1153027&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: Python Library
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: pristine777 (pristine777)
Assigned to: Nobody/Anonymous (nobody)
Summary: http_error_302() crashes with 'HTTP/1.1 400 Bad Request

Initial Comment:
I was able to get to a website by using both IE and FireFox, but my
Python code kept giving an HTTP 400 Bad Request error. To debug, I called
set_http_debuglevel(1), as in the following code:

    hh = urllib2.HTTPHandler()
    hh.set_http_debuglevel(1)
    opener = urllib2.build_opener(hh, urllib2.HTTPCookieProcessor(self.cj))

The printed debug messages show that this crash happens when there is
a space in the redirected location. Here's a cut-and-paste of the
relevant debug messages (note the line starting with send, which is what
http_error_302 is sending):

    reply: 'HTTP/1.1 302 Moved Temporarily\r\n'
    header: Connection: close
    header: Date: Sun, 27 Feb 2005 19:52:51 GMT
    header: Server: Microsoft-IIS/6.0
    <---other header data-->
    send: 'GET /myEmail/User?asOf=02/26/2005 11:38:12 PM& ddn=87cb51501730
    <---remaining header data-->
    reply: 'HTTP/1.1 400 Bad Request\r\n'
    header: Content-Type: text/html
    header: Date: Sun, 27 Feb 2005 19:56:45 GMT
    header: Connection: close
    header: Content-Length: 20

To fix this, I first tried to encode the redirected location in the
function http_error_302() in urllib2 using the methods urllib.quote
and urllib.urlencode, but to no avail (they encode other data as well).

A temporary solution that works is to replace any space in the
redirected URL by '%20'. Below is a snippet of the function
http_error_302 in urllib2 with this suggested fix:

    def http_error_302(self, req, fp, code, msg, headers):
        # Some servers (incorrectly) return multiple Location headers
        # (so probably same goes for URI). Use first header.
        if 'location' in headers:
            newurl = headers.getheaders('location')[0]
        elif 'uri' in headers:
            newurl = headers.getheaders('uri')[0]
        else:
            return
        newurl = newurl.replace(' ', '%20')  # <<< TEMP FIX - inserting this line temporarily fixes this problem
        newurl = urlparse.urljoin(req.get_full_url(), newurl)
        <--- remainder of this function -->

Thanks!

----------------------------------------------------------------------

Comment By: John J Lee (jjlee)
Date: 2005-05-19 20:30

Message:
Logged In: YES
user_id=261020

Sure, but if Firefox and IE do it, probably we should do the same.

I think cookielib.escape_path(), or something similar (perhaps without
the case normalisation), is probably the right thing to do. That's not
part of any documented API; I suppose that function or a similar one
should be added to module urlparse, and used by urllib2 and urllib when
redirecting.

----------------------------------------------------------------------

Comment By: Jeff Epler (jepler)
Date: 2005-03-01 17:41

Message:
Logged In: YES
user_id=2772

When the server sends the 302 response with
'Location: http://example.com/url%20with%20whitespace', urllib2 seems
to work just fine.

I believe based on reading rfc2396 that a URL that contains spaces must
contain quoted spaces (%20) not literal spaces, because space is not an
"unreserved character" [2.3] and "[d]ata must be escaped if it does not
have a representation using an unreserved character" [2.4].

----------------------------------------------------------------------
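
The escaping discussed in the comments above (encode literal spaces in the
Location value, but leave reserved characters and existing %XX escapes
alone) can be sketched roughly as follows. This is only an illustration
under stated assumptions, not the fix that went into urllib2: it is written
against Python 3's urllib.parse rather than the Python 2.4 modules in this
report, and the names escape_redirect_url and SAFE_CHARS, as well as the
exact set of safe characters, are hypothetical.

```python
from urllib.parse import quote, urljoin

# Characters passed through unchanged: URI reserved punctuation plus '%',
# so that escapes already present (e.g. %20) are not encoded a second time.
# This set is an assumption for illustration; it is not taken from urllib2
# or from cookielib.escape_path().
SAFE_CHARS = "%/:=&?~#+!$,;'@()*[]"


def escape_redirect_url(base_url, location):
    """Percent-encode unsafe characters in a Location value, then resolve
    it against the URL of the request that was redirected."""
    escaped = quote(location, safe=SAFE_CHARS)
    return urljoin(base_url, escaped)


if __name__ == "__main__":
    # Hypothetical host; the path is the one from the debug output above.
    print(escape_redirect_url(
        "http://example.com/login",
        "/myEmail/User?asOf=02/26/2005 11:38:12 PM&ddn=87cb51501730",
    ))
```

Unlike urllib.quote with its default arguments, this leaves '?', '=', '&'
and ':' intact, which is the problem the submitter ran into. A stray '%'
that is not part of a valid escape is passed through as-is; a stricter
version would need to handle that case.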