New submission from C. Scott Ananian <[EMAIL PROTECTED]>:

Although HTTP/1.1 says that servers SHOULD send a Connection: close header if they intend to close a persistent connection (sec 8.1.2.1), clients (like httplib) "MUST be able to recover from asynchronous close events. Client software SHOULD reopen the transport connection and retransmit the aborted sequence of requests without user interaction so long as the request sequence is idempotent" (sec 8.1.4), since servers MAY close a persistent connection after a request due to a timeout or other reason. Further, "Clients and servers SHOULD both constantly watch for the other side of the transport close, and respond to it as appropriate." (sec 8.1.4).
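As an illustration of "watching for the other side of the transport close", here is a minimal sketch of how a client might check an idle, plain TCP socket before reusing it. It is written for Python 3; `peer_has_closed` is a hypothetical helper name, not something httplib provides, and the approach (a zero-timeout `select` followed by a `MSG_PEEK` read) is just one possible technique. It assumes an unencrypted socket, since SSL sockets do not accept `recv` flags.

```python
import select
import socket


def peer_has_closed(sock):
    """Return True if the peer appears to have closed an idle connection.

    Hypothetical helper for illustration only.
    """
    # Zero timeout: poll without blocking.
    readable, _, _ = select.select([sock], [], [], 0)
    if not readable:
        # Nothing pending: neither data nor a FIN has arrived.
        return False
    try:
        # Peek so that any real data stays in the receive buffer.
        # An empty result means the peer sent a FIN (orderly close).
        return sock.recv(1, socket.MSG_PEEK) == b""
    except ConnectionError:
        # A reset also means the connection is unusable.
        return True
```

A check like this cannot remove the problem entirely: the server can still close the connection between the check and the next request, so some form of retry remains necessary.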

httplib currently does not detect when the server closes its side of the connection until the following bit of HTTPResponse in httplib.py (Python 2.5.2) runs:

    def _read_status(self):
        # Initialize with Simple-Response defaults
        line = self.fp.readline()
        ...
        if not line:
            # Presumably, the server closed the connection before
            # sending a valid response.
            raise BadStatusLine(line)
        ...

So we end up raising a BadStatusLine exception for a completely permissible case: the server closed a persistent connection. This causes persistent connections to fail for users of httplib in mysterious ways, especially if they are behind proxies: Squid, for example, seems to limit persistent connections to a maximum of 3 requests and then closes the connection, causing future requests to raise BadStatusLine.

There appears to be code attempting to fix this problem in HTTPConnection.request(), but it doesn't always work. RFC793 says, "If an unsolicited FIN arrives from the network, the receiving TCP can ACK it and tell the user that the connection is closing. The user will respond with a CLOSE, upon which the TCP can send a FIN to the other TCP after sending any remaining data." (sec 3.5 case 2). The key phrase here is "after sending any remaining data": Python is usually allowed to put the request on the network without raising a socket.error; the close is not signaled to Python until HTTPResponse.begin() invokes HTTPResponse._read_status().

It would be best to extend the retry logic in request() to cover this case as well: store away the previous parameters to request() and, if _read_status() fails, invoke HTTPConnection.close(), HTTPConnection.connect(), and HTTPConnection.request(stored_params), then retry HTTPConnection.getresponse(). But at the very least Python should document and raise an EAGAIN exception of some kind so that the caller can distinguish this case from an actual bad status line and retry the request.
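
The retry sequence proposed above could be sketched on the caller's side roughly as follows. This is an illustration under stated assumptions, not httplib's implementation: it targets Python 3, where httplib is named `http.client` and (since 3.5) a close before any response is reported as `http.client.RemoteDisconnected`, a subclass of both `BadStatusLine` and `ConnectionResetError`. The names `request_with_retry` and `IDEMPOTENT_METHODS` are hypothetical, and the sketch assumes the body is bytes or a string that can safely be sent a second time.

```python
import http.client

# Methods that RFC 2616 sec 9.1.2 treats as idempotent.
IDEMPOTENT_METHODS = frozenset(
    ["GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"]
)


def request_with_retry(conn, method, url, body=None, headers=None, retries=1):
    """Send a request, reconnecting and resending if the server dropped
    the persistent connection. Hypothetical helper for illustration only.
    """
    headers = dict(headers or {})  # the stored parameters for a resend
    attempt = 0
    while True:
        try:
            conn.request(method, url, body, headers)
            return conn.getresponse()
        except (http.client.RemoteDisconnected,
                ConnectionResetError,
                BrokenPipeError):
            # The server closed its side; this end must be closed too.
            conn.close()
            # Only idempotent requests may be retransmitted automatically
            # (RFC 2616 sec 8.1.4), and only a bounded number of times.
            if method.upper() not in IDEMPOTENT_METHODS or attempt >= retries:
                raise
            attempt += 1
            conn.connect()
```

With an `http.client.HTTPConnection` instance `conn`, a call such as `request_with_retry(conn, "GET", "/")` would transparently survive one server-side close, while a POST would still surface the exception so the caller can decide whether resending is safe.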

----------
components: Library (Lib)
messages: 71221
nosy: cananian
severity: normal
status: open
title: httplib persistent connections violate MUST in RFC2616 sec 8.1.4.
versions: Python 2.5

_______________________________________
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3566>
_______________________________________