Senthil Kumaran <sent...@uthcode.com> added the comment:

The concern here is what happens if the request line looks like this:

    Method SP Request-URI SP HTTP-Version <ANY_\r_\n_\r\n_Combination>\r\n

The previous behavior would have resulted in

    Method SP Request-URI SP HTTP-Version <ANY_\r_\n_\r\n_Combination>

that is, removing only the final \r\n, whereas the current change makes it

    Method SP Request-URI SP HTTP-Version

that is, it removes every trailing \r\n combination.

BTW, one thing to note: this applies only to the request line, not to the header lines. For the request line, both the HTTP 1.0 and HTTP 1.1 specs have this in section 5.1:

    5.1 Request-Line

    The Request-Line begins with a method token, followed by the
    Request-URI and the protocol version, and ending with CRLF. The
    elements are separated by SP characters. No CR or LF are allowed
    except in the final CRLF sequence.

        Request-Line = Method SP Request-URI SP HTTP-Version CRLF

This leads me to believe that removing all the trailing \r\n is a fine thing to do and should not be harmful.

Just to augment this with a few other things I found while (re-)reading the spec: this advice is different from the advice for a header's trailing whitespace, which is called linear white space (LWS). If the Host header looks like, e.g., "Host: www.foo.com \r\n" (notice the trailing white space), then according to RFC 2616 (HTTP 1.1), section 4.2 Message Headers:

    The field-content does not include any leading or trailing LWS:
    linear white space occurring before the first non-whitespace
    character of the field-value or after the last non-whitespace
    character of the field-value. Such leading or trailing LWS MAY be
    removed without changing the semantics of the field value.

RFC 1945 (HTTP 1.0), section 4.2 Message Headers, does not make such an explicit statement.

My guess about the former behavior in http/server.py is that the Request-Line was thought to follow something like section 4.2 of the HTTP 1.0 spec, so only the last two characters were removed. But actually, per the spec, the request line should have only one CRLF, as its terminator.

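To make the difference concrete, here is a minimal sketch of the two request-line behaviors and of header LWS stripping. The function names are hypothetical and written only for illustration; they are not the code in http/server.py, and the "old" variant is just my reading of the behavior described above.

```python
def strip_final_crlf(raw: str) -> str:
    """Assumed older behaviour: drop only the final CRLF (or a lone LF)."""
    if raw.endswith("\r\n"):
        return raw[:-2]
    if raw.endswith("\n"):
        return raw[:-1]
    return raw


def strip_all_trailing_crlf(raw: str) -> str:
    """Newer behaviour: drop every trailing CR and LF character."""
    return raw.rstrip("\r\n")


def strip_header_lws(value: str) -> str:
    """Remove leading and trailing SP/HT from a header field value."""
    return value.strip(" \t")


# A request line followed by extra CR/LF characters before the final CRLF.
line = "GET /index.html HTTP/1.1\r\n\r\r\n"
old = strip_final_crlf(line)
new = strip_all_trailing_crlf(line)
print(repr(old))
print(repr(new))

# A header line with trailing white space before the CRLF.
header = "Host: www.foo.com \r\n"
name, _, value = header.rstrip("\r\n").partition(":")
print(repr(strip_header_lws(value)))
```

With the old variant the stray CR/LF characters stay in the request line and reach whatever parses it next; with the new variant only the Method, Request-URI and HTTP-Version remain.
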
In the docstring of the BaseHTTPServer class there is a mention of preserving the trailing white space, but it does not point to any authoritative reference, so I am not sure that taking the docstring as a reference for preserving the behavior is a good idea.

Before delving into the reason, I was wondering whether reverting the patch in 2.7 and 3.1 would be a good idea. But given that the change has support from the older specs through to the newer ones, I am inclined to think that leaving the change as it is (without reverting) should be fine as well. Only if we find a stronger backwards-compatibility argument for leaving the trailing \r\n in the request line should we remove the change in 2.7 and 3.2; otherwise we can leave it as it is.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue13294>
_______________________________________