karl added the comment:

hehe. No hard feelings. I still do not think it is a good idea to test the 
"error code" and its associated message in the same test. :)

For example, in RFC2616, 414 is defined as 

    414 Request-URI Too Long

and in HTTP/1.1bis (which will not get a new version number, because the 
goal of the work was just to clarify and not to make incompatible changes), 
414 is defined as 

    414 URI Too Long

which is fine because the message is optional. But since the current tests 
check the message together with the code, the message becomes hard to modify :)
http://hg.python.org/cpython/file/3.3/Lib/http/server.py#l627
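
To make the point concrete, here is a rough sketch of a test that checks only the status code and ignores the message. The helper parse_status_line and the test class are hypothetical names made up for this example, not something from Lib/test or http.server.

```python
import unittest


def parse_status_line(line):
    """Split an HTTP status-line into (version, code, reason).

    Hypothetical helper for illustration; not part of http.server.
    """
    # maxsplit=2 keeps any spaces inside the reason phrase intact.
    parts = line.rstrip("\r\n").split(" ", 2)
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    return version, code, reason


class StatusCodeOnlyTest(unittest.TestCase):
    def test_414_regardless_of_phrase(self):
        # Old wording, new wording, and an empty phrase all pass,
        # because only the numeric code is asserted.
        for line in ("HTTP/1.1 414 Request-URI Too Long\r\n",
                     "HTTP/1.1 414 URI Too Long\r\n",
                     "HTTP/1.1 414 \r\n"):
            version, code, reason = parse_status_line(line)
            self.assertEqual(code, 414)
```

A test written this way keeps passing if the wording changes from "Request-URI Too Long" to "URI Too Long".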

# More about this specific issue

Right now, send_error accepts anything as the message, which is not very good 
in terms of security and side effects. 
http://hg.python.org/cpython/file/3.3/Lib/http/server.py#l404

    def send_error(self, code, message=None):

Then later on:

    try:
        shortmsg, longmsg = self.responses[code]

* shortmsg is supposed to be what is written in the spec.
* longmsg is specific to the python project. 

When message is not given, it falls back to shortmsg
http://hg.python.org/cpython/file/3.3/Lib/http/server.py#l421

but if it is given, it is sent as-is, whatever it contains
http://hg.python.org/cpython/file/3.3/Lib/http/server.py#l428

Checking the status-line 
http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-22#section-3.1.2

3.1.2. Status Line


   The first line of a response message is the status-line, consisting
   of the protocol version, a space (SP), the status code, another
   space, a possibly-empty textual phrase describing the status code,
   and ending with CRLF.

     status-line = HTTP-version SP status-code SP reason-phrase CRLF

   The status-code element is a 3-digit integer code describing the
   result of the server's attempt to understand and satisfy the client's
   corresponding request.  The rest of the response message is to be
   interpreted in light of the semantics defined for that status code.
   See Section 6 of [Part2] for information about the semantics of
   status codes, including the classes of status code (indicated by the
   first digit), the status codes defined by this specification,
   considerations for the definition of new status codes, and the IANA
   registry.

     status-code    = 3DIGIT

   The reason-phrase element exists for the sole purpose of providing a
   textual description associated with the numeric status code, mostly
   out of deference to earlier Internet application protocols that were
   more frequently used with interactive text clients.  A client SHOULD
   ignore the reason-phrase content.

     reason-phrase  = *( HTAB / SP / VCHAR / obs-text )


* So a client SHOULD ignore the reason-phrase
* The reason-phrase can only contain 
** HTAB (horizontal tab)
** SP (space)
** VCHAR (any visible [USASCII] character)
** obs-text
*** As a convention, ABNF rule names prefixed with "obs-" denote "obsolete" 
grammar rules that appear for historical reasons.
*** obs-text = %x80-FF (range of characters)
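
The grammar above translates fairly directly into a byte-level check. This is only a sketch: is_valid_reason_phrase is a made-up name, and the byte ranges for HTAB, SP and VCHAR are the ABNF core-rule values (%x09, %x20, %x21-7E).

```python
import re

# reason-phrase = *( HTAB / SP / VCHAR / obs-text )
#   HTAB     = %x09
#   SP       = %x20
#   VCHAR    = %x21-7E
#   obs-text = %x80-FF
_REASON_PHRASE = re.compile(rb"[\t\x20-\x7e\x80-\xff]*")


def is_valid_reason_phrase(phrase):
    """Return True if the bytes match the reason-phrase grammar.

    Hypothetical helper for illustration; not part of http.server.
    """
    return _REASON_PHRASE.fullmatch(phrase) is not None
```

Note that CR and LF are outside every allowed range, so a phrase that tries to smuggle in extra header lines fails the check.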


A hacky way to mitigate the issue would be to 

1. extract the first line (stop after the first CRLF)
2. sanitize this line. (allow only "*(HTAB / SP / VCHAR / %x80-FF)" )
3. send only this.

Thoughts?
Honestly, I do not find it very satisfying ;) but at least it should not break 
everything.
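
For what it is worth, the three steps above could look roughly like this. It is a sketch, not a patch: sanitize_reason_phrase is a made-up name, encoding str input as latin-1 is an assumption, and it cuts at a bare CR or LF too, which is a bit stricter than "stop after the first CRLF".

```python
import re


def sanitize_reason_phrase(message):
    """Reduce a message to bytes that are safe in a status-line.

    Hypothetical helper for illustration; not part of http.server.
    """
    # Assumption: str input is encoded as latin-1; characters that
    # do not fit are replaced with b"?".
    if isinstance(message, str):
        message = message.encode("latin-1", "replace")
    # Step 1: keep only the first line (cut at the first CR or LF).
    first_line = re.split(rb"[\r\n]", message, maxsplit=1)[0]
    # Step 2: keep only HTAB / SP / VCHAR / %x80-FF.
    cleaned = bytes(
        b for b in first_line
        if b == 0x09 or 0x20 <= b <= 0x7E or b >= 0x80
    )
    # Step 3: the caller sends only this.
    return cleaned
```

With this, a message like "Oops\r\nSet-Cookie: evil=1" is cut down to b"Oops" before it reaches the status line.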

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue12921>
_______________________________________