New submission from Tim Lesher <[EMAIL PROTECTED]>:

The urllib.quote docstring implies that it quotes only characters in RFC
2396's "reserved" set.

However, urllib.quote currently escapes all characters except those in
an "always_safe" list, which consists of alphanumerics and three
punctuation characters, "_.-".

This behavior is contrary to the RFC, which defines "unreserved"
characters as alphanumerics plus the "mark" characters "-_.!~*'()".
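
A minimal sketch of the behavior described above. The helper name
escaped_marks is hypothetical, and the import fallback is only there so
the snippet also runs on Python 3, where quote lives in urllib.parse.
"~" is left out of the check because its handling varies between Python
versions.

```python
# quote() lives in urllib on Python 2 and in urllib.parse on Python 3.
try:
    from urllib import quote
except ImportError:
    from urllib.parse import quote

# RFC 2396 "mark" characters, without "~" (its handling varies by version).
MARKS = "-_.!*'()"


def escaped_marks(chars=MARKS):
    """Return the characters in `chars` that quote() percent-encodes."""
    return [c for c in chars if quote(c) != c]


if __name__ == "__main__":
    # Marks that are "unreserved" per the RFC but escaped anyway.
    print(escaped_marks())
```

Only "-", "_" and "." pass through unchanged; the remaining mark
characters checked here come back percent-encoded.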

The RFC also says:

  Unreserved characters can be escaped without changing the semantics
  of the URI, but this should not be done unless the URI is being used
  in a context that does not allow the unescaped character to appear.

This seems to imply that "always_safe" should correspond to the RFC's
"unreserved" set of "alphanum" | "mark".
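
Until that is settled, a caller can get the RFC's behavior by passing
the mark characters through the existing "safe" argument. This is only a
sketch of a workaround, not a proposed patch; quote_rfc2396 and
RFC2396_MARK are hypothetical names, not part of urllib.

```python
# quote() lives in urllib on Python 2 and in urllib.parse on Python 3.
try:
    from urllib import quote
except ImportError:
    from urllib.parse import quote

# "mark" characters from RFC 2396; with alphanumerics they form "unreserved".
RFC2396_MARK = "-_.!~*'()"


def quote_rfc2396(s, safe="/"):
    """Quote `s`, leaving RFC 2396 unreserved characters unescaped.

    Hypothetical helper: it only widens quote()'s `safe` argument.
    """
    return quote(s, safe + RFC2396_MARK)


if __name__ == "__main__":
    print(quote_rfc2396("it's (a) test!"))  # -> it's%20(a)%20test!
```

Characters outside the unreserved set, such as the space above, are
still escaped as before.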

----------
components: Library (Lib)
messages: 65518
nosy: tlesher
severity: normal
status: open
title: urllib.quote() escapes characters unnecessarily and contrary to docs
type: behavior
versions: Python 2.5

__________________________________
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2637>
__________________________________