Guido van Rossum <[EMAIL PROTECTED]> added the comment:

> Matt Giuca <[EMAIL PROTECTED]> added the comment:
>
> By the way, what is the current status of this bug? Is anybody waiting
> on me to do anything? (Re: Patch 9)

I'll be reviewing it today or tomorrow. From looking at it briefly I
worry that the implementation is pretty slow -- a method call for each
character and a map() call sounds pretty bad.

> To recap my previous list of outstanding issues raised by the review:
>
>> Should unquote accept a bytes/bytearray as well as a str?
> Currently, does not. I think it's meaningless to do so (and how to
> handle >127 bytes, if so?)

The bytes > 127 would be translated as themselves; this follows
logically from how stuff is parsed -- %% and %FF are translated,
everything else is not. But I don't really care, I doubt there's a
need.

>> Lib/email/utils.py:
>> Should encode_rfc2231 with charset=None accept strings with non-ASCII
>> characters, and just encode them to UTF-8?
> Currently does. Suggestion to restrict to ASCII on the review tracker;
> simple fix.

I think I agree with that comment; it seems wrong to return UTF8
without setting that in the header. The alternative would be to
default charset to utf8 if there are any non-ASCII chars in the input.
I'd be okay with that too.

>> Should quote raise a TypeError if given a bytes with encoding/errors
>> arguments? (Motivation: TypeError is what you usually raise if you
>> supply too many args to a function).
> Resolved. Raises TypeError.
>
>> Lib/urllib/parse.py:
>> (As discussed above) Should quote accept safe characters outside the
>> ASCII range (thereby potentially producing invalid URIs)?
> Resolved? Implemented, but too messy and not worth it just to produce
> invalid URIs, so NOT in patch.

Agreed, safe should be ASCII chars only.

> That's only two very minor yes/no issues remaining. Please comment.

I believe patch 9 still has errors defaulting to strict for quote().
Weren't you going to change that?

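For reference, here is a small sketch of the behaviours discussed above. It exercises urllib.parse as it exists in the current standard library, which is an assumption on my part and may not match patch 9 exactly. The variable names are only for illustration.

```python
from urllib.parse import quote, unquote, unquote_to_bytes

# A str is percent-encoded via UTF-8 unless another encoding is given.
utf8_quoted = quote("é")
latin1_quoted = quote("é", encoding="latin-1")

# bytes input combined with an encoding argument is rejected.
try:
    quote(b"abc", encoding="utf-8")
    bytes_with_encoding_error = None
except TypeError as exc:
    bytes_with_encoding_error = exc

# Raw bytes above 127 pass through as themselves; only %XX escapes
# are decoded.
raw = unquote_to_bytes(b"a\x80%FF")

print(utf8_quoted, latin1_quoted, raw)
```

Decoding with the same encoding that was used for quoting round-trips the original text, e.g. unquote(latin1_quoted, encoding="latin-1").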
Regarding using UTF-8 as the default encoding, I still think this is
the right thing to do -- while the tables shown by Bill indicate that
there's still a lot of Latin-1 out there, UTF-8 is definitely gaining
on it, and I expect that Python apps, especially Py3k apps, are much
more likely to follow (and hopefully reinforce! :-) this trend than to
lag behind.

_______________________________________
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3300>
_______________________________________