In article <mailman.1267.1382220612.18130.python-l...@python.org>,
 Chris Angelico <ros...@gmail.com> wrote:

> On Sun, Oct 20, 2013 at 3:49 AM, Roy Smith <r...@panix.com> wrote:
> > So, yesterday, I tracked down an uncaught exception stack in our logs to a 
> > user whose username included the unicode character 'SMILING FACE WITH 
> > SUNGLASSES' (U+1F60E).  It turns out, that's perfectly fine as a user name, 
> > except that in one obscure error code path, we try to str() it during some 
> > error processing.
> 
> How is that a problem? Surely you have to deal with non-ASCII
> characters all the time - how is that particular one a problem? I'm
> looking at its UTF-8 and UTF-16 representations and not seeing
> anything strange, unless it's the \x0e in UTF-16 - but, again, you
> must surely have had to deal with
> non-ASCII-encoded-whichever-way-you-do-it.
> 
> Or are you saying that that particular error code path did NOT handle
> non-ASCII characters?

Exactly.  The fundamental error was caught, but we then raised another 
UnicodeEncodeError while generating the text of the error message to log!
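
For anyone following along, here is a minimal sketch of that failure mode 
and one way to keep the logging path from blowing up.  It is not our actual 
code: the username, the message text, and the ascii_safe() helper are all 
made up for illustration.  Python 2's str() on a unicode object implicitly 
encodes with the ASCII codec, so the explicit encode("ascii") below stands 
in for it and fails the same way on Python 2 or 3.

```python
# Hypothetical username containing U+1F60E (SMILING FACE WITH SUNGLASSES).
username = u"roy\U0001F60E"


def ascii_safe(text):
    """Return an ASCII-only rendering of text, escaping everything else.

    Hypothetical helper: "backslashreplace" turns unencodable characters
    into backslash escapes instead of raising UnicodeEncodeError.
    """
    return text.encode("ascii", "backslashreplace").decode("ascii")


# Stand-in for Python 2's str(username): an implicit ASCII encode.
try:
    username.encode("ascii")
    failed = False
except UnicodeEncodeError:
    failed = True

# Building the log text through the helper cannot raise on odd characters.
message = u"lookup failed for user " + ascii_safe(username)
```

The point of escaping rather than dropping the character is that the log 
line still identifies which user triggered the error.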

> If so, that's a strong argument for moving to
> Python 3, to get full Unicode support in _all_ branches.

Well, yeah.  The problem is, my pip requirements file lists 76 modules 
(and installing all of those results in 144 modules, including the cascaded 
dependencies).  Until most of those are Python 3 ready, we can't move.

Heck, I can't even really move off 2.6 because we use Amazon's EMR 
service, which is stuck on 2.6.
