[issue10542] Py_UNICODE_NEXT and other macros for surrogates

Alexander Belopolsky Wed, 29 Dec 2010 15:05:04 -0800

Alexander Belopolsky <belopol...@users.sourceforge.net> added the comment:

On Wed, Dec 29, 2010 at 3:36 PM, STINNER Victor <rep...@bugs.python.org> wrote:
..
> Use non-ASCII identifiers is exotic. Use non-BMP identifiers is
> crazy :-)

Hmm, we clearly disagree on what crosses the boundary of the mental
norm.  IMHO, it is crazy to require users to care which plane their
characters come from or whether their programs will be run on a wide
or a narrow build.  I see nothing wrong with a desire to use
characters from, say, the "Mathematical Alphanumeric Symbols" block if
that makes some Python expressions look more like the mathematical
formulas that they represent.  However, it is not about any particular
usage, but about the language definition.  I don't remember even a
suggestion during the PEP 3131 discussion that non-BMP characters
should be excluded from identifiers wholesale.

In any case, can someone remind me what was the use case that
motivated chr(i) returning a two-character string for i > 0xFFFF?  I
think we should either stop pretending that narrow builds can handle
non-BMP characters and disallow them in Python strings or we should
try to fix the bugs associated with them.
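
For reference, here is a minimal sketch of the arithmetic behind that
two-character result: the standard UTF-16 split of a supplementary
code point into a high and a low surrogate.  The function name
to_surrogate_pair is hypothetical and is not part of the patch or of
CPython; it only illustrates the encoding.

```python
def to_surrogate_pair(cp):
    """Split a supplementary code point into (high, low) UTF-16 surrogates.

    Hypothetical helper for illustration only; not a CPython API.
    """
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary (non-BMP) code point")
    offset = cp - 0x10000                # 20 significant bits remain
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low


# U+1D400 MATHEMATICAL BOLD CAPITAL A, from the block mentioned above
high, low = to_surrogate_pair(0x1D400)
print(hex(high), hex(low))  # 0xd835 0xdc00
```

On a narrow build these two code units are what the string actually
stores, which is why its length is 2 rather than 1.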

> Seriously, it can wait 3.3.

What exactly can wait until 3.3?  The presented patch introduces no
user-visible changes.  It is only a stepping stone to restoring some
sanity in the way supplementary characters are treated by narrow
builds.  At the moment, it is a minefield: you can easily produce
surrogate pairs from string literals and codecs, but when you start
using them, you have a 50% chance that things will blow up, a 40%
chance of getting a wrong result and maybe a 10% chance that it will
work.
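
To make the failure mode concrete, here is a rough sketch of what
"read the next code point" has to do over a buffer of UTF-16 code
units.  It is written in Python purely for illustration and models
the buffer as a list of integers; iter_code_points is a hypothetical
name, and the actual macros in the patch may differ in detail (for
example in how they treat lone surrogates, which this sketch simply
passes through unchanged).

```python
def iter_code_points(units):
    """Yield code points from a sequence of UTF-16 code units.

    Hypothetical illustration, not the patch itself.  A high surrogate
    followed by a low surrogate is combined into one code point; any
    other unit, including a lone surrogate, is yielded as-is.
    """
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        i += 1
        if 0xD800 <= u <= 0xDBFF and i < n and 0xDC00 <= units[i] <= 0xDFFF:
            u = 0x10000 + ((u - 0xD800) << 10) + (units[i] - 0xDC00)
            i += 1
        yield u


# 'a', U+1D400 stored as a surrogate pair, 'b'
units = [0x61, 0xD835, 0xDC00, 0x62]
print(len(units))                                  # 4 code units
print([hex(c) for c in iter_code_points(units)])   # 3 code points
```

Code that indexes or counts by code unit sees four items here, while
code that walks by code point sees three; mixing the two views is
where the wrong results come from.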

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10542>
_______________________________________