On 19/08/12 19:48:06, Paul Rubin wrote:
> Terry Reedy <tjre...@udel.edu> writes:
>> py> s = chr(0xFFFF + 1)
>> py> a, b = s
> That looks like a 3.2- narrow build. Such builds treat unicode strings
> as sequences of code units rather than sequences of codepoints. Not an
> implementation bug, but a compromise design that goes back about a
> decade to when unicode was added to Python.

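For anyone without a narrow build at hand, the code-unit view can be approximated on 3.3 or later (where strings are always sequences of code points) by encoding to UTF-16 and counting 16-bit units. This is only a rough sketch of what a narrow build stored, not its actual internals; the helper name utf16_code_units is made up for illustration.

```python
import struct

def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of ints.

    Hypothetical helper: it imitates the storage of a narrow build
    by encoding to UTF-16 (little-endian, no BOM).
    """
    data = text.encode("utf-16-le")
    # Each code unit is one unsigned 16-bit integer.
    return list(struct.unpack("<%dH" % (len(data) // 2), data))

s = chr(0xFFFF + 1)          # U+10000, the first code point outside the BMP
units = utf16_code_units(s)

print(len(s))                      # number of code points (3.3+): 1
print([hex(u) for u in units])     # ['0xd800', '0xdc00'], a surrogate pair
```

The two units are what "a, b = s" unpacked on a 3.2 narrow build, which is why that assignment worked there and raises ValueError on 3.3+.
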
Actually, this compromise design was new in 3.0. In 2.x, unicode strings
were sequences of code points. Narrow builds rejected any code point
above 0xFFFF:

Python 2.6.1 (r261:67515, Jun 24 2010, 21:47:49)
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> s = unichr(0xFFFF + 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: unichr() arg not in range(0x10000) (narrow Python build)

-- HansM
-- 
http://mail.python.org/mailman/listinfo/python-list