Antoine Pitrou added the comment:

With the system Python on s10:

Python 2.6.8 (unknown, Apr 13 2012, 17:08:12) [C] on sunos5
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.strxfrm('a')
'a'
>>> locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
'en_US.UTF-8'
>>> locale.strxfrm('a')
'\x01\x01\x01\x0e\x01\x01\x01\x01\x01\x01\x01\x02\x01\x01\x0fi\x01\x01\x01\x01'
>>> locale.strxfrm('a').decode('utf-8')
u'\x01\x01\x01\x0e\x01\x01\x01\x01\x01\x01\x01\x02\x01\x01\x0fi\x01\x01\x01\x01'

The difference between Python 2 and Python 3 is that Python 3 calls wcsxfrm
rather than strxfrm. Solaris' wcsxfrm appears to be broken: it seems to return
the same bytes as strxfrm, merely cast to a wchar_t *. That would explain the
character U+101010e, which corresponds to the '\x01\x01\x01\x0e' bytestring
above.
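
As a rough illustration of that reading, the following Python 3 sketch reinterprets the first four bytes of the strxfrm result as a single wide character. It assumes a 4-byte, big-endian wchar_t, which is an assumption made here to reproduce the reported value and is not something confirmed above.

```python
import sys

# First four bytes of the strxfrm() result quoted above.
raw = b"\x01\x01\x01\x0e"

# Assumption: wchar_t is 4 bytes wide and stored big-endian.
code_point = int.from_bytes(raw, "big")

print("U+%x" % code_point)

# True if the value lies beyond the highest valid code point (U+10FFFF).
print(code_point > sys.maxunicode)
```

Under those assumptions the four bytes collapse into one value above U+10FFFF, so it cannot be represented as a Python 3 str character.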

----------
nosy: +loewis, pitrou

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue16258>
_______________________________________