Mark Dickinson <[EMAIL PROTECTED]> added the comment:

I'm now very confused.
In trying to follow things of type wchar_t* around the Python source, I discovered PyUnicode_FromWideChar in unicodeobject.c. For OS X, the conversion lands in the following code, where w is the incoming WideChar array, declared as wchar_t *:

    register Py_UNICODE *u;
    register Py_ssize_t i;
    u = PyUnicode_AS_UNICODE(unicode);
    for (i = size; i > 0; i--)
        *u++ = *w++;

But this looks wrong: on OS X, sizeof(wchar_t) is 4, and I think w is encoded in UTF-32. So I was expecting to see some kind of explicit conversion from UTF-32 to UCS-2 here. Instead, it looks as though the incoming values are implicitly truncated from 32 bits to 16. Doesn't this do the wrong thing for characters outside the BMP?

Should I open an issue for this, or am I simply misunderstanding?

_______________________________________
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue4388>
_______________________________________
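
For illustration, here is a minimal sketch of the kind of explicit conversion the comment above was expecting. It is not code from the Python source. The function name utf32_to_utf16 and its signature are made up for this example, and it assumes a build where Py_UNICODE is 16 bits wide and characters outside the BMP are meant to be stored as UTF-16 surrogate pairs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of CPython): encode n UTF-32 code points
 * from src as UTF-16 code units in dst.  dst must have room for 2 * n
 * units, the worst case.  Returns the number of units written, or
 * (size_t)-1 if src contains something that is not a valid code point. */
size_t utf32_to_utf16(const uint32_t *src, size_t n, uint16_t *dst)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);

    for (size_t i = 0; i < n; i++) {
        uint32_t cp = src[i];

        /* Reject values beyond U+10FFFF and lone surrogates. */
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return (size_t)-1;

        if (cp < 0x10000) {
            /* BMP character: fits in a single 16-bit unit. */
            dst[out++] = (uint16_t)cp;
        } else {
            /* Non-BMP character: split the remaining 20 bits
             * into a high and a low surrogate. */
            cp -= 0x10000;
            dst[out++] = (uint16_t)(0xD800 | (cp >> 10));
            dst[out++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
    }
    return out;
}
```

With this sketch, U+1D11E is written as the pair 0xD834 0xDD1E, whereas a plain 32-to-16-bit truncation of the same value leaves only 0xD11E, which is a different (BMP) code point.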