eryksun added the comment:

This solution no longer works. If the system is configured to use the Japanese 
system locale and language pack, then 3.4.3 returns codepage 932 mojibake for 
the "%Z" time zone name. Originally [this approach worked][1] because it called 
PyUnicode_Decode using the 'mbcs' encoding.
Currently it calls PyUnicode_DecodeLocaleAndSize, which just ends up calling 
mbstowcs. That's pretty much what wcsftime does. In the default C locale, 
mbstowcs simply casts each byte value to wchar_t, which is equivalent to 
decoding the codepage 932 bytes as Latin-1:

    >>> time.strftime('%Z')
    '\x91\xbe\x95\xbd\x97m\x89\xc4\x8e\x9e\x8a\xd4'
    >>> time.strftime('%Z').encode('latin-1').decode('932')
    '太平洋夏時間'
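
The session above can be reproduced without a Japanese Windows install. This is only a sketch: it assumes the CRT holds the zone name as codepage 932 bytes and stands in for the C-locale mbstowcs cast with a Latin-1 decode. The variable names are illustrative, not taken from timemodule.c.

```python
# Simulate the C-locale mbstowcs cast on a cp932-encoded time zone name.
name = '太平洋夏時間'                  # zone name from the session above
ansi_bytes = name.encode('cp932')      # assumed ANSI bytes held by the CRT

# Casting each byte to wchar_t gives the same code points as a Latin-1 decode.
mojibake = ansi_bytes.decode('latin-1')

# The cast is lossless, so it can be undone and decoded with the real codepage.
recovered = mojibake.encode('latin-1').decode('cp932')

print(ascii(mojibake))
print(recovered)
```

Because every byte survives the cast unchanged, the 3.4.3 result can still be repaired by transcoding.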

The problem is worse for 3.5 built with VC++ 14. In the new CRT, strftime 
decodes the format string via MultiByteToWideChar, calls _Wcsftime_l, and 
encodes the result back via WideCharToMultiByte. The outer conversions use the 
default LC_TIME codepage, which is ANSI (ACP), so they're not the problem. The 
problem is the internal _mbstowcs_s_l conversion of the ANSI time zone name, 
which creates the above-shown mojibake 'unicode' string. This is then 
compounded by calling WideCharToMultiByte on the result:

    >>> time.strftime('%Z')
    '?????m?A???O'

There's no way to fix this by transcoding. The result is just garbage.
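
A rough sketch of why the second conversion is unrecoverable, again simulated with Python codecs rather than the real CRT. It assumes that encoding with errors='replace' approximates WideCharToMultiByte substituting '?' for unmappable characters. Python's cp932 codec does not apply the Windows best-fit mapping, so the 'A' and 'O' in the output above are not expected to be reproduced here.

```python
name = '太平洋夏時間'

# Step 1: the internal cast of the ANSI bytes to wide characters.
mojibake = name.encode('cp932').decode('latin-1')

# Step 2: encode the mojibake back to the ANSI codepage. Characters that
# codepage 932 cannot represent are replaced with '?', destroying the bytes.
lossy = mojibake.encode('cp932', errors='replace')

# Attempting the earlier repair on the lossy result no longer yields the name.
as_text = lossy.decode('cp932', errors='replace')
attempt = as_text.encode('latin-1', errors='replace').decode('cp932', errors='replace')

print(lossy)
print(attempt == name)
```

Once the original byte values have been replaced, several different inputs map to the same output, so no decoding step can restore them.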

[1]: https://hg.python.org/cpython/file/79e60977fc04/Modules/timemodule.c#l501

----------
nosy: +eryksun
versions: +Python 3.4, Python 3.5

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10653>
_______________________________________