New submission from STINNER Victor <victor.stin...@haypocalc.com>:

To decode a byte string from the locale encoding (LC_CTYPE), 
PyUnicode_DecodeFSDefault() can be used, but this function uses a constant 
encoding set at startup (the locale encoding at startup). The right method is 
currently to call _Py_char2wchar() and then PyUnicode_FromWideChar(). 
_Py_char2wchar() is a low-level function: it doesn't raise a nice Python 
exception, but just returns NULL on error and writes a message to stderr using 
fprintf() (!).

The attached patch adds PyUnicode_DecodeLocale() and 
PyUnicode_DecodeLocaleAndSize() to offer a high-level API to decode data from 
the *current* locale encoding. These functions fail with an OSError or 
MemoryError if decoding fails (instead of a generic ValueError), and don't 
write to stderr anymore. They have a surrogateescape argument to choose whether 
to escape undecodable bytes or to fail with an error.
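
For readers less familiar with the two error modes, here is a rough 
Python-level sketch of the behaviour described above. It is not the patch 
itself: decode_locale() and its encoding parameter are hypothetical names 
used only for illustration, and the strict mode here raises Python's usual 
UnicodeDecodeError rather than the OSError raised by the proposed C functions.

```python
import locale


def decode_locale(data, surrogateescape=False, encoding=None):
    """Decode bytes using the *current* locale encoding (illustrative only).

    `encoding` is a hypothetical override so the example can be run
    independently of the machine's locale; by default the encoding is
    queried at call time instead of being fixed at startup.
    """
    if encoding is None:
        # False: query the current LC_CTYPE without calling setlocale().
        encoding = locale.getpreferredencoding(False)
    # surrogateescape maps each undecodable byte 0xNN to U+DCNN;
    # strict raises an exception on the first undecodable byte.
    errors = "surrogateescape" if surrogateescape else "strict"
    return data.decode(encoding, errors)


if __name__ == "__main__":
    raw = b"caf\xe9"  # Latin-1 bytes, invalid as UTF-8
    text = decode_locale(raw, surrogateescape=True, encoding="utf-8")
    print(ascii(text))  # 'caf\udce9'
    # The escaped bytes can be recovered with the same error handler.
    print(text.encode("utf-8", "surrogateescape") == raw)  # True
```

The surrogateescape mode is useful when the data has to be round-tripped 
back to bytes unchanged; the strict mode is the better choice when a 
decoding failure should be reported to the caller.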

The patch only uses the new function in _localemodule.c, but other functions 
may have to be fixed to use it as well. The tzname_encoding.patch of issue 
#5905 should perhaps use it, for example.

----------
components: Unicode
messages: 149060
nosy: ezio.melotti, haypo, loewis
priority: normal
severity: normal
status: open
title: Add PyUnicode_DecodeLocale and PyUnicode_DecodeLocaleAndSize
versions: Python 3.3

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue13560>
_______________________________________