Re: Unicode style in win32/PythonWin

Neil Hodgson Fri, 13 Jan 2006 14:35:43 -0800

Thomas Heller:

> Hm, I don't know.  I try to avoid converting questionable characters at
> all, if possible.  Then, it seems the error-mode doesn't seem to change
> anything with "mbcs" encoding.  WinXP, Python 2.4.2 on the console:
> 
> >>> u"abc\u034adef".encode("mbcs", "ignore")
> 'abc?def'
> >>> u"abc\u034adef".encode("mbcs", "strict")
> 'abc?def'
> >>> u"abc\u034adef".encode("mbcs", "error")
> 'abc?def'
> 
> With "latin-1", it is different:


    Yes, there are no 'ignore' or 'strict' modes for mbcs, so the error
argument has no effect. The codec is a simple call to WideCharToMultiByte
with no options set. 'ignore' may need two calls with different values
of the default character, so that inserted default characters can be
identified and removed, because any given default character may also
appear naturally in the output. 'strict' and 'error' would be easier to
implement: check both the return status and lpUsedDefaultChar, which is
set whenever a default character is inserted.
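
    A rough sketch of the idea above, written in Python 3 with ctypes
rather than in the C of unicodeobject.c. The names wide_to_mbcs,
encode_strict and encode_ignore are made up for illustration and are
not part of Python or the codec. The sketch assumes both default
characters are single bytes, so the two outputs line up byte for byte.

```python
import ctypes
import sys

CP_ACP = 0  # system default ANSI code page, the one "mbcs" targets


def wide_to_mbcs(text, default_char=b"?"):
    """Return (encoded bytes, used_default) via WideCharToMultiByte."""
    if sys.platform != "win32":
        raise OSError("WideCharToMultiByte is only available on Windows")
    if not text:
        return b"", False  # the API rejects zero-length input
    w2mb = ctypes.WinDLL("kernel32", use_last_error=True).WideCharToMultiByte
    w2mb.restype = ctypes.c_int
    w2mb.argtypes = [
        ctypes.c_uint, ctypes.c_uint32,   # CodePage, dwFlags
        ctypes.c_wchar_p, ctypes.c_int,   # lpWideCharStr, cchWideChar
        ctypes.c_char_p, ctypes.c_int,    # lpMultiByteStr, cbMultiByte
        ctypes.c_char_p,                  # lpDefaultChar
        ctypes.POINTER(ctypes.c_int),     # lpUsedDefaultChar
    ]
    n_wide = len(text.encode("utf-16-le")) // 2  # length in UTF-16 units
    used = ctypes.c_int(0)
    # A zero-size output buffer only asks for the required size.
    size = w2mb(CP_ACP, 0, text, n_wide, None, 0,
                default_char, ctypes.byref(used))
    if size == 0:
        raise ctypes.WinError(ctypes.get_last_error())
    buf = ctypes.create_string_buffer(size)
    size = w2mb(CP_ACP, 0, text, n_wide, buf, size,
                default_char, ctypes.byref(used))
    if size == 0:
        raise ctypes.WinError(ctypes.get_last_error())
    return buf.raw[:size], bool(used.value)


def encode_strict(text, convert=wide_to_mbcs):
    """'strict': fail if any default character had to be inserted."""
    data, used_default = convert(text, b"?")
    if used_default:
        raise UnicodeEncodeError(
            "mbcs", text, 0, len(text),
            "character not representable in the ANSI code page")
    return data


def encode_ignore(text, convert=wide_to_mbcs):
    """'ignore': convert twice with different default characters."""
    first, used_default = convert(text, b"?")
    if not used_default:
        return first
    second, _ = convert(text, b"!")
    # Bytes that differ between the two passes are inserted defaults;
    # a '?' or '!' that occurs naturally is identical in both outputs.
    return bytes(a for a, b in zip(first, second) if a == b)
```

    The convert parameter is only there so the two-pass logic can be
tried without Windows by passing in a stand-in converter that has the
same (text, default_char) -> (bytes, used_default) shape.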

    The relevant code is in dist\src\Objects\unicodeobject.c.

    Neil


-- 
http://mail.python.org/mailman/listinfo/python-list
